Gustavo De Mari Pereira

MSc. Computer Science | University of Sao Paulo

Ranking Counter-Strike 2 teams using Bradley-Terry model

December 29, 2024

Introduction

When I was younger, I enjoyed playing Counter-Strike (CS) with my friends. I started with version 1.5, then moved on to 1.6, and later played CS:GO. Nowadays, I’m more of a spectator, though I occasionally analyze team performance data for fun.

I’m currently researching Reinforcement Learning (RL) and recently explored Reinforcement Learning from Human Feedback (RLHF). While reading about reward models used in the ‘post-training’ phase of large language models (LLMs), I discovered that one key approach is the Bradley-Terry model. This model is commonly used in sports like basketball, soccer, tennis, and even chess.

To better understand how reward models work, I delved deeper into the Bradley-Terry model and considered applying it to real-world data from e-sports, like CS2.

The Bradley-Terry model works by making pairwise comparisons between items and assigning a score that reflects the preference of one item over another ($i \succ j$). For example, it could represent the preference between Team 1 and Team 2, or, in the case of LLMs, between two generated responses.

For LLMs, the typical approach is to generate two responses based on a prompt and then ask a human to choose their preferred one. This process helps fine-tune the LLMs to produce responses that better align with user expectations, which is valuable since it’s difficult to define a function that evaluates response quality.
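To make the link between the two settings concrete, here is the model's core assumption written out. The symbols are assumptions for illustration: $p_i > 0$ is the score of item $i$, and $r_i = \log p_i$ is a reparameterization commonly used for reward models.

$Pr(i \succ j) = \frac{p_i}{p_i + p_j} = \frac{e^{r_i}}{e^{r_i} + e^{r_j}} = \sigma(r_i - r_j), \quad \sigma(x) = \frac{1}{1 + e^{-x}}$

In the team setting, $p_i$ is a team's strength. In the LLM setting, $r_i$ plays the role of the reward assigned to response $i$, so only the difference between two rewards matters.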

In contrast, sports have objective outcomes, such as wins and losses, that determine which team is preferred. This makes the model a natural fit for evaluating how CS teams performed against each other during 2024.

Using the Bradley-Terry model involves three general steps: (1) gather win/loss data for the teams, (2) fit the Bradley-Terry model to that data, and (3) generate rankings from the fitted scores.

Data

The first step is to gather win and loss data between teams. For CS, I collected the number of wins and losses, counted per map, for the teams in the HLTV top 20 during 2024.
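As a minimal sketch of this step, the snippet below turns a list of results into a win matrix. The `results` list and the `build_win_matrix` helper are hypothetical and only illustrate the shape of the data; they are not the real 2024 results or the code used for this post. It assumes the convention that `wins[i][j]` counts how many times team `i` beat team `j`.

```python
# Hypothetical results as (winner, loser) pairs; NOT the real 2024 data.
results = [
    ("Spirit", "G2"),
    ("Spirit", "G2"),
    ("G2", "Spirit"),
    ("Vitality", "Spirit"),
    ("G2", "Vitality"),
]


def build_win_matrix(results):
    """Return (teams, wins), where wins[i][j] counts wins of team i over team j."""
    # Sort the team names so that the row/column order is deterministic.
    teams = sorted({team for pair in results for team in pair})
    index = {team: k for k, team in enumerate(teams)}
    wins = [[0] * len(teams) for _ in teams]
    for winner, loser in results:
        wins[index[winner]][index[loser]] += 1
    return teams, wins


teams, wins = build_win_matrix(results)
for team, row in zip(teams, wins):
    print(f"{team:10s} {row}")
```

The diagonal stays at zero because a team never plays itself.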

HLTV top 20 teams for 2024

| # | team_name | country_name | stats | kd_diff | hltv_rating |
|---|---|---|---|---|---|
| 1 | Spirit | Russia | 136 | +952 | 1.1 |
| 2 | Vitality | Europe | 132 | +777 | 1.1 |
| 3 | Natus Vincere | Europe | 159 | +614 | 1.06 |
| 4 | MOUZ | Europe | 134 | +326 | 1.05 |
| 5 | G2 | Europe | 159 | +404 | 1.05 |
| 6 | The MongolZ | Mongolia | 103 | +103 | 1.04 |
| 7 | Eternal Fire | Turkey | 122 | +183 | 1.03 |
| 8 | FaZe | Europe | 162 | +125 | 1.03 |
| 9 | MIBR | Brazil | 88 | +61 | 1.02 |
| 10 | Liquid | Other | 104 | +249 | 1.02 |
| 11 | Astralis | Denmark | 106 | +1 | 1.01 |
| 12 | HEROIC | Europe | 137 | -117 | 1.01 |
| 13 | Complexity | United States | 102 | -228 | 1.01 |
| 14 | Virtus.pro | Russia | 132 | -87 | 1 |
| 15 | FURIA | Brazil | 101 | -199 | 0.99 |
| 16 | BIG | Germany | 90 | -299 | 0.99 |
| 17 | paiN | Brazil | 115 | -345 | 0.98 |
| 18 | Imperial | Brazil | 84 | -337 | 0.97 |
| 19 | SAW | Portugal | 62 | -340 | 0.97 |
| 20 | Falcons | Denmark | 105 | -593 | 0.95 |

Subset of Win/Loss matrix for HLTV top 20 teams

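| | Vitality | Spirit | G2 | Natus Vincere | MIBR | Liquid | FURIA | paiN |
|---|---|---|---|---|---|---|---|---|
| Vitality | 0 | 3 | 7 | 3 | 2 | 6 | 5 | 0 |
| Spirit | 4 | 0 | 4 | 10 | 1 | 3 | 4 | 0 |
| G2 | 7 | 14 | 0 | 7 | 2 | 7 | 1 | 2 |
| Natus Vincere | 1 | 5 | 14 | 0 | 0 | 5 | 2 | 4 |
| MIBR | 0 | 2 | 0 | 1 | 0 | 1 | 2 | 12 |
| Liquid | 2 | 2 | 5 | 5 | 2 | 0 | 9 | 2 |
| FURIA | 0 | 1 | 0 | 2 | 0 | 3 | 0 | 2 |
| paiN | 0 | 0 | 0 | 0 | 11 | 0 | 1 | 0 |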

Bradley-Terry model

Using the win/loss data, we can fit the parameters of the Bradley-Terry model by maximum likelihood estimation (MLE).

There is an iterative formula for this. Writing $w_{ij}$ for the number of times team $i$ beat team $j$ and $p_i$ for the score of team $i$:

$p_i = \frac{\sum_j w_{ij}}{\sum_j (w_{ij} + w_{ji})/(p_i + p_j)}$
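As a brief sketch of where this update comes from, here is the standard maximum-likelihood argument. It assumes $w_{ij}$ denotes the number of wins of team $i$ over team $j$. The log-likelihood of the observed results is

$\ell(p) = \sum_{i \neq j} w_{ij} \left[ \ln p_i - \ln(p_i + p_j) \right]$

Setting its derivative with respect to $p_i$ to zero gives

$\frac{\partial \ell}{\partial p_i} = \frac{1}{p_i} \sum_j w_{ij} - \sum_j \frac{w_{ij} + w_{ji}}{p_i + p_j} = 0$

Solving for the $p_i$ in the first term yields the fixed-point form $p_i = \frac{\sum_j w_{ij}}{\sum_j (w_{ij} + w_{ji})/(p_i + p_j)}$, which is applied repeatedly until the scores stop changing.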

We start with an initial guess such as $p_i = 1/N, \forall i \in \{1, 2, \ldots, N\}$ and apply the iterative formula.

After each iteration, we normalize the scores so that $\sum_i p_i = 1$:

$p_i \leftarrow \frac{p_i}{\sum_k p_k}$

After enough iterations the scores converge, and sorting the teams by score gives the ranking.
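The procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the exact code used for this post. The function name `fit_bradley_terry`, the stopping tolerance, and the small `wins` matrix are hypothetical. It also assumes `wins[i][j]` counts wins of team `i` over team `j`, and that every team has at least one win, since otherwise its score collapses to zero.

```python
def fit_bradley_terry(wins, max_iterations=1000, tol=1e-10):
    """Fit Bradley-Terry scores with the iterative update, normalized to sum to 1."""
    n = len(wins)
    p = [1.0 / n] * n  # initial guess: p_i = 1/N
    for _ in range(max_iterations):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denominator = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denominator)
        # Normalize so that the scores sum to 1.
        total = sum(new_p)
        new_p = [value / total for value in new_p]
        # Stop once no score moved by more than tol.
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Hypothetical 3-team example (NOT the real data): each pair met 5 times.
team_names = ["A", "B", "C"]
wins = [
    [0, 3, 4],  # A beat B 3 times and C 4 times
    [2, 0, 3],  # B beat A 2 times and C 3 times
    [1, 2, 0],  # C beat A once and B twice
]
scores = fit_bradley_terry(wins)
ranking = sorted(zip(team_names, scores), key=lambda pair: pair[1], reverse=True)
for rank, (name, score) in enumerate(ranking, start=1):
    print(rank, name, round(score, 4))
```

A pure-Python loop keeps the sketch dependency-free. For 20 teams the cost is negligible, though a vectorized version would be the natural choice for larger problems.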

Ranking

These are the final scores and ranking for the HLTV top 20 teams of 2024 under the Bradley-Terry model.

Interestingly, it puts Major winner Spirit in 1st place.

| rank | team_name | score |
|---|---|---|
| 1 | Spirit | 0.235284 |
| 2 | Vitality | 0.166654 |
| 3 | Natus Vincere | 0.130508 |
| 4 | G2 | 0.0786501 |
| 5 | MOUZ | 0.0673365 |
| 6 | Liquid | 0.0532813 |
| 7 | FaZe | 0.0430689 |
| 8 | Virtus.pro | 0.0262954 |
| 9 | MIBR | 0.0248231 |
| 10 | paiN | 0.0240706 |
| 11 | The MongolZ | 0.0239945 |
| 12 | Astralis | 0.0228869 |
| 13 | Eternal Fire | 0.0211036 |
| 14 | HEROIC | 0.0178517 |
| 15 | FURIA | 0.0159465 |
| 16 | Complexity | 0.0140515 |
| 17 | SAW | 0.0134657 |
| 18 | Falcons | 0.00873056 |
| 19 | Imperial | 0.00680508 |
| 20 | BIG | 0.00519306 |

To calculate the probability of team $i$ beating team $j$, we can use the following formula: $Pr(i \succ j) = \frac{p_i}{p_i + p_j}$. For example, $Pr(\text{Spirit} \succ \text{G2}) = \frac{0.235284}{0.235284 + 0.0786501} \approx 0.7495$
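The same calculation can be written as a small helper. The function name `win_probability` is hypothetical, and the scores are copied from the ranking table above.

```python
def win_probability(p_i, p_j):
    """Bradley-Terry probability that the team with score p_i beats the team with score p_j."""
    return p_i / (p_i + p_j)


# Scores taken from the ranking table.
scores = {"Spirit": 0.235284, "G2": 0.0786501}

spirit_over_g2 = win_probability(scores["Spirit"], scores["G2"])
g2_over_spirit = win_probability(scores["G2"], scores["Spirit"])
print(round(spirit_over_g2, 4))  # 0.7495
```

Because only the ratio of the two scores matters, the two directions always sum to 1.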

Conclusion

The Bradley-Terry model is very versatile: it can be used in traditional sports, in e-sports, and even in LLMs. It is also simple to understand and could be a valuable tool for assessing team performance in e-sports like CS.

References

[1] M. E. J. Newman, “Efficient Computation of Rankings from Pairwise Comparisons,” Journal of Machine Learning Research, vol. 24, no. 238, pp. 1–25, 2023.

[2] R. A. Bradley, “14 Paired comparisons: Some basic procedures and examples,” in Handbook of Statistics, vol. 4, Nonparametric Methods, Elsevier, 1984, pp. 299–326. doi: 10.1016/S0169-7161(84)04016-5.

[3] L. B. Anderson, “Chapter 17 Paired comparisons,” in Handbooks in Operations Research and Management Science, vol. 6, Elsevier, 1994, pp. 585–620. doi: 10.1016/S0927-0507(05)80098-2.

[4] H. Turner and D. Firth, “Bradley-Terry Models in R: The BradleyTerry2 Package,” J. Stat. Soft., vol. 48, no. 9, 2012. doi: 10.18637/jss.v048.i09.

[5] C. Huyen, “RLHF: Reinforcement Learning from Human Feedback,” Chip Huyen. Available: https://huyenchip.com/2023/05/02/rlhf.html