We wanted to give the public an inside look at how our rankings are composed. While the WGF rankings are still in their early stages, we feel we have made significant progress in addressing gaps in the formula. The WGF rankings have only been published for a little over a year, and through some trial and error, we believe they are currently the best available predictor of future matches.
One of the most important things in creating a rankings system is understanding how it works. Before we get into those details we’re going to give a little overview of some of the decisions we’ve made prior to even starting to rank the teams.
For reference, here are the current rankings.
1. We currently rank 221 teams. This includes the 209 teams which FIFA ranks as well as 12 additional teams that play relatively frequently in tournaments or in friendlies against these 209 teams. Although a team like Martinique is not eligible for World Cup Qualifying, they frequently play opponents that are in our rankings. To ensure better accuracy, it is important to include these games.
2. We only include the most recent 18 months of matches. Early in development, we determined that old matches need to be significantly devalued. FIFA includes matches from four years ago; Elo uses matches from 100 years ago. We think matches that old are irrelevant. A cutoff had to be chosen, and we found that 18 months gave far more reliable data than 12 months. The difference between 18 and 24 months was negligible, so we chose the more recent window.
3. No adjustment is made for different levels of competition. This is probably the most controversial assumption. While friendlies are largely viewed as “exhibition” matches, the number of matches in which a team is unable to perform at a level remotely resembling their true form is very small. We put everyone on a level playing field and let the data dictate the results. Do we probably over-value some teams? Yes. Do we probably under-value some teams? Yes. Do we have a better reading on some teams because of this assumption? Absolutely.
4. Ranking Imputation is used for teams which have played 3 or fewer matches in the previous 18 months. This is a relatively new feature that significantly increases the accuracy of the WGF rankings. No longer will Comoros appear in the 50s because they drew their only match against a team in the 50s. While we only use 18 months of data, we have over 3 years of match data at our disposal. If you can't average one match every 6 months, we use your all-time calculated rank as your current rank. We don't feel that's asking too much.
5. Our final assumption is Home Advantage. Through some additional trial and error, we believe that home advantage is worth 0.55 goals. In other words, if two teams would be even on a neutral site, whichever one hosts the match becomes the favorite by 0.55 goals.
The We Global Football rankings are made up of five components. Three of these are simple metrics to calculate, while the other two are unique, recursive metrics.
1. Percentage of Points Obtained:
A pretty straightforward calculation. 3 points for a win, 1 point for a draw, and 0 points for a loss. The sum of these points divided by the maximum points attainable (3 x Matches Played) equals a team’s Percentage of Points Obtained.
2. Average Goal Differential
Another straightforward calculation. Goals Scored in all matches minus Goals Allowed in all matches is a team’s cumulative goal differential. Divide that by the number of matches played to get the Average Goal Differential.
3. Home Field Advantage Factor
Home matches get a value of 1. Neutral matches get a value of 0.5. Away matches get a value of 0. The sum of these values for each match divided by the number of matches played is a team’s Home Field Advantage Factor.
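The three straightforward components above can be sketched directly in code. This is our own illustration of the formulas as stated; the function names and the venue encoding are ours, not WGF's.

```python
def points_percentage(wins, draws, losses):
    """Percentage of Points Obtained: 3 per win, 1 per draw, 0 per loss,
    divided by the maximum attainable (3 x matches played)."""
    played = wins + draws + losses
    return (3 * wins + draws) / (3 * played)

def avg_goal_differential(goals_for, goals_against, played):
    """Average Goal Differential: cumulative goal differential
    divided by matches played."""
    return (goals_for - goals_against) / played

def home_advantage_factor(venues):
    """Home Field Advantage Factor: home = 1, neutral = 0.5, away = 0,
    averaged over all matches."""
    value = {"home": 1.0, "neutral": 0.5, "away": 0.0}
    return sum(value[v] for v in venues) / len(venues)

# Example: 4 wins, 3 draws, 3 losses over 10 matches
print(points_percentage(4, 3, 3))         # 15 / 30 = 0.5
print(avg_goal_differential(14, 11, 10))  # 0.3
print(home_advantage_factor(["home", "neutral", "away", "home"]))  # 0.625
```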
4. Strength of Schedule
This measures the quality of the opposition. Unlike a typical strength of schedule metric, ours is a little more complex. The whole purpose of a strength of schedule metric is to properly evaluate the level of competition. That cannot be done by simply looking at percentage of points obtained, as most strength of schedule calculations do. We feel that the best evaluation of the opposition is the rankings themselves. If you played a team ranked #20 in the WGF rankings, that is certainly more difficult than playing a team ranked #100. Just because team #100 may have a higher percentage of points obtained doesn't mean they were the tougher opponent.
This may seem a little confusing. Strength of Schedule is part of the calculation of the rankings, but it is also calculated based on the rankings? Isn’t that a circular reference? If this were the only metric being used, then yes, it would be a circular reference. But since there are multiple other factors that go into calculating the rankings, the numbers are able to converge and stabilize.
We also do not use a straight average in the strength of schedule calculation. Instead, we have chosen to use a geometric average and feel the results are more accurate. A geometric average multiplies X numbers together and then raises the product to the (1/X) power. For example, if you play the #1 and #100 teams, the calculation is (1 × 100)^(1/2) = 10. A straight average would be (1 + 100) / 2 = 50.5. Why do we do this? It's a matter of preference: we feel that playing the #1 and #100 teams is a tougher schedule than playing the #40 and #50 teams. A straight average would disagree. The geometric average of a team's opponents' WGF rankings is our Strength of Schedule.
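The geometric-average calculation above can be sketched as follows. Note that lower is tougher here, since rank #1 is the best possible opponent; the function name is our own.

```python
import math

def strength_of_schedule(opponent_ranks):
    """Geometric average of opponents' WGF ranks: multiply the ranks
    together, then take the nth root."""
    product = math.prod(opponent_ranks)
    return product ** (1 / len(opponent_ranks))

print(strength_of_schedule([1, 100]))  # 10.0
print(strength_of_schedule([40, 50]))  # ~44.7, i.e. "easier" than [1, 100]
```

This is why the #1/#100 pairing registers as a tougher schedule than #40/#50: the geometric mean is pulled strongly toward the small (tough) rank, while a straight average would rate the two schedules as roughly equal.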
5. Win Strength
This is the most important of all the metrics and correlates the highest with the overall rankings. It is also a recursive calculation based on the teams you have played to date. There are currently 221 teams in our rankings. If you beat the #1 ranked team, you get 221 x 3 = 663 points. If you draw with the #1 team, you get 221 x 1 = 221 points. The score declines with rank, so beating the #221 team is only worth 1 x 3 = 3 points. Losses work in reverse: a loss to the #1 ranked team is only -1 x 3 = -3 points, while a loss to the #221 team is worth -221 x 3 = -663 points. The average of these points across all matches played is a team's "Win Strength".
Again, this is another metric that is calculated based on the rankings, but also goes into the rankings. Did you beat good teams? Did you lose against someone you shouldn’t have? Unlike FIFA, we do not believe that all losses are created equal, and neither should you.
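Read literally, the examples above imply a multiplier of (222 − opponent rank) for wins and draws, and −(opponent rank) for losses. A minimal sketch under that reading (function names and the draw-against-low-ranked handling are our own inference, not a published formula):

```python
N_TEAMS = 221  # current number of ranked teams

def match_points(opponent_rank, outcome, n_teams=N_TEAMS):
    """Points for a single result, following the article's examples:
    beat #1 -> 221 x 3, draw #1 -> 221 x 1, beat #221 -> 1 x 3,
    lose to #1 -> -1 x 3, lose to #221 -> -221 x 3."""
    if outcome == "win":
        return (n_teams - opponent_rank + 1) * 3
    if outcome == "draw":
        return (n_teams - opponent_rank + 1) * 1
    if outcome == "loss":
        return -opponent_rank * 3
    raise ValueError(f"unknown outcome: {outcome}")

def win_strength(results, n_teams=N_TEAMS):
    """Average of match points across all matches played."""
    return sum(match_points(r, o, n_teams) for r, o in results) / len(results)

print(match_points(1, "win"))     # 663
print(match_points(1, "draw"))    # 221
print(match_points(221, "win"))   # 3
print(match_points(1, "loss"))    # -3
print(match_points(221, "loss"))  # -663
```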
It’s not easy to describe, but we’ll do the best we can without giving too much away. We take four of the above components (excluding the Home Field Advantage Factor) and blend them using varying degrees of exponential weighting. Why exponents? Because exponents are fun! No, actually, it’s because we don’t believe the relationship between the rankings and these metrics is linear.
The difference between having a +1 Average Goal Differential and a 0 Average Goal Differential is far greater than the difference between a -6 Average Goal Differential and a -7 Average Goal Differential. The same concept applies to the other metrics and is used to calculate an overall ranking. Once that ranking is calculated, we adjust it using the Home Field Advantage Factor.
Let’s say, for example, a team has a ranking of 98 and has played all of its matches at neutral locations. That ranking will not be adjusted. Now let’s say a different team has a ranking of 98 but has played all of its matches at home. Given that home advantage is worth 0.55 goals, that team’s true ranking should probably be 97.45. The same works in reverse for road matches. This is one of the reasons we consistently value Finland higher than many other rankings do: Finland’s Home Field Advantage Factor is currently 0.29, which means they’ve produced their high ranking despite playing many games away from home, so their ranking is adjusted upward to account for that difficulty.
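The example above pins down two points: a fully neutral schedule (factor 0.5) means no adjustment, and a fully home schedule (factor 1.0) moves 98 to 97.45. One linear adjustment consistent with both — our assumption for illustration, not the published formula — looks like this:

```python
HOME_ADVANTAGE_GOALS = 0.55  # the article's stated home-advantage value

def adjust_for_home_advantage(rating, hfa_factor):
    """Shift a rating by the home-advantage value, scaled by how far the
    Home Field Advantage Factor sits from a neutral 0.5. An all-home
    schedule (1.0) costs the full 0.55; an away-heavy schedule earns a boost.
    The linear form is our assumption, fit to the article's example."""
    return rating - 2 * HOME_ADVANTAGE_GOALS * (hfa_factor - 0.5)

print(adjust_for_home_advantage(98.0, 0.5))   # 98.0 (all neutral: unchanged)
print(adjust_for_home_advantage(98.0, 1.0))   # ~97.45 (all home)
print(adjust_for_home_advantage(98.0, 0.29))  # boosted, like Finland's away-heavy slate
```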
Once the final ranking is calculated, we recalculate everything until the numbers stabilize. Usually after 10 or so recalibrations the numbers no longer change, and the overall rankings are produced.
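Because Strength of Schedule and Win Strength both depend on the rankings they feed into, the recalibration step is essentially a fixed-point iteration: recompute from the previous pass until nothing changes. A generic sketch of that idea, with a toy update function standing in for the real (unpublished) WGF blend:

```python
def iterate_until_stable(update, initial, tol=1e-9, max_iters=100):
    """Repeatedly apply `update` until consecutive passes differ by
    less than `tol`. Returns the stable values and the pass count."""
    current = initial
    for i in range(max_iters):
        nxt = update(current)
        if max(abs(a - b) for a, b in zip(nxt, current)) < tol:
            return nxt, i + 1
        current = nxt
    return current, max_iters

# Toy update: each score drifts halfway toward the mean of the others.
# This is purely illustrative; it is not the WGF formula.
def toy_update(scores):
    total = sum(scores)
    return [0.5 * s + 0.5 * (total - s) / (len(scores) - 1) for s in scores]

stable, passes = iterate_until_stable(toy_update, [10.0, 5.0, 0.0])
print(stable, passes)  # all three scores converge to ~5.0 within a few dozen passes
```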
How Can the Rankings be Improved?
The assumptions and calculations made really boil down to the weighting of the metrics. How much do you want to value Win Strength vs. Average Goal Differential vs. Strength of Schedule? As time has gone on, we’ve valued strength of schedule more and more. Currently 21 of our top 32 teams are from UEFA, and that’s quite hard to disagree with in reality. We make no assumptions about what confederation a team is in, and that has absolutely no bearing on the calculations at all.
The real test of a confederation’s strength is when confederations play against each other, which ironically occurs most often during the hated friendlies. If 5 UEFA teams play 5 AFC teams and all 5 UEFA teams win, then logically the teams that previously played those UEFA sides faced a tougher Strength of Schedule than those who played the AFC sides. There is no “Confederation Factor” used to manipulate the rankings.
We feel that what matters is: Who did you play? Where did you play? What was the final score? It’s just a matter of how you put those pieces together.
Agree or disagree with the way we’re doing it? Let us know in the comments below or contact us via Twitter @We_Global. Thanks for reading!