Over a year ago, we published an article that looked at the place of analytics and “advanced stats” in the beautiful game. Since then, not a whole lot has been done to see new metrics widely accepted in the game. We aim to take a look at why that is.
Take a look at the stats a viewer sees during a broadcast: goals, shots, shots on target, saves, fouls, cards, corners, pass accuracy, shot accuracy. For players, the distance run is measured. Every one of these stats has something in common: they’re objective measures. Can anyone look at the numbers for these measures and say “that’s incorrect”? The answer, obviously, is no. The correct number of shots on target in a match is easy to measure.
As we alluded to in our previous article, it’s certainly difficult to see how a group of players fare together when only 3 substitutes are permitted per match. But what has become increasingly obvious is that the real reason new stats are not widely accepted is their subjectivity.
Baseball, hockey, and American football have shown an increased emphasis on stats in recent years. Some fun stats include BABIP in baseball, YPRR in American football, and Corsi in hockey. We’re going to take a brief look at each of these and why they make sense.
BABIP stands for Batting Average on Balls In Play. A player averages 1.000 on home runs and 0.000 on strikeouts. Removing these events, what does the player average? That’s BABIP. It’s intended to represent whether a player has been lucky. But it’s accepted and used because there are easily measurable criteria. While it’s a “made-up” stat, anyone who uses it measures it the exact same way. And it goes beyond traditional batting average to offer additional insight. That’s what an “advanced stat” is meant to do.
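To make the “measured the exact same way” point concrete, here is a minimal sketch of the calculation. It uses the commonly published form of the formula, which also adds sacrifice flies to the denominator (a detail not mentioned above). The function name and the season line in the usage example are made up for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sacrifice_flies=0):
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the ball never reaches
    the fielders. Sacrifice flies are added back to the denominator,
    following the commonly published formula (an addition to the
    simplified description in the article).
    """
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to average over")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, purely illustrative.
print(round(babip(hits=150, home_runs=20, at_bats=500,
                  strikeouts=100, sacrifice_flies=5), 3))
```

Two people given the same box score will get the same number, which is the whole point.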
YPRR stands for yards per route run. Yards per catch (YPC), also called yards per reception, is a common measure, but YPRR also accounts for plays where the receiver didn’t catch a pass. If a player runs out for a pass twice and catches 2 passes for 20 yards, his YPC is 10 and his YPRR is 10. If a player runs out for a pass 5 times and catches 2 passes for 20 yards, his YPC is still 10, but his YPRR is 4. YPRR is meant to capture the efficiency of the player. But again, it’s countable, and it’s not subjective. And that’s a big reason why it works.
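The two examples above reduce to a pair of divisions. A short sketch, with function names chosen here for illustration:

```python
def yards_per_catch(yards, catches):
    """YPC: receiving yards divided by completed catches."""
    return yards / catches


def yards_per_route_run(yards, routes_run):
    """YPRR: receiving yards divided by every route run, caught or not."""
    return yards / routes_run


# The two receivers from the text: same catches and yards, different routes.
for routes in (2, 5):
    print(routes, yards_per_catch(20, 2), yards_per_route_run(20, routes))
```

The only extra input YPRR needs is a count of routes run, which is why it stays objective.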
Lastly, hockey has a relatively new stat called Corsi. It counts the difference between the number of shot attempts by a player’s team while he is on the ice and the number of shot attempts by the opposing team while he is on the ice. The shots don’t necessarily have to be on target (they could also be blocked), but overall it shows how much more dangerous a team is with a given player on the ice. And yet again, it’s countable and not subjective. It’s just a different and deeper way to look at numbers. It’s an “advanced stat”.
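As a rough sketch of that counting rule, assume each shot attempt is logged with the shooting team and the set of players on the ice at that moment. The `ShotAttempt` record and its field names are hypothetical, not taken from any real data feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotAttempt:
    team: str            # team that attempted the shot
    on_ice: frozenset    # players (from both teams) on the ice at the time


def corsi(attempts, player, player_team):
    """Shot attempts for minus shot attempts against while `player` is on the ice.

    Every attempt counts, whether on target, wide, or blocked.
    """
    differential = 0
    for attempt in attempts:
        if player not in attempt.on_ice:
            continue  # the player was on the bench for this attempt
        differential += 1 if attempt.team == player_team else -1
    return differential
```

Nothing in the function requires a judgment call: an attempt either happened with the player on the ice or it did not.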
New Stats in Football
Football doesn’t yet have widely accepted measures such as these, and until somebody comes up with a useful stat, things will not change. But that’s the path that people need to head down. There are a lot of great, useful stats out there. xG is a great one, and it is certainly valuable. xG (expected goals) looks at where shots came from and the likelihood of a goal. The likelihoods are statistically determined, and the stat is very good at showing who has been lucky or unlucky. Michael Caley has done an excellent job modeling xG.
And while the stat certainly has validity, the shooting areas and goal likelihoods are not objective measures. Both come down to the modeler’s opinion, and as such, xG is unlikely to become a widely accepted statistic. It’s not because it isn’t valuable (it is!), but because everyone can measure these numbers in a different way.
There are stats out there waiting to be discovered, and the vast majority of them are going to be discovered through player location data. What players are doing on the pitch is going to tell you who is and who is not performing well. Here are some ideas for numbers that are objective:
Targets: number of times a player is the intended recipient of a pass
Targets/90: number of targets divided by minutes played, multiplied by 90
Average Target Distance: the average distance of a pass when a player is targeted
Receiving Percentage: the percentage of successful passes when a player is targeted
Now, whether metrics like these are useful at all is TBD, but other than some possible debate about who the intended recipient was, these measures are clear and objective. Nobody is going to say “well you could calculate that in a different way”. You can’t. And only with everyone in agreement will new, advanced stats become commonplace in football.
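For illustration, the four target metrics listed above could be computed from a pass log along these lines. This is only a sketch: the `Pass` record, its field names, and the choice of metres for distance are assumptions, and it takes the intended recipient as already recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pass:
    target: str       # intended recipient (assumed to be recorded in the data)
    distance: float   # pass length; metres assumed
    completed: bool   # True if the pass reached the target


def target_metrics(passes, player, minutes_played):
    """Targets, Targets/90, Average Target Distance, Receiving Percentage."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    targeted = [p for p in passes if p.target == player]
    n = len(targeted)
    return {
        "targets": n,
        "targets_per_90": n * 90 / minutes_played,
        # Averages are undefined for a player who was never targeted.
        "avg_target_distance": sum(p.distance for p in targeted) / n if n else None,
        "receiving_pct": 100 * sum(p.completed for p in targeted) / n if n else None,
    }
```

Given the same pass log, everyone gets the same four numbers. The only input open to debate is the `target` field itself.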
image source: Harry How/Getty Images