Super Bowl LVII: Why the predictions are probably wrong

09 Feb 23

57th Super Bowl: It’s all a question of probability

This weekend, the 57th Super Bowl, the championship game of the National Football League (NFL) and one of the biggest sporting events of the year, will take place. Whether the Kansas City Chiefs or the Philadelphia Eagles will emerge victorious is, of course, the subject of much speculation and betting in the week leading up to the game.
However, precisely predicting the outcome of a single game is very difficult even with the most modern methods, because it is influenced by countless and often uncontrollable factors. Any serious prediction will therefore not make definite statements; it will only assign probabilities to the different outcomes, based on a particular data set and model. Such probabilistic predictions are the most useful tool not only in football but also in other sports (for example, in our simulation of the 2020 European Football Championship).
When determining the probabilities of various outcome scenarios of the Super Bowl, there are two levels of difficulty.

Level 1 – Predict winner

The first level consists of “only” modeling the win probabilities of the two teams and thus predicting the winner. These win probabilities depend primarily on the playing strengths of both teams: the greater the difference, the clearer the expected outcome. Playing strengths in head-to-head sports such as American football, soccer or even chess are typically quantified using the so-called “Elo score”. After each match, Elo points are transferred from the loser to the winner. The win probabilities for upcoming matches are then calculated from the current difference in Elo points.
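To make the mechanics concrete, here is a minimal sketch of the textbook Elo formulas. It assumes the conventional logistic curve with a scale of 400 points and an illustrative update factor `k=20`; the function names and the ratings in the usage lines are hypothetical and are not taken from any published NFL model.

```python
def elo_expected(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Win probability of team A against team B under the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_winner: float, rating_loser: float, k: float = 20.0):
    """Transfer points from loser to winner; surprising wins move more points."""
    expected_win = elo_expected(rating_winner, rating_loser)
    transfer = k * (1.0 - expected_win)
    return rating_winner + transfer, rating_loser - transfer


# Hypothetical ratings, for illustration only.
p_favorite = elo_expected(1600.0, 1550.0)
new_winner, new_loser = elo_update(1600.0, 1550.0)
print(f"win probability of the favorite: {p_favorite:.3f}")
print(f"ratings after a win by the favorite: {new_winner:.1f}, {new_loser:.1f}")
```

Because the same number of points is added to one side and subtracted from the other, the total number of Elo points in the league stays constant in this basic form.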

On the basis of all past encounters in the NFL, prediction specialists from fivethirtyeight.com have determined the current Elo scores of the Kansas City Chiefs and the Philadelphia Eagles and calculated the probabilities of victory for both teams. Result: The Chiefs win with a probability of 57%. So if the Super Bowl were to be played 100 times in a row, Kansas City would be expected to win 57 of the games.
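As a rough cross-check, and assuming the standard Elo logistic curve with a scale of 400 points (the forecasters' actual model may contain further adjustments that are ignored here), a win probability p can be converted back into a rating gap Δ between the two teams:

```latex
p = \frac{1}{1 + 10^{-\Delta/400}}
\quad\Longleftrightarrow\quad
\Delta = 400\,\log_{10}\frac{p}{1-p},
\qquad
p = 0.57 \;\Rightarrow\; \Delta = 400\,\log_{10}\frac{0.57}{0.43} \approx 49 .
```

Under that assumption, a 57% win probability corresponds to an advantage of only about 49 Elo points, which illustrates how evenly matched the two teams are.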

However, this Elo-based model makes no statement about the expected score at the end of the Super Bowl.

Level 2 – Predict Super Bowl LVII score

The exact final score depends on even more factors, is even more random, and is therefore even harder to predict. Nevertheless, this second level also has some stable factors that can be used to assess probabilities:

First, not all outcomes are possible under the rules of American football: for example, a 1:0 result has zero probability, because a single point can only be scored on the conversion attempt that follows a 6-point touchdown.
Second, there are practical limits to game outcomes: a result of 1000:0 is theoretically possible but enormously unlikely in a 60-minute game. A naive estimate of how frequent, and therefore how likely, each result is can be derived from the league’s historical results (for example, the most frequent NFL result in the last 50 years was 20:17, followed by 27:24 and 23:20).
Third, the teams have quantifiable characteristics: if the two most offensive-minded teams in the NFL were facing each other, we would expect a different result than for the two most defensive-minded teams. These factors, offensive and defensive strength, can be assessed from the points each team has historically scored and allowed.
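The first point, that the rules exclude some scores, can be illustrated with a small sketch. It assumes the usual scoring plays (safety 2, field goal 3, touchdown 6, touchdown with extra point 7, touchdown with two-point conversion 8) and deliberately ignores the extremely rare one-point safety; `reachable_scores` is a hypothetical helper name.

```python
# Points per scoring play: safety, field goal, touchdown,
# touchdown + extra point, touchdown + two-point conversion.
SCORING_PLAYS = (2, 3, 6, 7, 8)


def reachable_scores(max_points: int) -> list[int]:
    """Return all team totals up to max_points that some sequence of plays can produce."""
    reachable = [False] * (max_points + 1)
    reachable[0] = True  # a team can always finish with zero points
    for total in range(1, max_points + 1):
        # A total is reachable if removing one scoring play leaves a reachable total.
        reachable[total] = any(
            total >= play and reachable[total - play] for play in SCORING_PLAYS
        )
    return [total for total, ok in enumerate(reachable) if ok]


print(reachable_scores(10))  # [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Under these assumptions, 1 is the only team total that can never occur, so the rules alone constrain very little; the historical frequencies and team strengths carry most of the information.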

We combined these three components in a simulation: Based on all NFL games of the last 50 years, we determined the distribution of the final scores, quantified the offensive (distribution of points scored in the last five years) and defensive (distribution of points allowed) qualities of both finalists and finally played through the Super Bowl 100,000 times.
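One simple way such a simulation could be set up is sketched below. It is an illustration under our own simplifying assumptions, not the exact procedure behind the numbers in this post: each team's score is drawn with equal chance from its own points-scored history or from the opponent's points-allowed history, tied draws are redrawn because the Super Bowl cannot end in a tie, and all score lists are made-up placeholders rather than real NFL data. The names `simulate_games` and `win_probability` are hypothetical.

```python
import random
from collections import Counter


def simulate_games(scored_a, allowed_a, scored_b, allowed_b, n_games=100_000, seed=0):
    """Monte Carlo sketch: count how often each (score_a, score_b) pair occurs."""
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(n_games):
        while True:
            # Team A's points: its own offense or team B's defense, chosen 50/50.
            a = rng.choice(scored_a if rng.random() < 0.5 else allowed_b)
            # Team B's points: its own offense or team A's defense, chosen 50/50.
            b = rng.choice(scored_b if rng.random() < 0.5 else allowed_a)
            if a != b:  # redraw ties, since a final cannot end level
                break
        outcomes[(a, b)] += 1
    return outcomes


def win_probability(outcomes):
    """Share of simulated games won by team A."""
    total = sum(outcomes.values())
    wins = sum(count for (a, b), count in outcomes.items() if a > b)
    return wins / total


# Placeholder score histories (NOT real data), for illustration only.
scored_a, allowed_a = [17, 24, 27, 31], [14, 20, 23, 30]
scored_b, allowed_b = [13, 21, 28, 34], [10, 19, 26, 27]

outcomes = simulate_games(scored_a, allowed_a, scored_b, allowed_b, n_games=10_000)
print("share of games won by team A:", win_probability(outcomes))
print("most frequent exact score:", outcomes.most_common(1)[0])
```

The same counter yields both quantities discussed here: the win probability (summing over all scores in which one team leads) and the frequency of each exact score.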

57th Super Bowl simulation

Our simulation thus resulted in 100,000 possible game outcomes, in which we observed the following:

The Kansas City Chiefs won 52.4% of all simulated games, which is close to the 57% modeled by FiveThirtyEight.
The probability distributions of the points scored by the two teams are very similar (see Figure 1). However, according to the forecast, the Chiefs score one point more than the Eagles.
Taking the most probable value of each team's modeled score distribution gives a point prediction of 24:23 for Kansas City.

*Figure 1: Venn diagram of artificial intelligence*

So the predicted result of the Super Bowl is 24:23 for the Kansas City Chiefs. From the perspective of our model this is the best bet, but it is still very probably wrong: 24:23 was the exact score in only 131 of the 100,000 simulated games (about 0.13%).

Validate predictions

Our prediction that Kansas City will win with a probability of about 52% is, of course, also entirely compatible with a Philadelphia win. Especially with such a close prediction, it is therefore impossible to validate the model on a single game. To evaluate probabilistic predictions, one has to check over many games whether the model gives the correct probabilities on average, that is, whether an event that the model judges to be 50% probable also occurs in about 50% of the games. We discussed this validation procedure in connection with the 2021 Super Bowl.
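A check of this kind can be sketched as a calibration table: forecasts are grouped into probability bins, and for each bin the average forecast is compared with the observed share of wins. The sketch below is a generic illustration with synthetic forecasts from a perfectly calibrated toy forecaster, not our actual validation code; `calibration_table` and `brier_score` are hypothetical helper names.

```python
import random


def calibration_table(predicted, outcomes, n_bins=10):
    """Rows of (bin_low, bin_high, mean forecast, observed frequency, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, won in zip(predicted, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[index].append((p, won))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip bins without any forecasts
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed = sum(won for _, won in members) / len(members)
        rows.append((i / n_bins, (i + 1) / n_bins, mean_forecast, observed, len(members)))
    return rows


def brier_score(predicted, outcomes):
    """Mean squared difference between forecast and outcome (lower is better)."""
    return sum((p - won) ** 2 for p, won in zip(predicted, outcomes)) / len(predicted)


# Synthetic example: outcomes are drawn with exactly the forecast probability.
rng = random.Random(42)
forecasts = [rng.random() for _ in range(20_000)]
results = [1 if rng.random() < p else 0 for p in forecasts]

for low, high, mean_forecast, observed, count in calibration_table(forecasts, results):
    print(f"{low:.1f}-{high:.1f}: forecast {mean_forecast:.3f}, observed {observed:.3f}, n={count}")
print("Brier score:", round(brier_score(forecasts, results), 4))
```

For a well-calibrated model, the forecast and observed columns agree within sampling noise in every bin; systematic gaps indicate over- or underconfidence.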

If you find this topic interesting or want to read more about the applications of predictive models in various disciplines, browse our blog.