With the 2013 football season behind us, I’ve been spending my weekends developing a Bayesian model of the NFL with my college friend and UT math phd candidate Chris White.
To generate a Bayesian model, one first comes up with a parametric model that could generate the observed data. For simple problems, one uses Bayes rule to calculate the probability of parameters equaling certain values given the observed data. For complicated models, this is an extremely complicated task, but there are some monte carlo methods, such as Gibbs Sampling, which produce satisfactory approximations of the actual distributions. There are R packages available that make Gibbs Sampling fun.
I want to touch on many of these topics later in more in-depth posts, but first, here’s some results.
As of the end of the 2013 regular season, this is how our model ranked the NFL teams.
These are the mean estimates of our Bayesian model trained on the 2013 NFL regular season. Our model predicts that, on a neutral playing field, each team will score their oPower less their opponent’s dPower. The former is the mean of a poisson process and the latter is the mean of a normal distribution. Homefield advantage is worth an additional 3.5 net points, most of which comes from decreasing the visitor’s score.
So we predict the Rams would beat the Lions by a score of 27.4 to 25.8 on a neutral field. Since we know the underlying distributions, we can also calculate prediction intervals.
Here’s the same model trained on both the regular season and the post-season. The final column shows the change in total power.
I have a list of ways I want to improve the model, but here’s where it stands now. Before next season, I want to have a handful of models whose predictions are weighted using a Bayesian factor.
I’m very excited about this project, and I’ll continue working on it for a while.