For the past several years, Cantor Gaming (CG) Technologies has released point spreads for nearly every NFL regular season game several months before the first game is played, and they’ve been a boon to consumers of all things football. Team bloggers have used them to produce much-needed content during the dead period of the offseason. Bettors have used them to whet their insatiable appetite for NFL action. Meanwhile, those of us in the business of fantasy football projection have used them to inform our expectations about game scripts. Because teams run more when they’re winning, we’ve used these way-too-early spreads to identify running backs on teams projected to be favored most often; conversely, teams projected to be underdogs most often should end up passing more.
But here’s the thing. All three of these applications depend on how closely CG’s way-too-early spreads resemble the actual spreads we end up seeing come game time this fall. Bettors benefit if the spreads are inaccurate because that creates opportunities for arbitrage. In contrast, bloggers and fantasy football players benefit if they’re accurate; the latter because more predictable spreads mean more predictable game scripts, which in turn mean more informed projections.
So the reliability of way-too-early spreads is an important, open question. And because reliability analysis is my wont, the rest of this post attempts to provide an answer.
I gathered CG’s way-too-early spreads, actual spreads, and actual game results from Weeks 1-16 of the past four NFL seasons, and then broke the main accuracy question into the following three sub-questions:
- Are way-too-early spreads predictive of actual spreads?
- How do way-too-early spreads compare to actual spreads in terms of predicting actual game results?
- How does the accuracy of way-too-early spreads change as the season progresses (both in terms of actual spreads and actual game results)?
This last question grows out of the simple reality that forecasting should be easier the closer in time we are to the event. If I ask you right now to predict the 49ers’ 22 starters in each of their games, you’re almost certainly going to get more correct for Week 1 than you are for Week 16. In the context of way-too-early spreads, this suggests that they’re reliable indicators if they’re both a) accurate early in the season and b) resilient to time-related accuracy drop-offs (relatively speaking, of course).
But wait, what is “accuracy” exactly? In this study, I used root mean squared error (RMSE) [1]. RMSE does what it says on the tin. Subtract the way-too-early spread from the actual spread, then square that difference. Next, calculate the average of those squared differences across all 960 sampled games, and then take the square root so we’re back in the unit we started with (i.e., expected margin of victory).
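For anyone who wants to see the arithmetic, here is a minimal Python sketch of that recipe. The function name `rmse` and the four spreads are made up for illustration; they are not part of the 960-game sample.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    # Subtract the prediction from the actual value, then square the difference.
    squared_diffs = [(a - p) ** 2 for p, a in zip(predicted, actual)]
    # Average the squared differences, then take the square root to get back to points.
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))

# Hypothetical spreads for four games (illustrative numbers only).
early_spreads = [-3.0, 7.0, -1.5, 2.5]
actual_spreads = [-6.0, 3.0, -1.5, 2.5]

print(rmse(early_spreads, actual_spreads))  # 2.5
```

Two of these made-up games miss by 3 and 4 points and two hit on the number, so the squared differences are 9, 16, 0, and 0. Their average is 6.25, and the square root of that is 2.5 points.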
According to RMSE, the overall accuracy of way-too-early spreads from 2012 to 2015 was 4.05, meaning that they’re typically off by about 4 points from the actual spread, in either direction. Delving a little deeper, about 5 percent of way-too-early spreads nailed the actual spread on the number, 50 percent of misses were within 3 points, 70 percent were within 4 points, and 93 percent were within 7 points.
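A breakdown like that one can be sketched in a few lines of Python. This is only an illustration: the helper `miss_summary` and the five games are hypothetical, and I’m assuming the “within X points” shares are calculated among the misses only, not among all games.

```python
def miss_summary(early, actual, thresholds=(3, 4, 7)):
    """Return (share of exact hits, {threshold: share of misses within it})."""
    errors = [abs(a - e) for e, a in zip(early, actual)]
    # Share of games where the early spread matched the actual spread exactly.
    exact_share = sum(1 for err in errors if err == 0) / len(errors)
    # Assumption: the threshold shares are taken over misses only.
    misses = [err for err in errors if err > 0]
    within = {
        t: (sum(1 for err in misses if err <= t) / len(misses)) if misses else 0.0
        for t in thresholds
    }
    return exact_share, within

# Hypothetical spreads for five games (illustrative numbers only).
early_spreads = [-3.0, 7.0, -1.5, 2.5, 10.0]
actual_spreads = [-6.0, 3.0, -1.5, 3.0, 1.5]

exact_share, within = miss_summary(early_spreads, actual_spreads)
print(exact_share)  # 0.2
print(within)       # {3: 0.5, 4: 0.75, 7: 0.75}
```

Because spreads move in half-point steps, comparing the absolute error to zero is safe here; with messier numbers you would want a small tolerance instead.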
Maybe it’s just the skeptic in me, or maybe it’s years of having “the NFL is the toughest sport to predict” drilled into my head, but I was not expecting that level of accuracy. I just assumed that point spreads released in April or May (before training camps, before preseason, and before any injuries, be they in training camp, preseason, or the regular season) wouldn’t be within a touchdown of actual spreads even 60 percent of the time, let alone over 90 percent of the time.
Of course, it’s important to remember that actual spreads aren’t measures of relative team strength, per se. They’re measures of what the betting public perceives relative team strength to be. With this in mind, perhaps the unexpectedly high accuracy of way-too-early spreads isn’t an indicator that the NFL is more predictable than I thought; it’s actually public perception that’s highly predictable. Maybe, except for obvious situations, people make initial judgments of team strength and hold on to them like grim death for the entire season. It may sound like I’m just spitballing here, but there’s actually a psychological term for this: belief perseverance.
Some additional evidence for this line of reasoning comes from how well (or poorly) way-too-early spreads predict actual margin of victory when compared to the accuracy of actual spreads. For Weeks 1 to 16 from 2012 to 2015, the RMSE for actual spreads was 13.7, while the RMSE for way-too-early spreads was 14.3. So while, again, this result was much closer than I expected it to be, the fact that the errors themselves are so large suggests that the NFL isn’t what’s more predictable than I thought; it’s public perception of the NFL that’s predictable.
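The same RMSE calculation works for this comparison: score each set of spreads against the final margin instead of against each other. The sketch below uses four made-up games and assumes every number is expressed from the home team’s side (positive means the home team is expected to win, or did win, by that many points). That sign convention and the variable names are mine, for illustration only.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of numbers."""
    squared_diffs = [(a - p) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))

# Hypothetical data for four games, all as home-team margins (illustrative only).
actual_margins = [10, -3, 21, -14]   # what happened on the field
closing_spreads = [3, -3, 7, 1]      # expected margin implied by the actual spread
early_spreads = [1, -1, 6, 3]        # expected margin implied by the early spread

closing_rmse = rmse(closing_spreads, actual_margins)
early_rmse = rmse(early_spreads, actual_margins)

print(round(closing_rmse, 2), round(early_rmse, 2))
print("gap between the two:", round(early_rmse - closing_rmse, 2))
```

What matters is the size of the gap relative to the errors themselves. In the real sample, 14.3 versus 13.7 is a gap of 0.6 points on errors of roughly two touchdowns.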
DT : IR :: TL : DR
For the past several years, bloggers, bettors, and fantasy football projectors have used CG Technologies’ way-too-early point spreads to create content, find arbitrage, and predict game scripts, respectively. All three applications depend on the accuracy of those spreads, but no one’s ever tested said accuracy (to my knowledge). In this post, I did just that, and the results (so far) suggest that, although they’re remarkably reliable indicators, what they’re indicating is public perception about NFL games, not what will actually happen in NFL games.
In Part 2, I’ll add the final piece of the puzzle by showing how the accuracy of way-too-early spreads changes from Week 1 to Week 16. It makes sense that they become less accurate as the season progresses, but the more interesting question is, “By how much?”
[1] For the stats nerds out there, yes, I could have used mean absolute error, but penalizing for large misses seemed appropriate in the current context.