Earlier today, Chase Stuart posted an analysis exploring the relationship between how much draft capital NFL teams invest in WRs and how much receiving production they get from those investments. His sample included all WRs drafted between 1970 and 2007, his measure of capital investment was the total amount of draft pick value represented by the picks used to select WRs in a given year, and his measure of production was the percentage of leaguewide receiving yards that WRs in a given draft class accounted for in the first five years of their careers. Ultimately, via correlation analysis, he found that there's no meaningful relationship between investment and return when it comes to drafting WRs since the merger.
But true to his humble-almost-to-a-fault form (which is why he’s someone whose opinion I trust), Chase stressed that his analysis was by no means perfect (“good enough,” you might say), welcoming explanations for what he found or suggestions for how to improve how he attacked the question. So far in the comments section of his post, conceptual explanations abound:
- Teams have increased their hit rate on later-round WRs over time (read: higher returns from lower investments).
- Teams have reduced their susceptibility to the sunk costs fallacy over time (read: lower returns from higher investments).
- NFL rule changes have increased WR production over time. A rising tide lifts all boats, but there are more small boats than large boats (read: higher returns from lower investments).
- Teams lacking good QBs have increasingly tended to compensate by drafting WRs earlier (read: lower returns from higher investments).
Like these commenters, I took the bait hook, line, and sinker. But my contribution to the discussion is more methodological suggestion than conceptual explanation, and Chase's study falls into a larger category of NFL analytics that would greatly benefit from heeding said suggestion. Therefore, rather than commenting, I've chosen to make an I//R post out of it.
Time After Time
I’ll cut to the chase: The major methodological problem with Chase’s analysis — and again, this problem is widespread in NFL analytics — is that simple bivariate (or multivariate) correlations tend to fail as an actionable source of information when the data set is time-dependent. Before the 2015 season, I posted forecasts of Adjusted Net Yards Per Attempt (ANY/A), fantasy passing stats, and fantasy rushing stats based on a statistical method that takes time dependency into account. It turned out that all three came in within the margin of error, so I got that going for me, which is nice.
The same method — namely, Autoregressive Integrated Moving Average (ARIMA) — is appropriate for Chase’s data set because WR production over time is, by definition, a time series. Before getting into my reanalysis of the investment-return effect for drafted WRs, I’ll start with a methodological explanation of why Chase’s correlation-based analysis, although good enough, isn’t correct.
When analyzing a time series data set, one needs to think about whether one data point in the series influences the next. Without delving into details, the following graphs suggest that it does: (The dotted blue lines represent the statistical significance threshold.)
What we want to see here is two-fold: 1) a rapid, continuous descent from Lag=0 to Lag=10, and 2) non-significant values for Lag>0. In both graphs, however, we see that there isn’t a rapid descent and that Lag=3 is significantly different from zero (r = 0.49). What this suggests for analytic purposes is that the data is significantly autocorrelated. In plain terms, the above graphs suggest that WR production for a given draft class depends on WR production for previous draft classes.
But why does significant autocorrelation disqualify correlation analysis as a valid statistical method? In short, it produces incorrect standard errors (a small deal that leads to incorrect inferences) and incorrect regression estimates (read: correlations; a big deal that leads to incorrect inferences). Applying this to Chase’s analysis, his correlations are biased because receiving production for a given WR draft class depends on the production of previous WR draft classes.
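To see why this matters in practice, here's a quick simulation (unrelated to Chase's actual data) showing that two completely independent random walks, which are maximally autocorrelated series, routinely produce large "correlations" out of thin air:

```python
# Demonstration of spurious correlation between autocorrelated series:
# two INDEPENDENT random walks frequently show |r| > 0.5 by construction.
import numpy as np

rng = np.random.default_rng(42)
n_trials, n_points = 500, 38  # 38 points ~ draft classes 1970-2007
big_r = 0
for _ in range(n_trials):
    x = np.cumsum(rng.normal(size=n_points))  # independent random walk
    y = np.cumsum(rng.normal(size=n_points))  # another independent walk
    r = np.corrcoef(x, y)[0, 1]
    big_r += abs(r) > 0.5
print(f"{100 * big_r / n_trials:.0f}% of trials show |r| > 0.5 "
      "between two unrelated series")
```

If plain correlation analysis can manufacture strong relationships between unrelated autocorrelated series, it can just as easily mangle real ones.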
You Wanted The Time, But Maybe I Can’t Do Time
Based on the above, ARIMA is a more appropriate — and admittedly more advanced — statistical technique for the data at hand than correlation analysis because the effect of WR draft value on five-year WR production must account for autocorrelation. Here are the results of said ARIMA (which was a [2,1,0] for those interested):
If you looked at this and went cross-eyed, here’s the interpretation in layman’s terms:
- The first thing to do is to look at the AIC, AICc, and BIC values, which have interpretations akin to R2 in a correlation/regression context, except that lower values are better here. To that point, all three of the above values are worse (i.e., less negative) than the ARIMA that didn’t include draft value as a predictor (AIC = -172.2, AICc = -171.5, BIC = -167.5). In choosing between models, the one with draft value fits our data set worse than the one without it.
- The AR1 and AR2 terms are significantly different from zero, which means that the percentage of WR production in a given year depends on the percentage of WR production in the past two years (e.g., 2001 depends on 2000 and 1999, etc.). AR stands for “autoregression,” so you can think about these two terms as indicators of regression to the mean over time.1
- Even with this more rigorous analysis, WR draft value still isn't a significant predictor of five-year WR production: the coefficient divided by its standard error (.0000/.0002, which is effectively zero) isn't even close to the critical significance value of 1.96.
DT : IR :: TL : DR
Although the end result was the same, what I found via an ARIMA analysis represents statistical evidence that’s more rigorous than the correlation analysis that Chase Stuart reported. That’s because ARIMA accounts for the autocorrelation between the production of a given year’s draft class and the production of recent draft classes.