In 2005, one of Aaron Schatz’s so-called “Hilbert problems” for quantitative football analysts concerned the difficulty of projecting college players into the NFL. Ten years later, academic studies have only been able to (weakly) predict players’ draft slot, and non-academic studies have only been able to (weakly) predict players’ future NFL performance.
In December’s issue of the Journal of Quantitative Analysis in Sports (JQAS), the Wharton School’s Jason Mulholland and Shane Jensen (M&J) added to our knowledge base by using both Combine-related variables and college performance-related variables to project the future NFL performance of college tight ends (TEs). Unfortunately, the study is behind a paywall and has only sparsely been promoted outside of academic circles. Therefore, for the benefit of the non-academic population, I’m going to go over the important details of M&J’s research today. In Part 2, I’ll identify those details that I liked or disliked, offer some general observations about their findings, and use their models to project the 2015 TE draft class.
The Basics of M&J’s Methods
To their credit, M&J attacked the problem of projecting college TEs about as comprehensively as possible, testing five outcomes and 17 predictors via a pair of statistical techniques that come from different academic disciplines. Specifically, their five outcomes were
- Draft slot
- Games Started
- Career Score, which adjusts receiving yards based on Chase Stuart’s finding that a passing touchdown is worth 19.3 yards.
- Career Score per Game
- Career Approximate Value, which was created by Pro Football Reference founder Doug Drinen
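To make the Career Score idea concrete, here is a minimal sketch. It assumes the metric is simply receiving yards plus 19.3 yards per receiving touchdown; that exact formula, the function names, and the sample stat line are my assumptions for illustration, not something quoted from M&J.

```python
# Assumed yardage value of one touchdown (Chase Stuart's 19.3-yard estimate).
TD_YARD_VALUE = 19.3


def career_score(rec_yards: float, rec_tds: int) -> float:
    """Touchdown-adjusted receiving yards (assumed form: yards + 19.3 * TDs)."""
    return rec_yards + TD_YARD_VALUE * rec_tds


def career_score_per_game(rec_yards: float, rec_tds: int, games: int) -> float:
    """Career Score divided by games played; 0.0 for a player with no games."""
    if games <= 0:
        return 0.0
    return career_score(rec_yards, rec_tds) / games


# Hypothetical stat line, invented purely for illustration.
example_score = career_score(rec_yards=1000, rec_tds=10)
example_per_game = career_score_per_game(rec_yards=1000, rec_tds=10, games=16)
```

Under this assumed form, a touchdown-heavy TE gets credit beyond his raw yardage, which is the point of the adjustment.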
M&J’s predictors included
- Combine size variables: Height (Ht), Weight (Wt), and Body-Mass Index (BMI)
- Combine event variables: Forty-yard dash (40), bench press repetitions (Bench), vertical leap (Vert), broad jump (Broad), 20-yard shuttle run (Shuttle), and three-cone drill (3CD)
- College performance variables: BCS indicator, Receptions (CollRecs), Yards (CollYds), Yards per Reception (CollYds/Rec), and Touchdowns (CollTDs), as well as three “recency” metrics representing the proportion of total CollRecs, CollYds, and CollTDs that TEs accumulated during their final year (FY%).
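Two of these predictors are derived rather than measured directly, so a short sketch may help. It assumes the standard imperial BMI formula and a simple final-year-over-career ratio for the recency metrics; M&J's exact definitions are not spelled out here, and the function names and sample numbers are hypothetical.

```python
def bmi(height_in: float, weight_lb: float) -> float:
    """Standard imperial BMI: 703 * pounds / inches squared."""
    return 703.0 * weight_lb / height_in ** 2


def final_year_share(final_year_total: float, career_total: float) -> float:
    """Proportion of a career college total that came in the final season."""
    if career_total == 0:
        return 0.0  # assumption: treat an empty career as a 0% share
    return final_year_total / career_total


# Hypothetical prospect: 6'5" (77 inches), 250 lbs,
# with 600 of his 1,000 college receiving yards in his final year.
prospect_bmi = bmi(height_in=77, weight_lb=250)
prospect_yds_fy = final_year_share(final_year_total=600, career_total=1000)
```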
M&J’s statistical techniques were stepwise ordinary least squares (OLS) regression and decision tree learning. [1] Again, using two types of analyses that come from highly divergent schools of statistical thought is something for which M&J should be applauded.
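For readers unfamiliar with decision tree learning, the core step is easy to sketch: pick the cutoff on a predictor that splits players into two groups whose outcomes are as internally similar as possible. The code below is a minimal, single-split illustration of that idea, not M&J's actual procedure or software. The toy numbers are invented, and the names `best_split`, `coll_yds`, and `nfl_score` are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the split 'x < threshold' with the lowest SSE.

    Returns None when every x is identical, since no split is possible.
    """
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent distinct x values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Toy data, invented for illustration only (not from M&J's sample).
coll_yds = [300, 450, 600, 900, 1100, 1400]
nfl_score = [50, 80, 60, 900, 1100, 1000]
threshold, error = best_split(coll_yds, nfl_score)
```

A full tree repeats this search on each side of the split, across every predictor, until a stopping rule kicks in. Stepwise OLS works differently: it adds or drops predictors from a linear model one at a time according to a fit criterion.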
Basics of M&J’s Results
Author’s note: Throughout this entire section, I’m only going to refer to significant stepwise OLS results at the p<0.10 level. For instance, College TDs having a p-value of 0.23 in their Career Score analysis makes me highly skeptical of its practical importance, no matter that their specific stepwise procedures ended up including it in the model.
Draft Slot
M&J’s stepwise OLS had an R² of 0.23 and found that taller Ht, a faster 40, more Bench reps, and more CollYds predicted getting selected earlier in the draft. Their decision tree (R² = 0.35) found these same metrics to be influential, along with Vert, Wt, and 3CD. The most extreme draft outcomes in this latter analysis comprised the following predictor combinations:
- Highest-drafted: 797 or more CollYds, a 40 faster than 4.70s, and a Wt of 252 lbs or more.
- Lowest-drafted: Fewer than 797 CollYds, a Vert lower than 34″, and fewer than 19 Bench reps.
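The two extreme draft-slot combinations above can be read as simple if/then rules. The sketch below encodes them that way, assuming each combination is a plain conjunction of its conditions; the order in which M&J's tree actually makes these splits is not given here, and the function name and sample prospects are hypothetical.

```python
def draft_leaf(coll_yds, forty, wt, vert, bench):
    """Place a TE in one of the two extreme draft leaves, else 'other'."""
    if coll_yds >= 797:
        # Highest-drafted: productive, fast, and heavy enough.
        if forty < 4.70 and wt >= 252:
            return "highest-drafted"
        return "other"
    # Lowest-drafted: unproductive, with a poor vertical and few bench reps.
    if vert < 34 and bench < 19:
        return "lowest-drafted"
    return "other"


# Hypothetical prospects, invented for illustration only.
print(draft_leaf(coll_yds=900, forty=4.60, wt=255, vert=36, bench=22))
print(draft_leaf(coll_yds=500, forty=4.85, wt=245, vert=31, bench=15))
```

Every other combination of measurements falls into one of the tree's intermediate leaves, which this sketch lumps together as "other".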
Games Started
M&J’s stepwise OLS had an R² of 0.28 and found that lower Wt, higher BMI, longer Broad, higher CollYds/Rec, and higher CollRecs predicted more NFL starts. Their decision tree (R² = 0.27) included Wt, BMI, and Broad, but also found CollYds, Shuttle, Bench, and BCS to be influential. The most extreme games started outcomes comprised the following predictor combinations:
- Most starts: 797 or more CollYds, a Broad of 112″ or more, a Wt of 247 lbs or more, and attendance at a BCS school.
- Fewest starts: Fewer than 474 CollYds, a BMI lower than 34.7, a Shuttle of 4.14s or slower, and fewer than 26 Bench reps.
Career Score
M&J’s stepwise OLS had an R² of 0.26 and found that a faster 40, a longer Broad, more CollYds/Rec, more CollRecs, and more CollTDs predicted a more productive NFL receiving career. Their decision tree (R² = 0.35) included 40, Broad, and CollRecs, but also found CollYds, Wt, and Shuttle to be influential. The most extreme Career Scores comprised the following predictor combinations:
- Best TEs: A 40 of 4.69s or faster, a Broad of 120″ or more, and at least 65 CollRecs.
- Worst TEs: A 40 slower than 4.69s, fewer than 1,093 CollYds, fewer than 58 CollRecs, and a Shuttle of 4.30s or slower.
Career Score per Game
M&J’s stepwise OLS had an R² of 0.25 and found that a faster 40, more Bench reps, a longer Broad, attending a BCS school, and more CollYds predicted more per-game receiving production at the NFL level. Their decision tree (R² = 0.31) included 40 and CollYds, but also found CollYdsFY% and CollTDsFY% to be influential. The most extreme Career Scores per Game comprised the following predictor combinations:
- Best TEs: At least 1,079 CollYds and a CollYdsFY% of 55.4% or more.
- Worst TEs: Fewer than 947 CollYds and a 40 of 4.66s or slower.
Career Approximate Value
For this outcome, M&J only performed a stepwise OLS, which had an R² of 0.31. In this model, more CollYds, a faster 40, a longer Broad, and bigger size (i.e., higher Ht, Wt, and BMI) predicted higher Career Approximate Value.
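Every model in this section is summarized by its R², the share of variation in the outcome that the model accounts for. As a reminder of what that number measures, here is a minimal sketch of the usual calculation (one minus the ratio of residual to total sum of squares). The function name and the sample values are invented for illustration and have nothing to do with M&J's data.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented numbers, for illustration only.
actual_values = [10, 20, 30, 40]
model_predictions = [12, 18, 33, 37]
fit = r_squared(actual_values, model_predictions)
```

By that yardstick, values in the 0.23 to 0.35 range mean that most of the variation in TE outcomes is left unexplained by these models.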
DT : IR :: TL : DR
Mulholland and Jensen (2014) used multiple statistical techniques to gauge the influence of multiple size, Combine, and college predictors on multiple NFL outcomes at the TE position. Taking a holistic view of their results, here are the factors for TE success that they found to be common between models:
- A faster 40 means getting drafted earlier and ending up with both more receiving production and a higher Career Approximate Value in the NFL.
- A longer Broad means more games started, a higher Career Score, a higher Career Score per Game, and a higher Career Approximate Value.
- More CollYds/Rec and more CollRecs mean more games started and a higher Career Score.
[1] If you’re not familiar with how these work, click the links.