
Next Gen Stats: New advanced metrics you NEED to know for the 2020 NFL season

The Next Gen Stats team is excited to debut a series of new advanced metrics for the 2020 NFL season. As the NGS toolbox of advanced metrics grows each season, our ability to analyze the game takes new leaps. Data storytelling is about to get even more compelling.

Our team continues to push the envelope of what's possible. With advances in machine learning, we have developed new metrics in a variety of different aspects of the game. From predicting the number of rushing yards a ball carrier will gain from handoff (Expected Rushing Yards) to estimating the chances a team wins in the middle of a game (Live Win Probability), there's a lot to look forward to from Next Gen Stats in 2020.

Here's a quick glance at some of our new stats to watch out for this season:

Expected Rushing Yards

When it comes to quantifying the running game, isolating the performance of an individual ball carrier from the contributions of the offensive line, scheme and situation is a challenging task. The new NGS Expected Rushing Yards model will bring new insights to an area of the game that's been lacking in contextual analysis.

On the individual play level, we can estimate how many rushing yards a ball carrier will gain from the moment of the handoff, along with the likelihood the player gains a first down or scores a touchdown. A statistical breakdown of Nick Chubb's 88-yard touchdown run against the Ravens last season gives us a glimpse of what's possible with Expected Rushing Yards.

At the game, season and multi-season level, we can compare a running back's efficiency relative to the rest of the league, find game situations and play characteristics where players excel or struggle, and break down O-line performance and coaching schemes.
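To make the play-level outputs concrete, here is a minimal sketch. It assumes the model's prediction for a single carry can be represented as a probability for each possible yardage gain; that representation, the function name `summarize_rush` and every number below are illustrative assumptions, not actual NGS output.

```python
def summarize_rush(yard_probs, yards_to_first, yards_to_goal):
    """Summarize a predicted yardage distribution for one carry.

    yard_probs maps yards gained -> probability (hypothetical values).
    Returns (expected yards, P(first down), P(touchdown)).
    """
    total = sum(yard_probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    # Expected yards is the probability-weighted average gain.
    expected = sum(yards * p for yards, p in yard_probs.items())
    # A first down needs at least yards_to_first; a touchdown needs yards_to_goal.
    p_first = sum(p for yards, p in yard_probs.items() if yards >= yards_to_first)
    p_td = sum(p for yards, p in yard_probs.items() if yards >= yards_to_goal)
    return expected, p_first, p_td


# Made-up distribution for a 2nd-and-5 carry from the opponent's 25-yard line.
yard_probs = {-2: 0.10, 0: 0.15, 2: 0.30, 4: 0.20, 6: 0.10, 10: 0.10, 25: 0.05}
expected, p_first, p_td = summarize_rush(yard_probs, yards_to_first=5, yards_to_goal=25)

# Comparing the actual gain to the expectation isolates the runner's contribution.
actual_gain = 10
yards_over_expected = actual_gain - expected
```

Averaging the difference between actual and expected yards over many carries is one way to compare ball carriers across a game or season.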

In our broader explainer article for this metric published in July, we went deep into the background and methodology behind Expected Rushing Yards, and how two data scientists from Austria, winners of the 2020 Big Data Bowl competition, established the basis of the final NGS model.

Route Recognition

Conventional counting stats -- like receptions and receiving yards -- provide a way to measure an individual player's ability to catch and move the football, but they only tell part of the story. Advanced stats -- like depth of target, separation window and completion probability -- provide greater insight, but they still leave out an important factor. Namely, which route did the pass catcher run to get open before catching the ball?

A model that can classify routes can answer a long list of questions:

Who were the best-performing wide receivers by route type? Best-performing quarterbacks by route? Does a specific receiver have a tendency to run certain routes relative to the situation?

We can analyze speed metrics relative to route types, assess how well a receiver gets open, explore how a defense (or individual defender) performs against specific routes -- the list could go on.

NFL.com's Nick Shook recently broke down the best wide receivers from the 2019 season by route type. Among the key findings: Saints superstar Michael Thomas dominates underneath routes.

A full breakdown of the Route Recognition modeling methodology, initial insights into route analysis and a look at the most and least versatile route runners from 2019 can be found in our August explainer article.

Live Win Probability

Predicting the outcome of an NFL game has traditionally been a task done in the days leading up to game day, before the two teams actually take the field. The growth of football analytics in the public space has led to a rise in win-probability models -- that is, models estimating the likelihood a team wins while the game is still in progress.

The NGS team has developed our own Live Win Probability model that evaluates the likelihood of either team winning at any moment between plays in the game. The model, trained on every historical play in the last 10 seasons, looks at the score differential, down-and-distance, time remaining, timeouts remaining, expected points and team quality. New for the 2020 campaign: Fans will be able to track Live Win Probabilities for every game this season at nextgenstats.nfl.com.

There are several ways to interpret metrics from a win-probability model beyond a single probabilistic prediction. Win Probability Added (WPA) -- the difference between a team's win probability before a play and after a play -- gives us a new metric to evaluate the play's influence on the outcome of the game. We can analyze coaching decisions -- like going for it on fourth down or kicking a field goal -- to better understand the opportunity cost of specific strategies.
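As a rough illustration of WPA, the sketch below subtracts a team's win probability before a play from its win probability after the play. The `Play` class, the play descriptions and the probabilities are made up for the example; they do not come from the NGS model.

```python
from dataclasses import dataclass


@dataclass
class Play:
    description: str
    wp_before: float  # team's win probability before the play (0 to 1)
    wp_after: float   # same team's win probability after the play (0 to 1)


def win_probability_added(play: Play) -> float:
    """WPA: change in win probability caused by the play (after minus before)."""
    return play.wp_after - play.wp_before


# Hypothetical sequence of plays for one team.
plays = [
    Play("4th-and-1 conversion", 0.52, 0.61),
    Play("sack on 3rd down", 0.61, 0.55),
    Play("go-ahead touchdown", 0.55, 0.78),
]

wpa_by_play = {p.description: round(win_probability_added(p), 2) for p in plays}

# The play with the largest swing in either direction had the most influence.
most_influential = max(plays, key=lambda p: abs(win_probability_added(p)))
```

A negative WPA means the play hurt the team's chances, so ranking by absolute value surfaces the biggest swings regardless of direction.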

One of the key features of our Live Win Probability model, Expected Points, represents a per-play standardized value to quantify the success of a play. Prior to completing our Live Win Probability model, we first had to build an Expected Points model that uses historical data to determine the (expected) number of points that the team will eventually score on the drive given the current game situation. That is, instead of quantifying the success of a play by the number of yards gained, we can estimate the success of a play by Expected Points Added (EPA).

Expected Points was first introduced by former 49ers quarterback Virgil Carter in 1971, made popular by Bob Carroll, Pete Palmer and John Thorn in 1988's The Hidden Game of Football, and expanded upon by Brian Burke in 2014 and, most recently, Ron Yurko, Samuel Ventura and Maksim Horowitz in 2018.

Expected Points has applications beyond win probability. Using Expected Points Added as an all-encompassing measure of play success, aggregate metrics (like Total EPA or EPA per play) can give us a more accurate representation of team or player performance than the traditional box score.
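A small sketch of how these aggregates could be computed, assuming each play comes with an Expected Points value before and after it. The player labels and values are invented, and treating a touchdown as 7.0 points afterward is a simplifying assumption.

```python
from collections import defaultdict

# (player, expected points before the play, expected points after the play)
# All values are hypothetical.
plays = [
    ("QB A", 1.2, 2.5),
    ("QB A", 2.5, 1.9),
    ("QB B", 0.8, 3.1),
    ("QB A", 1.9, 7.0),  # touchdown, valued at 7 points (assumes a made extra point)
    ("QB B", 3.1, 2.4),
]


def summarize_epa(plays):
    """Return Total EPA and EPA per play for each player."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for player, ep_before, ep_after in plays:
        # EPA for one play is the change in Expected Points.
        totals[player] += ep_after - ep_before
        counts[player] += 1
    return {
        player: {
            "total_epa": totals[player],
            "epa_per_play": totals[player] / counts[player],
        }
        for player in totals
    }


summary = summarize_epa(plays)
```

Total EPA rewards volume, while EPA per play measures efficiency, so the two can rank players differently.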

The last four quarterbacks to lead the league in Total Expected Points Added in a single season each won the NFL Most Valuable Player Award in that same campaign. "Leading the league in passing" is often a reference to the quarterback with the most total passing yards. Perhaps it's time to reconsider and instead spotlight the QB with the most Expected Points Added.

Field Goal Probability (2020 update)

Last season, we rolled out a Field Goal Probability model to estimate the likelihood of a made field goal, given the distance of the kick, weather and stadium type. The model used logistic regression to predict the probability of a successful field goal.

This year, we've updated the model. Using a modeling technique called ensembling, our new and improved Field Goal Probability model blends the estimate from the same logistic regression model used in 2019 with that of an additional XGBoost model (a non-linear, tree-based method), which does a better job of predicting longer field goals (55-plus yards) than the logistic regression.
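The general idea of blending two models can be sketched as a weighted average of their predictions. Everything below is a stand-in: the logistic coefficients, the distance buckets (a toy substitute for a tree-based model, not XGBoost itself) and the 50/50 weighting are assumptions, since the actual features, weights and training details are not described here.

```python
import math


def logistic_fg_prob(distance_yards: float) -> float:
    """Stand-in for a logistic regression model (made-up coefficients)."""
    intercept, slope = 5.7, -0.1  # hypothetical values, not fitted to real kicks
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance_yards)))


def tree_fg_prob(distance_yards: float) -> float:
    """Stand-in for a tree-based model: piecewise-constant by distance bucket."""
    buckets = [(30, 0.97), (40, 0.90), (50, 0.78), (55, 0.62), (60, 0.45)]
    for upper_bound, prob in buckets:
        if distance_yards < upper_bound:
            return prob
    return 0.25  # 60 yards or longer


def blended_fg_prob(distance_yards: float, tree_weight: float = 0.5) -> float:
    """Ensemble estimate: weighted average of the two model outputs."""
    p_logistic = logistic_fg_prob(distance_yards)
    p_tree = tree_fg_prob(distance_yards)
    return (1.0 - tree_weight) * p_logistic + tree_weight * p_tree


prob_57 = blended_fg_prob(57)
```

Because the blend is a weighted average, the ensemble estimate always lies between the two individual model estimates.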

By controlling for the level of difficulty of each kick, we can better contextualize kicker performance relative to expectations. At the season and multi-season level, the difference between a kicker's actual field goal percentage and expected field goal percentage gives us field goal percentage over expectation (FGOE).
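A minimal sketch of the FGOE calculation, assuming each attempt is stored with its result and its predicted make probability. The function name and the four kicks are hypothetical.

```python
def fg_pct_over_expected(kicks):
    """FGOE: actual field goal percentage minus expected field goal percentage.

    kicks is a list of (made, make_probability) pairs, one per attempt.
    """
    attempts = len(kicks)
    if attempts == 0:
        raise ValueError("need at least one field goal attempt")
    actual_pct = sum(1 for made, _ in kicks if made) / attempts
    # Expected percentage is the average predicted make probability.
    expected_pct = sum(prob for _, prob in kicks) / attempts
    return actual_pct - expected_pct


# Hypothetical kicker: three makes in four attempts of varying difficulty.
kicks = [(True, 0.95), (True, 0.80), (False, 0.60), (True, 0.45)]
fgoe = fg_pct_over_expected(kicks)
```

Here the kicker made 75 percent of his attempts against an expected 70 percent, so he finishes 5 percentage points over expectation.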

How can we apply field goal probability beyond individual kicker evaluation? The ability to measure field goal probability in real time enhances our new Win Probability model in late-game situations -- when the probability of making a kick in a given situation has a profound effect on the likelihood of winning the game.

Expected Yards After Catch (2020 update)

Our Expected Yards After Catch model, which we originally debuted in 2018, will be replaced with the same modeling structure as the Expected Rushing Yards model. Just like on run plays, the new EYAC model (combined with Completion Probability) will also have the ability to estimate outcome probabilities like first downs and touchdowns, in addition to a single point estimate.
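One plausible way to combine the two models for a pass play is to multiply the chance the pass is completed by the chance the receiver scores once he has the ball. This is an assumption about how the pieces fit together, not a description of the NGS implementation; the function name and the inputs are hypothetical.

```python
def pass_touchdown_probability(completion_prob: float, td_prob_given_catch: float) -> float:
    """Chance of a touchdown on a pass: P(completion) * P(touchdown | catch).

    Assumes the after-catch touchdown estimate is conditional on the catch being made.
    """
    for name, value in (("completion_prob", completion_prob),
                        ("td_prob_given_catch", td_prob_given_catch)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return completion_prob * td_prob_given_catch


# Hypothetical contested deep ball: 40% to be caught, 10% to score after the catch.
td_prob = pass_touchdown_probability(0.40, 0.10)
```

A lower combined probability means a more improbable touchdown when the play does end in a score.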

Keep an eye out this season for a weekly video across NFL digital platforms featuring Most Improbable Touchdowns of the Week -- which combines Completion Probability and Expected Yards After Catch for pass plays and Expected Rushing Yards for run plays.

The Next Gen Stats LIVE Experience (2020 update)

The dots are back! Available in select games for the 2020 season, the Next Gen Stats LIVE Experience gives fans a new way to follow the action on game day with an exclusive second-screen experience.

In addition to the live dots field view debuted last season, new features within the experience include Live Win Probability charts, visual drive charts, big-play highlights and enhanced advanced metrics available nowhere else.

That's not all! We have added new statistics like dropped passes to improve passing and receiving metrics. We can measure and aggregate distance and speed metrics that include the time in between plays. And we can derive new fantasy football metrics, like Expected Fantasy Points, using Completion Probability, Expected YAC and Expected Rushing Yards to quantify a player's actual fantasy performance relative to expectations.

When we first began tracking players in 2015, speed and distance measures were the primary stories derived from Next Gen Stats. Just six seasons later, our advanced metrics toolbox now includes several machine learning models spanning techniques like logistic regression, boosted trees and convolutional neural networks.

This is only an introduction to these new advanced metrics. There is plenty more to come from the Next Gen Stats team this season. Stay tuned!

-- Mike Band, Next Gen Stats Analyst. Follow Mike on Twitter @MBandNFL.
