Understanding ‘WAR’ and its Practical Applications to Player Evaluation (Part 1)

(OSA’s WAR ‘Explainer’ – Part 1)

This article is being co-posted on Maple Leafs Hot Stove as well as on my own site, http://www.originalsixanalytics.com. Find me @michael_zsolt on Twitter.

@DTMAboutHeart of Hockey-Graphs.com and I recently exchanged tweets about the ‘WAR’ metric.

Twitter Exchange

Twitter Exchange pt2

This exchange brought to my attention that, despite the great work that has been done by the creators of WAR, it is a very complex metric. As a result, many don’t know exactly what ‘Wins Above Replacement’ (WAR) is, or how it works in hockey. It also struck me that, currently, much of the hockey community falls into one of two groups:

  1. Those who don’t know about, or don’t fully understand, WAR (or similar metrics), and thus ignore them, and
  2. Those who understand WAR, but think it is not yet developed enough to fulfill the original ‘single number’ dream

(As well as perhaps a third group: teams that are secretly using WAR but not telling anyone… possibly evidenced by the creators of WAR now both being employed by professional sports teams).

As such, I am writing this series as an attempt to amplify the work done in this area. My main objectives are to increase the number of people in the conversation, as well as to demonstrate some of the practical applications, and limitations, of using the WAR metric to evaluate players. I will be focusing solely on the WAR metric developed in 2014/2015 by the great team at WAR-On-Ice.com (WOI), who are sadly leaving the public hockey world on March 31st, 2016. For those who don’t know, they are providing access to their entire database’s raw data up until that date. For reference beyond March, I have also uploaded their raw WAR/GAR by season output to my own website.

Further credit is owed to Tom Awad’s work on ‘Goals Versus Threshold’, a very similar stat that the WOI ‘WAR’ metric built upon. Last, I also want to give credit to Moneypuck’s excellent 2015 ‘Building a Contender’ series, where he applied GAR to building a winning franchise, which I have cited multiple times in the article below.

So – before I get into WAR itself – why should you care?

Why is WAR Important?

I don’t think anyone will disagree with the following:

  1. The purpose of hockey is to win the game
  2. A team wins the game by scoring more goals than its opponent
  3. A player’s ‘ultimate’ contribution to his team is defined by his ability to improve his team’s goal differential (e.g. to increase goals scored and decrease goals against)

Although these three things are very basic, they are the foundation of why WAR is an important metric. It is easy to evaluate players strictly on the stats everyone has readily available: Goals, Points, Assists, etc. However, the classic example of where these fall short is the high-flying scorer who gives up two shots/goals against for every one that he gets. Corsi has become very popular for addressing this through an adjusted, game state-specific plus/minus, based on shot attempts. Although Corsi and WAR are built off similar concepts, ‘WAR’ tries to take shot rate metrics like Corsi, combine them with other factors, and then tie the result directly to the column on the score-sheet that matters the most: Wins.

What is WAR?

As mentioned, ‘WAR’ is a metric that attempts to combine a player’s contributions in offensive, defensive, and other aspects of the game, into the number of ‘Wins’ he contributes to his team. A fundamental concept of WAR is that it constantly compares NHL players to a set of ‘baseline’ expectations. This baseline is similar to looking at performance versus league average, though it ends up closer to the league ‘minimum’.

Baseline expectations are important because of player value, and cost: an NHL team should only pay more than a bottom-tier salary if a player is contributing more than bottom-tier results. Thus, ‘baseline’ (or ‘replacement’) level players represent the quality of player that could be acquired for relatively little salary/cost on the free agent or trade markets. WOI calls their method the ‘Poor Man’s Replacement’, as they derive it based on players who have limited NHL experience. Conceptually, these are the players called-up to fill a 3rd/4th line role when injuries require it.

Finally, in order for WAR to convert performance into wins, we must first derive ‘goals’ contributed. Eric T has shown that roughly 6 goals = 1 win – directly connecting goals above baseline into wins. Ironically, many people have now realized that ‘Goals Above Replacement’, or ‘GAR’, is actually easier to interpret than WAR, especially for individual players. As a result, from here on I will largely focus on GAR, though you should keep in mind that the two metrics are interchangeable at the rate of 6:1.
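For anyone who wants to move between the two scales, the 6:1 rule of thumb can be sketched in a few lines of Python. The function names and the `GOALS_PER_WIN` constant are my own labels for this illustration, not anything taken from WOI’s code, and the ratio is the approximate figure cited above rather than an exact constant.

```python
# Approximate exchange rate cited in the article: about 6 goals per win.
GOALS_PER_WIN = 6.0


def gar_to_war(gar: float, goals_per_win: float = GOALS_PER_WIN) -> float:
    """Convert Goals Above Replacement into Wins Above Replacement."""
    return gar / goals_per_win


def war_to_gar(war: float, goals_per_win: float = GOALS_PER_WIN) -> float:
    """Convert Wins Above Replacement back into Goals Above Replacement."""
    return war * goals_per_win


if __name__ == "__main__":
    # A few round GAR values, chosen only to show the scale.
    for gar in (0, 3, 12, 24):
        print(f"GAR {gar:>2} is roughly WAR {gar_to_war(gar):.2f}")
```

Keeping the ratio as a parameter makes it easy to test how sensitive a conclusion is to the goals-per-win assumption.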

One last comment before the data…

A Caveat on ‘Catch-All’ Stats in Hockey:

As Michael Lewis’ Moneyball has shown the world, Baseball is the perfect sport for a WAR-type metric. Hitting, fielding and pitching all arguably equate to individual skills disguised within a team game, easily allowing statisticians to separate out individual contributions versus context, and noise.

Hockey, on the other hand, is a sport where it is very difficult to create a single statistic that will summarize ‘all’ of a player’s contribution in one number. As illustrated in this helpful post by Eric T from 2013, it can (conceptually) be almost impossible to perfectly adjust for all aspects of the game at once. However, those who have trudged through the Road to WAR series will see the extreme amount of adjusting for context that WOI has done, where they simultaneously controlled for teammates, opponents, game-state, and many other things – getting all the way down to elements as seemingly minor as travel fatigue (e.g. home/away team performance, and impact of playing on back to back nights). This level of detail and rigor suggests to me that WAR is among the most advanced publicly available stats to date.

Regardless of whether you choose to place much value on WAR/GAR, I want to emphasize that no metric will ever justify ignoring other methods of player evaluation. Given WAR/GAR says nothing of a player’s role, Rob Vollman’s player usage charts are a very complementary tool to use alongside it. I also encourage the uninitiated to check out Eric T’s straightforward primer on different metrics that can be used for player evaluation.

Now – back to GAR, and finally, some actual numbers.

What is a ‘good’ GAR/WAR score versus a bad one?

At a high level:

  • If a player has a ‘GAR’ of zero – they are equivalent to a baseline/replacement level player
  • If a player has a positive GAR, they are ‘better’ (at contributing to their team’s goal differential) than a baseline player
  • If a player has a negative GAR, they are worse than a baseline player

In order to be a bit more specific, let’s look at some data from Moneypuck’s series, which was very insightful on this front. His third article looks at:

  1. The GAR scores of every NHL player from each season in his sample
  2. The GAR scores of the players from the four conference finalist teams each year

The two charts below summarize his data.

(Note: I haven’t made any changes to his data – these are the same numbers shown in a different format)

GAR Distribution - All Players

Top Team - Player Count by GAR

From this data we can make a few observations:

  • It is very difficult for a player to pass a GAR score of even 10 in a given season
    • The first chart shows that fewer than 14% of players achieve this each year, and the second shows that even conference final-reaching teams usually only have ~4.5 players with a GAR of 10+
  • Even fewer have a GAR of 15+, at approximately 5.9%, or only 1 in 17 players in the league
  • Last – as highlighted on the first graph – over 70% of the seasons played in the NHL score a GAR of 5 or lower
    • Put differently, 70% of NHL players fluctuate in and around the league minimum level of contribution to their team’s goal-differential
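If you download the raw GAR data yourself, the percentages above are simple to reproduce. Below is a minimal sketch of that calculation; the `sample` values are made up purely for illustration (they are not Moneypuck’s or WOI’s numbers), and the band labels are my own shorthand for the cut-offs discussed in this section.

```python
from typing import Sequence


def share_at_or_above(gar_values: Sequence[float], threshold: float) -> float:
    """Fraction of player-seasons whose GAR is at or above `threshold`."""
    if not gar_values:
        raise ValueError("need at least one player-season")
    return sum(1 for g in gar_values if g >= threshold) / len(gar_values)


def gar_band(gar: float) -> str:
    """Place a single-season GAR into the bands discussed in the text."""
    if gar < 0:
        return "below 0"
    if gar <= 5:
        return "0 to 5"
    if gar < 10:
        return "5 to 10"
    if gar < 15:
        return "10 to 15"
    if gar < 20:
        return "15 to 20"
    return "20+"


# Made-up GAR values for ten player-seasons, for illustration only.
sample = [-2.0, 0.5, 1.0, 3.0, 4.5, 5.0, 7.0, 9.0, 12.0, 21.0]

if __name__ == "__main__":
    print(f"Share at 10+: {share_at_or_above(sample, 10):.0%}")
    print([gar_band(g) for g in sample])
```

Run against the full league data, `share_at_or_above` with thresholds of 10 and 15 should give figures comparable to the ones quoted above.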

Now, in order to illustrate how various players are scoring in terms of GAR – I have summarized the following table for you to compare against your own, personal eye-test.

Example Players by GAR Range

Hopefully this table helps to set some benchmarks in your own mind about how various players score on GAR. This table also makes it clear that Goalies and Defensemen are under-represented in the top GAR ranges, when looked at on a three-year average basis. This highlights an important qualifier of the GAR metric: like most NHL player evaluation, it currently best evaluates a player’s offensive contributions.

Looking at the components of GAR (the next section) will explain why: defensemen will largely contribute to just one or two of the six components (e.g. impact on shot rates), while forwards will contribute to shot rates while also providing material contribution through their shooting percentage, face-offs, and penalty drawing. As a result, when using the current GAR metric to evaluate players, it will be most accurate to compare players within positions, rather than across them. For what it’s worth, WOI previously hoped to add other defensive components to WAR, as well as a measure of play-making ability, helping to offset this gap. However, the closure of their site means these areas will not be publicly incorporated until another brave statistician picks up where they left off.

Now that we know why GAR is important, what a good/bad score is, and who typically scores where – what factors is this number actually considering?

What Are the Components of GAR/WAR?

As WAR-On-Ice has already given a very detailed summary of the math behind the metric, I’ll instead focus on the big picture of its component parts. WAR is currently made up of the following six components for skaters, which I have grouped into the three broad categories below:

Offensive Contributions

  • Shot rate for
  • Shooting percentage

Defensive Contributions

  • Shot rate against

‘Gameplay’ Contributions

  • Faceoff win percentage
  • Ability to draw penalties
  • Ability to avoid taking penalties

Whether or not you are familiar with statistics, I think most of us can agree that increasing your team’s shot rate differential, shooting percentage, faceoff percentage, and power play opportunities, while decreasing your team’s minutes on the penalty kill, are all going to help contribute to goals and wins. As a reminder, each of these has its own definition of ‘replacement level’ that GAR is calculated against. One last side note: Goalies are calculated as their own category, based on Sv%, which I have omitted here.
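To make the grouping concrete, here is a small sketch of how the six skater components could be stored and rolled up into the three categories. It assumes each component is already expressed in goals above replacement and that a skater’s total is simply the sum of the six, which is how I have treated the data in this article. The `SkaterGAR` class and its field names are hypothetical, and the example values are invented rather than taken from any real player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkaterGAR:
    """One skater-season; every field is in goals above replacement."""

    shot_rate_for: float
    shooting_pct: float
    shot_rate_against: float
    faceoffs: float
    penalties_drawn: float
    penalties_taken: float  # positive means the player avoids taking penalties

    @property
    def offensive(self) -> float:
        return self.shot_rate_for + self.shooting_pct

    @property
    def defensive(self) -> float:
        return self.shot_rate_against

    @property
    def gameplay(self) -> float:
        return self.faceoffs + self.penalties_drawn + self.penalties_taken

    @property
    def total(self) -> float:
        return self.offensive + self.defensive + self.gameplay


if __name__ == "__main__":
    # Invented values for a hypothetical skater, for illustration only.
    player = SkaterGAR(4.0, 2.0, 3.0, 1.0, 0.5, -0.5)
    print(player.offensive, player.defensive, player.gameplay, player.total)
```

Because a positive number is always the better result, a negative `penalties_taken` value here means the hypothetical player costs his team goals by taking more penalties than a replacement-level skater would.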

Now, the fun part: 

How to Analyze a Player’s WAR/GAR – A Case Study

In order to demonstrate the various components of GAR, I have chosen a player that we all know, and who also happens to contribute at both ends of the ice: star two-way center, Jonathan Toews.

Looking at Toews’ GAR metrics in the 2013-2014 season gives us the following:

Toews 6 bars

Keep in mind: across all components, a positive GAR is a better result. E.g., Toews’ positive ‘Shot Rate Against’ GAR score (3.6) means he has been successful in reducing shot attempts against.

The above data shows us that:

  • Although Toews is a major offensive threat, a significant amount of his impact comes from his defensive and ‘gameplay’ contributions, e.g. shot suppression, face-offs, and ability not to take penalties
  • In 2013-14 Toews won more than half a game (i.e. ~3 GAR) in his face-off percentage alone, putting him among the league’s best
  • Toews contributes most offense through his possession and shot-attempt driving capabilities (shot-rate), rather than by being a sniper
    • In contrast, while not shown here, Patrick Kane contributes more through his strong shooting percentage, often scoring 6-7 GAR per season in Sh% alone

Introducing a style of chart I will label ‘GAR Bars’, we can summarize a player’s total GAR contribution back into those three major categories:

Toews 2013-2014 GAR BAR 

This chart shows the exact same data as the prior one; the only difference is that now I have aggregated each of the six components into their general buckets. For two final illustrations, we can expand on this bar to analyze Toews based on his GAR over time. The charts below show (i) Toews’ absolute GAR contribution over a number of seasons, as well as (ii) Toews’ relative GAR contribution, showing each category as a portion of his total.

Historical Absolute GAR

Toews Historical Absolute GAR

This chart shows us that:

  • Toews has been in the 20-25 GAR range for most of his career – an extremely high score, especially when considering only 2.4% of all NHL seasons exceed 20 GAR
  • Despite winning the cup in 2014-2015, Toews’ individual performance last season dropped to his lowest level since his rookie year
    • While ~16 is still a very good score, this decline may be an early sign of Toews’ age, suggesting we should expect a slightly reduced level of performance from him going forward

Historical GAR Distribution

Toews Historical Relative GAR

Finally, looking at Toews’ GAR contribution by category shows us the biggest step down in 2014-2015 was in his defensive play. Although I haven’t watched enough Blackhawks games to observe this myself, one reason this could be happening is a slight decline in skating ability/speed with age, preventing him from being as involved around the rink as he once was.
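For those who want to build the same ‘relative’ view for another player, the calculation is just each category divided by the season total. The sketch below assumes the three category totals are already known; the function name and the season values are hypothetical and are not Toews’ actual numbers.

```python
from typing import Dict


def relative_shares(categories: Dict[str, float]) -> Dict[str, float]:
    """Express each category as a fraction of the season's total GAR."""
    total = sum(categories.values())
    if total <= 0:
        raise ValueError("shares are only meaningful for a positive total GAR")
    return {name: value / total for name, value in categories.items()}


if __name__ == "__main__":
    # Invented category totals for one hypothetical season.
    season = {"offensive": 10.0, "defensive": 6.0, "gameplay": 4.0}
    for name, share in relative_shares(season).items():
        print(f"{name:<10} {share:.0%}")
```

One caution with this view: if any category is negative in a given season, the shares no longer sit neatly between 0% and 100%, so the absolute chart is the safer one to read in those cases.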

Conclusion

Now, I haven’t used this data to hammer home a unique point of view about Toews – no one needed me to quantify it to know he is a future Hall of Fame-caliber player. Rather, the point of this article has been to provide some colour behind the basics of the WAR/GAR metric, and to illustrate in a simple, straightforward manner how anyone could apply this metric to their own thinking on player evaluation.

Regardless, it is clear that Jonathan Toews is a hugely valuable player to Chicago at both ends of the ice, sitting in the elite, 20+ GAR echelon, and having peers among the likes of Crosby, Kopitar, and Ovechkin. This analysis also says nothing about the leadership skills he has demonstrated on and off the ice, taking his team to an unprecedented three cup wins in the last six years. As a result, I think we all believe that Captain Serious earns every dollar of his $10.5M cap hit. But how can we know for sure? Unfortunately – for that, you’ll have to wait for the next installment of the series, where I will demonstrate how to use WAR/GAR to quantify what a player is worth.

What Draft Round Can Tell Us About a Player’s Expected Long Term Performance and Development (Part 1)

This article is being posted here and, in parallel, as a guest post at Maple Leafs Hot Stove. FYI, for those who read my last post – as promised – this article is a succinct summary of my draft analysis ‘report’ shared earlier.

The hockey analytics community has looked at many aspects when projecting a player’s performance over their career: prior league, prior scoring rate, performance of players with similar characteristics, size, and date of birth – amongst others. One example of such work was an article from earlier this year by ‘moneypuck’ at NHLnumbers.com.

Moneypuck’s analysis derives its foundation from an excellent study by Michael Schuckers in 2011, where Schuckers was one of the first to create a standardized view of ‘draft pick value’. The quality of Schuckers’ analysis drove many other authors to do work that followed suit, building on his approach. However, in his paper, Michael chose to define ‘draft pick value’ entirely based on likelihood to play >200 games in the league. Although that is a reasonable metric – and few would argue that reaching 200 NHL games means a draft pick was NOT successful – there are limits to using only a single metric to define ‘success’.

How ‘successful’ a pick was, and the ensuing value of a draft pick, is highly sensitive to how we choose to define success. Is a pick successful after 40 games, 80, or 200? Are they successful after 30 career points, or 100? How about their points per game? Or their total career points? How would our definition of ‘success’ change on each of these metrics if they are a forward or a defenseman?

As you can imagine – this isn’t a simple thing to answer. Earlier in 2015, Stephen Burtch did some interesting work down this path, where he combined expected GP with expected Pts/Gm to create a new draft pick value figure – which was a big step in the right direction. However, even Pts/Gm has its gaps, given that it only considers players still in the league. As time goes on, the least successful players will leave the league sooner, increasing the average Pts/Gm of those remaining. In a perfect world, we would want a metric that has already been adjusted for a player’s likelihood to succeed in the league, rather than one based on his success if he can stay in the league. (Though – to be fair – Burtch does seek to address this point through multiplying probability of reaching 200 games by expected Pts/Gm).
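To show what that adjustment does, here is a minimal sketch of the idea as I have described it: weight a pick’s expected Pts/Gm by his probability of reaching 200 games. This is my own simplified reading, not Burtch’s actual model or code, and the function name and both example picks are hypothetical.

```python
def survival_adjusted_ppg(p_reach_200_games: float, ppg_if_established: float) -> float:
    """Expected Pts/Gm weighted by the chance of reaching 200 NHL games."""
    if not 0.0 <= p_reach_200_games <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return p_reach_200_games * ppg_if_established


if __name__ == "__main__":
    # Two invented picks: a safer one and a riskier one with higher upside.
    safer = survival_adjusted_ppg(0.8, 0.5)
    riskier = survival_adjusted_ppg(0.25, 0.6)
    print(f"safer pick:   {safer:.2f}")
    print(f"riskier pick: {riskier:.2f}")
```

In this invented example the riskier pick scores more per game if he sticks, but the safer pick carries the higher adjusted value, which is exactly the gap that raw Pts/Gm hides.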

In order to address these points, I have taken a very detailed look at long term player performance and development based on draft round, incorporating a wide range of metrics into my analysis. Specifically, I have reviewed the five years of players drafted from 2000-2004, as well as the ensuing 9-13 years of NHL season data.

Arguably the biggest factor in whether or not analysis is put into practice is whether a team’s front office and coaching staff truly understand it, and believe the results enough to buy in to it – which will often come down to how that analysis is communicated. As such, I have tried to simplify the statistical methodology involved in this work, and display the output visually in a way that is easily understood and hopefully very accessible to stats and non-stats folks alike.

(As a note – this article focuses strictly on metrics related to player performance and development. However, a natural follow-on to this is connecting that information to draft pick value, as mentioned, and after that, to how successful teams have been in drafting – both of which are covered in my full report, originally posted here).

What I Hope To Answer

The objective of this analysis is to investigate ‘typical’ player performance and development trajectory after being drafted in a given round, in order to answer the following three questions:

  • If a player is drafted in round X, and is ultimately able to make the NHL, by when should they be expected to be a contributing NHL player?
  • How well does the typical player perform over the course of his career (on various metrics) after being selected in a given round?
  • Within the first round, how do the top 10 overall picks perform versus those taken 11th-30th?

So – let’s get into it.

Analysis of Long Term Player Performance and Development by Draft Round

I have split out the upcoming sections of analysis by each type of metric used. I will then revisit the three questions above directly in the final section on conclusions.

Games Played Thresholds

As Michael Schuckers showed very clearly – players drafted in the first 2-3 rounds are much more likely to appear in the NHL; however, the likelihood of a player playing one or more full seasons diminishes substantially after the first round.

Games Played Pic 1

Games Played Pic 2

In terms of player development, this data suggests that:

  • If a 1st round pick hasn’t played a game by their fourth potential NHL season, they likely will never appear in the NHL
  • 20-30% of successful 2nd and 3rd round players only begin to meaningfully play for their franchise 5-7 years after being drafted (see the pink shaded area on the ‘80 games played’ chart)
  • The gap between the top 10 overall and the rest of the first round is actually relatively small when looking at the likelihood to pass the 150 game threshold (especially in comparison to metrics later in the article)
  • And, as we know – all other rounds after the first three appear to have close to equal likelihoods of producing long term NHL players
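The threshold charts above boil down to one calculation: for each round, what share of all players drafted went on to reach a given number of career games? A rough sketch of that calculation is below. The function name is my own and the `picks` list is invented for illustration; it is not the 2000-2004 draft data used in the charts.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def share_passing_threshold(
    picks: Iterable[Tuple[int, int]], min_games: int
) -> Dict[int, float]:
    """Share of drafted players per round with at least `min_games` career games.

    `picks` holds (draft_round, career_games_played) for EVERY drafted player,
    including those who never appeared in the NHL (0 games).
    """
    drafted: Dict[int, int] = defaultdict(int)
    passed: Dict[int, int] = defaultdict(int)
    for draft_round, games in picks:
        drafted[draft_round] += 1
        if games >= min_games:
            passed[draft_round] += 1
    return {rnd: passed[rnd] / drafted[rnd] for rnd in sorted(drafted)}


# Invented picks, four per round, for illustration only.
picks = [
    (1, 600), (1, 210), (1, 0), (1, 95),
    (2, 300), (2, 0), (2, 12), (2, 0),
    (3, 0), (3, 85), (3, 0), (3, 0),
]

if __name__ == "__main__":
    print("1+ games: ", share_passing_threshold(picks, 1))
    print("80+ games:", share_passing_threshold(picks, 80))
```

The important design choice is the denominator: every drafted player counts, whether or not he ever played, which is what keeps this kind of chart free of the survivorship issue discussed later in the article.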

Points / GM Data

Forwards taken in the top 10 overall show an unbelievable ability to outperform in P/GM over their careers (which Stephen Burtch has shown is even more pronounced within the top 10, where the top 1-3 picks overall are meaningfully better than picks 4-10)

Pts per Game - F

  • Interestingly, 2nd and 3rd round forwards tend to increase their per-game output over time, largely converging with players drafted 11th-30th overall
  • However, given this metric is an average of those still playing, there will be a survivorship bias that partially drives this effect
    • E.g. low producers will leave the league more quickly, increasing the average for those remaining – as shown by the fact that 30% of the players shown are from Rd 1 in season ‘6’, increasing to 40% by season ‘10’
    • This data can more reasonably be said to tell us that, in order to stay in the NHL over the long term, a forward must achieve a minimum of roughly 0.20 points per game
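The survivorship effect is easy to see with a toy example. In the sketch below, no individual player ever improves, yet the average P/GM of those still in the league rises every time a low producer drops out. The four players and the function name are invented for illustration and are not drawn from the draft data.

```python
from statistics import mean
from typing import List, Tuple


def average_ppg_of_survivors(players: List[Tuple[float, int]], season: int) -> float:
    """Average P/GM among players still active in `season`.

    Each player is (points_per_game, seasons_played), with a constant
    scoring rate for his whole career.
    """
    active = [ppg for ppg, seasons_played in players if seasons_played >= season]
    if not active:
        raise ValueError("no players are still active in that season")
    return mean(active)


# Four invented players: the two lowest producers leave the league early.
players = [(0.10, 2), (0.20, 4), (0.40, 10), (0.70, 10)]

if __name__ == "__main__":
    for season in (1, 3, 6):
        print(f"season {season}: {average_ppg_of_survivors(players, season):.2f} P/GM")
```

This is why I treat the upward drift in the 2nd and 3rd round lines with some caution: part of it is real development, and part of it is simply who is left to be counted.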

Defensemen naturally display a much narrower distribution of results, reflecting the fact that a ‘strong’ defenseman will not always play a significant point-scoring role

 Pts per Game - D

  • P/GM data for defensemen is not terribly insightful, but I have included it in order to provide the data for those interested
  • One note – If you look closely, you can see surprisingly strong (and erratic) performance of Round 5 defensemen – starting very weak (no points registered in season ‘2’), but then ultimately being among the highest points per game in seasons ‘5’ through ‘10’
    • This particular point is driven by a small sample size issue: 49 D were drafted in the 5th round, but only a handful played many games – three of whom happen to be John-Michael Liles, Kevin Bieksa, and James Wisniewski

Points Scored Thresholds

A player’s likelihood to surpass the 30-point threshold tends to resemble their likelihood to pass ~150 career games played…

Pts Threshold Pic 1

… However, players drafted in the third round fall behind in terms of likelihood to pass the 100-point career threshold

Pts Threshold Pic 2

  • Where earlier charts show strong similarities between the long term potential of 2nd and 3rd round players, the ability of those taken in the 2nd round to break the 100-point career threshold is a clear differentiator between the two
  • Based on this, teams may do well to target top scorers in rounds 1 and 2, before moving to defensemen, shutdown forwards and goalies in the third round and onwards
  • Again, top 10 overall picks differentiate themselves here as well, with over 70% passing 100 career points

Cumulative Career Points

The ideal metric to compare performance by round must be adjusted for players with limited NHL careers – which brings me to Lifetime Production, or Cumulative Career Points Scored

Lifetime Production - F

Lifetime Production - D

(Note – Forwards and Defensemen are shown on different scales)

  • Here, 1st round picks wildly outperform all others, showing that the combined skill and typical longevity of even a mid-to-late 1st round player (11th-30th) will equate to an average 159 points over 10 seasons for forwards, and 105 points over the same timeframe for defensemen
  • This compares to the significantly lower 68 average career points for second round forwards, and 44 average career points for second round defense
  • Notably, third round forwards also re-assert their value here, showing that – although they will only typically produce a total of 36 points over 10 seasons – they still will consistently outperform rounds 4-9 in career points
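The key feature of this metric is that players who never made the league still count, as zeros. A minimal sketch of the calculation is below, assuming you have each drafted player’s points by season; the function name and the four-player `cohort` are hypothetical and are not the figures behind the charts above.

```python
from typing import Sequence


def average_cumulative_points(
    season_points: Sequence[Sequence[int]], through_season: int
) -> float:
    """Average career points through N seasons over ALL drafted players.

    `season_points[i]` lists player i's points in each NHL season he played;
    an empty list means he was drafted but never played.
    """
    if not season_points:
        raise ValueError("need at least one drafted player")
    totals = [sum(career[:through_season]) for career in season_points]
    return sum(totals) / len(totals)


# Invented cohort: one regular, one short career, two who never played.
cohort = [[40, 55, 60], [5, 10], [], []]

if __name__ == "__main__":
    all_drafted = average_cumulative_points(cohort, 3)
    played_only = average_cumulative_points([c for c in cohort if c], 3)
    print(f"all drafted players: {all_drafted:.1f} points")
    print(f"only those who played: {played_only:.1f} points")
```

In this invented cohort the average halves once the two players who never appeared are included, which is the adjustment that lets us compare rounds with very different odds of reaching the league.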

Drawing Some Conclusions

Having now walked through each chart and its meaning, I want to summarize my findings from above. To do so, let’s revisit the original list of three questions:

If a player is drafted in round X, and is ultimately able to make the NHL, by when should they be expected to be a contributing NHL player?

  • First round players typically make their initial NHL appearance within 1-2 years, and will almost always have played their first full season (~80 games) by their fourth year after being drafted
  • 2nd and 3rd round players take much longer to develop, and many only play a full season by their 5th-7th years after being drafted
  • Players who haven’t played by these general timelines become highly unlikely to ever make serious NHL contributions (>1 season played)

How well does the typical player perform over the course of his career (on various metrics) after being selected in a given round?

  • Most players drafted outside the first round never make the league at all (2nd round players have a 60% likelihood of playing one game, and a 35% likelihood of playing a full season; for 3rd round players, closer to 40% play one game, and only 28% play a full season)
  • Based on their combined likelihood to play 2+ NHL seasons, score 30+ NHL points, and reach 0.4-0.5 or more pts/gm, 1st, 2nd and 3rd round players are the only players with a meaningfully higher likelihood of succeeding in the league
  • However, based on the likelihood to score >100 NHL points, 1st and 2nd round players are able to separate themselves from the 3rd round as well

Within the first round, how do the top 10 overall picks perform versus those taken 11th-30th?

  • The top 10 overall picks are significantly more capable than all others, even versus their first round peers
  • Over 70% of top 10 overall picks pass 100 career points, typically after ~6 seasons, versus 50% of those picked 11th-30th, who often take 9-10 years or more
  • Only forwards taken in the top 10 overall can truly be expected to score 0.6-0.7 pts/gm or more over their careers (although there are many examples of players who perform at this level of production that were taken outside the top 10, such as Ryan Getzlaf and Corey Perry)
  • In a hypothetical trade for active players, a ‘typical’ top-10 overall pick should be treated as likely reaching >350 career points as a F, or >170 as a D – thus, one-for-one, a team should be expecting to get a true star player in return if they are giving up a potential top 10 overall pick

In the end…

The long term performance expected of a player based on their draft round is something that is highly relevant to teams throughout their decisions in trades, on draft day, and in supporting a player’s development over his career. Hopefully you have found this analysis to be interesting, and found that the work was also able to build upon what is already out there by expanding the range of metrics that we look at. As mentioned, the PDF I linked to above also begins to apply this to both a revised (and straightforward) metric of draft pick value, as well as to answer the question of ‘Which teams were the most successful?’ in the draft years studied. Keep an eye out for ‘Part 2’ of this article – where I apply the data above to the trades done by the Leafs on Draft Day last summer, in order to see if Hunter, Dubas and friends were winners or losers in their exciting deals…
