Protecting the Blue Line and Driving Break-outs: Repeatability and Impact of Exits & Entries Against

This article is being co-posted on NHLNumbers as well as on my own site, OriginalSixAnalytics.com. Find me @OrgSixAnalytics on Twitter. If you are able, please provide some support to public data providers Corey Sznajder (@ShutDownLine), Corsica and Puckalytics.

Every team in the NHL is constantly striving to find market inefficiencies to capitalize on towards the goal of creating value and ultimately winning games. Whether done by finding undervalued assets or developing a set of sustained competitive advantages – every team should always have these concepts in mind.

The analytics community has widely acknowledged we are not as good at evaluating defensemen as we are at evaluating forwards – in part due to having fewer effective metrics available. As a result, a number of people have spent time trying to expand our knowledge in this area – as any new development would be sure to present a strong market opportunity to capitalize on.

A particular focus has been on applying Eric Tulsky’s historical zone entry and exit analysis to defensive evaluation. Given controlled entries into the offensive zone (carry-ins) have been proven to generate more shots than uncontrolled entries (dump-ins) – intuitively, if a defender can prevent controlled zone entries against his team, that should reduce shot attempts against.

There has been a lot of great work done on this subject, including articles by the likes of Sean Tierney, Dom Luszczyszyn, JenLC and more. Dimitri Filipovic at Sportsnet has written a number of times on the value of tracking the percentage of entries against that are controlled, as well as the number of controlled zone exits that a defender creates. This type of work has broken ground recently in evaluating defensemen. However, in an article late last season Dimitri summarized the current gaps in the analysis as well:

“It’s important to retain perspective on all of these figures. At this point they’re much more descriptive than predictive, in the sense they’re telling us how a certain player did in the games he’s already played, but not necessarily helping us forecast how he’ll do in the future. There’s also the fact that most players only have [only] ~10 games worth of data to their name, which leaves plenty of room for inexplicable swings in either direction.”

So far, the best evidence to support the repeatability of these metrics is on the zone entries against side – where Eric Tulsky used a full season of tracked data for the Philadelphia Flyers’ defensemen to test this. His results are here. While this analysis definitely hints at a relationship, I suspect Eric himself would say a sample of only six defensemen should be seen only as a ‘first step’ that can be built upon in the future.

As such, my goal in this article is to test, replicate and expand Tulsky’s work, as well as to hopefully broaden it to some uncharted territory. I also hope to encourage others to recognize the amazing data being made available by Corey Sznajder’s tracking efforts – which you can access here and here. Hopefully others will decide to support Corey by purchasing his data, and do further work to expand our knowledge in these areas as well.

Definitions

Let’s start with some important definitions:

  • Zone Entry Attempt Against (ZEA) – Any time an opponent is attempting to bring the puck into your defensive zone; it is registered against the player (e.g. defenseman) who is most closely defending the play across the blue-line
  • Controlled ZEA – When the opponent gains access to your defensive zone by successfully carrying-in or passing-in the puck, retaining possession
  • ‘Break up’, or ‘Failed Entry Against’ – When the defender is able to prevent the entry from gaining access to the zone, turning over possession
  • Zone Exit Attempt – Any time a skater is attempting to move the puck across his own blue line, whether carried, passed or dumped/cleared out
  • Zone Exit with Control – Any time a skater successfully carries the puck out or passes it out of his own zone, retaining possession

A simple way to think about preventing controlled ZEA’s is that it is the act of ‘Protecting the Blue-line’. Likewise, individual controlled zone exits represent players who are strong puck-movers, or who excel at ‘Driving Break-outs’.
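To make the definitions concrete, here is a minimal sketch in Python of how the two entry-defense rates could be computed from tracked events. The outcome labels and the sample events are hypothetical (they are not the column names in Corey's data), and I am assuming both rates use all entry attempts against the defender as the denominator.

```python
from collections import Counter

# Hypothetical tracked events: (defender, outcome of the entry attempt).
# Outcome labels are illustrative only, not taken from the tracking files.
events = [
    ("Defender A", "controlled"),
    ("Defender A", "dump_in"),
    ("Defender A", "failed"),
    ("Defender A", "dump_in"),
    ("Defender B", "controlled"),
    ("Defender B", "controlled"),
    ("Defender B", "dump_in"),
    ("Defender B", "failed"),
]

def entry_defense_rates(events):
    """Return {defender: (controlled ZEA %, break-up %)}.

    Assumption: both percentages are taken over all entry attempts
    registered against the defender.
    """
    totals, controlled, failed = Counter(), Counter(), Counter()
    for defender, outcome in events:
        totals[defender] += 1
        if outcome == "controlled":
            controlled[defender] += 1
        elif outcome == "failed":
            failed[defender] += 1
    return {
        d: (100.0 * controlled[d] / totals[d], 100.0 * failed[d] / totals[d])
        for d in totals
    }

rates = entry_defense_rates(events)
for defender, (ctrl_pct, breakup_pct) in sorted(rates.items()):
    print(f"{defender}: controlled {ctrl_pct:.1f}%, break-ups {breakup_pct:.1f}%")
```

With this toy data, Defender A allows control on one of four attempts and Defender B on two of four, so A grades out better at ‘Protecting the Blue-line’.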

Key Questions

With respect to preventing controlled ZEA, being able to break-up plays, and driving controlled zone exits (whether as a % of total, or per 60 minutes of play), my key questions are as follows:

  • Predictive – Are these repeatable skills? Is past performance a reliable indicator of future performance?
  • Relationship with Goals – Do these metrics drive improvement in Goals For Differential (GF%), by either increasing goals for or decreasing goals against?
    • If not, do these metrics at least help to drive the other inputs that translate into GF%? (I.e. xGF%, Shots, Corsi-For%, etc.)

“Protecting the Blue Line” – Zone Entry Defense

Let’s start with protecting the blue line: preventing controlled entries against, or breaking up entry attempts. To be clear, a lower percentage is better – the lower the proportion of controlled entries against a defenseman, the fewer goals and shots against you would expect. (In the interest of time, I have focused more so on ZEA, rather than break up %.)

Repeatability

To test repeatability I have done a ‘split-half’ test to see how well the 1st half of a season of data predicts the second half. All data is from Corey’s 2013-2014 tracking – the only full season of publicly-available tracked entry data. I have only included defensemen with over 150 entries-against in each half, for a total sample of N=148. (Note – the mid-season mark wasn’t technically the 41st game as Corey only captured zone entry defense for the final 60-65 games).

1-protecting-blue-line-repeatability

As you can see above, mitigating controlled entries against % is a highly repeatable skill with a relationship (R^2) of 0.37 between first half and second half results. I think this is a very exciting finding, as it shows the recent work of Dimitri, Sean and others is spot on – and that past performance should be predictive of future performance.
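For anyone who wants to reproduce this kind of split-half test, the sketch below shows the general shape of the calculation. It is an illustration under assumptions, not the exact workflow used for the chart: the field names and the four sample players are made up, and R^2 is taken as the squared Pearson correlation between the two halves.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)

def split_half_r2(players, min_entries=150):
    """Compare first-half and second-half controlled entry against %.

    Only players with more than `min_entries` entries against in BOTH
    halves are kept, mirroring the sample filter described above.
    Returns (R^2, number of players kept).
    """
    first, second = [], []
    for p in players:
        if p["entries_h1"] > min_entries and p["entries_h2"] > min_entries:
            first.append(p["controlled_h1"] / p["entries_h1"])
            second.append(p["controlled_h2"] / p["entries_h2"])
    return r_squared(first, second), len(first)

# Made-up numbers purely to exercise the function.
players = [
    {"entries_h1": 200, "controlled_h1": 120, "entries_h2": 210, "controlled_h2": 122},
    {"entries_h1": 180, "controlled_h1": 90, "entries_h2": 175, "controlled_h2": 95},
    {"entries_h1": 220, "controlled_h1": 143, "entries_h2": 190, "controlled_h2": 118},
    {"entries_h1": 120, "controlled_h1": 70, "entries_h2": 260, "controlled_h2": 150},  # dropped: too few in half 1
]

r2, kept = split_half_r2(players)
print(f"R^2 = {r2:.2f} across {kept} players")
```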

Relationship with goals

The next question is ‘does this improve GF%’? Intuitively I think we all expect it would, but it is still important to check. To evaluate this I tested the relationship (R^2) between controlled entry against (% and per-60) and a wide range of metrics: goal differential, goal and shot against rates, and ‘rel’ stats – i.e. stats that adjust goal and shot rates to be more focused on individual, rather than team-level, play.

2-zea-v-goals-table-incld-rel

As you can see above, there are some interesting results. First, many on-ice stats are significantly impacted by these metrics. In particular, ZEA with control (% and per 60) show a meaningful relationship with xGF%, CF% and CA/60, which are all at an R^2 around or above 20%.

However, it is interesting to note that when you focus on individual impact (i.e. rel stats), the relationship is considerably weaker than with unadjusted on-ice stats, which are more influenced by teammates. As such, while preventing controlled ZEA is repeatable, and does meaningfully impact goals and shot rates – there may also be team/system impacts that are difficult to separate out from the individual contribution.

Last, let’s ‘zoom in’ on one of the cells above: % Controlled Entry Against vs xGF%. The chart below illustrates the direct relationship between reducing controlled entries against and improving expected goals.

3-zea-v-goals

 “Driving Break-outs” – Zone Exits with Control

Now, let’s look at the zone exit side, or ‘Driving Break-outs’. Comparing the Exit data to ZEA, there are a couple nuances to mention: first, the data is slightly more limited in this area. Corey provides raw tracked data in 2013-2014 for all offensive zone entries and for all ZEA – which allowed me to separate it into split-half components. Unfortunately, I checked with Corey, and for the 2013-2014 zone exits the tracking was done offline and aggregated at the end. As such, there is no full season of raw data that we can use to do a repeatability test.

On the positive side, Corey has been providing the raw data for his new 2016-2017 tracking work which is well underway (maybe 10-15+ games tracked per team thus far). As such, by the end of the season, we should have another full year of data – and I will at that point confirm whether or not Controlled Zone Exits are a repeatable skill. My guess is that this will ultimately be shown to be the case, as simply from watching, certain defensemen seem to be natural puck-movers (Karlsson, Josi), and others seem to be less so (Polak).

Last, the 2013-2014 zone exit data lacked some key elements (which Corey has done well to include in the 2016-2017 work). The old data did not have a broad ‘failed’ zone exit category, nor did it show what Corey now calls ‘Transitional Plays’ – i.e. getting the puck out of the zone while surrendering possession via a clearing attempt. As such, we can’t say the proportion of the time an individual’s exit attempts were successful. However, we are still left with some very valuable data – individual Zone Exits/60 and Controlled Exits /60 – as defined above.

Relationship with goals

So – if we assume for now that Controlled Exits are a repeatable skill (and not simply randomness) – do they have a strong relationship with goal events?

4-zexits-v-goals-incl-rel

The chart above shows some very interesting findings. First, zone exits with control have a fairly weak impact on GF% and xGF% directly. This isn’t entirely surprising, as exiting your own zone with control can only get you so far towards scoring. However, there is a more relevant relationship between Controlled Exits/60 and CF% – with an R^2 of 15.5%. This shows that having a defender who is a strong puck mover can definitely begin to ‘move the needle’ on winning the shot-differential battle.

Lastly, what I found most intriguing here were the Rel Stats – where the metric is adjusted for how a player’s team does with him on the ice versus off the ice (i.e. isolating individual contribution). While teammates can play a bigger role in helping a defenseman prevent controlled ZEA, Zone Exits are actually the opposite – both CF/60rel and CF%rel have a ~19% relationship (R^2) with Controlled Exits/60, stronger than CF% alone. This demonstrates that strong individual play from puck-moving defensemen can have a major CF% impact strictly from their ability to exit the zone with control.
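As a quick illustration of what a ‘rel’ stat and a per-60 rate are doing, here is a small Python sketch. It assumes the common convention that a rel stat is the on-ice figure minus the team's figure with the player off the ice; data providers differ in the details, and the numbers below are invented.

```python
def cf_pct(cf, ca):
    """Corsi-For %: share of all shot attempts that belong to the player's team."""
    return 100.0 * cf / (cf + ca)

def per_60(count, minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    return 60.0 * count / minutes

def cf_pct_rel(on_cf, on_ca, off_cf, off_ca):
    """Assumed definition: on-ice CF% minus the team's CF% with the player off the ice."""
    return cf_pct(on_cf, on_ca) - cf_pct(off_cf, off_ca)

# Invented example: 550 attempts for and 450 against with the player on the ice,
# 1000 for and 1000 against with him on the bench, 30 controlled exits in 900 minutes.
print(f"CF% rel: {cf_pct_rel(550, 450, 1000, 1000):+.1f}")
print(f"Controlled exits/60: {per_60(30, 900):.1f}")
```

A positive CF% rel means the team controls a larger share of shot attempts with the player on the ice than without him.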

Before wrapping up, let’s ‘zoom in’ again on the relationship between Controlled Exits/60 and CF%rel.

5-exits-60-v-cfrel

The chart above shows how a defenseman’s individual controlled exits/60 have a meaningful impact on his ability to be a net-contributor to his team’s Corsi-For %.

Tying it all together – what these two sets of analysis tell us is that a great puck-moving defender can have a substantial impact on his team’s shot attempts, as well as overall shot attempt differential, through his ability to Drive Break-outs. Further, a defenseman who excels at Protecting the Blue-line will have the greatest impact when surrounded by teammates and playing within a system that does so as well. Naturally, the greatest value will be found in acquiring players who show both these skills, even if they don’t necessarily contribute significantly in terms of goals and points.

Conclusion & Practical Applications for Teams

As should now be clear, these types of new metrics for defensive evaluation are very interesting and important. They appear to be repeatable skills (at least for ZEA), and they do have a material impact on a team’s xGF% or at the very least, its CF%. While they are visible to anyone watching a defenseman closely – just like batting average or save % – only over dozens of games of tracking can we truly quantify and understand how well a player performs on each.

So – practically speaking – what should NHL teams, or teams at other levels, do about this? Gather the data! At a minimum, every team in the league should be tracking their own players’ performance on these metrics in order to better game plan, develop, and observe improvements or deficiencies over time. Further, if any team ever made a systematic effort to track these on a league-wide basis (like – say – the Florida Panthers…), it would almost certainly present a clear competitive informational advantage that can help them build value over the long term.

Such a team could have a greater understanding of the value of their own players, and also directly identify defenders around the league that may have a big impact on their team’s results, despite low goal or point totals. At this point, teams should be compelled to begin acting on this knowledge to build an ‘edge’. If not, eventually they may be doing so just to prevent themselves from falling too far behind.

 

This article is presented by OAK Coasters, a website where you can buy beautifully crafted, hand made One of A Kind (OAK) coasters that make the perfect gift. Check them out at OAKCoasters.com.


Introducing “KPO%”: Why Mitigating Shot Location Might Be the Next Important Layer of Measuring Defensive Value

 

This article is being co-posted on Hockey Prospectus as well as on my own site, OriginalSixAnalytics.com. Find me @OrgSixAnalytics on Twitter.

Although hockey analytics has come a long way, there is still a lot of room for improvement – particularly when evaluating the defensive contributions of skaters. Most analytics users are well aware of shot rate (CF%/relative CF%) and shot suppression (CA/60) stats by now – but after that, there aren’t many other easy-to-use defensive metrics. As a result, ‘single number’ stats like Wins Above Replacement from the (former) website War On Ice (WOI) often seem to undervalue defensive players.

In order to find another dimension for defensive evaluation, a logical area that many authors have thought to test is whether a skater can influence his goalie’s Sv% while he is on the ice. For those who haven’t been following, this is a hotly contested topic; but the short summary is that it is extremely hard to tell if players can actually influence On-Ice Sv%. Studies that show skaters can influence On-Ice Sv% tend to be inconclusive, at best – and most work has suggested that impacting On-Ice Sv% is largely driven by randomness.

Intuitively, many think it should be possible for a skater to impact Sv%, so the work continues. However, the most fundamental question that we are all asking is really, ‘What are the best tools we can use to measure a skater’s defensive contribution?’ So – let’s attack the problem at a slightly higher level.

Underlying Drivers of Sv% Impacts

Presumably, if skaters could impact on-ice Sv%, they would do so by reducing the ‘quality’ of the shots taken against their team – easier shots against, fewer goals. Even simpler than ‘quality’, if a skater can consistently mitigate the location of the shots against his team – e.g. ‘keeping pucks to the outside’ – we know that the decrease in Sh% as shots are taken further from the net should ease the burden on his goalie, regardless of whether the goalie actually stops those shots.

Fortunately, websites like Corsica and the former WOI have created quite rigorous Scoring Chance (SC) metrics that we can use to test this. With these, we can measure Scoring Chance mitigation two ways: through an overall rate stat (e.g. SCA/60), or as a proportion of all shot attempts against (as used by Scott Cullen, here). In Scott’s article he simply divides SC Against by Corsi Against, allowing us to see what portion of all shot attempts are Scoring Chances when a certain player is on the ice. Due to its straightforward nature, I’m sure many others have used/alluded to this figure in the past.

Although this stat is not at all complicated, in this article I will explore the idea that mitigating shot location/quality could actually be one of the next important layers in quantifying a player’s defensive contribution. Granted, some of the most complex, advanced models (e.g. xG from Corsica) already do go to great detail to factor in shot quality. Despite the value of those models, I hope to make the case that a statistic like Scott’s (On-Ice SCA/CA) can represent a simple, broadly usable metric to evaluate defensive contributions from skaters – a close second to things like CA/60 and CF%.

To make this argument, we need to know: (i) is this a metric that skaters can actually ‘influence’? To test that, we will have to see (ii) if past results are predictive of future results – e.g. do certain players perform the best/worst on this metric, year after year? Along the way, we should also figure out (iii) is it best to use Corsi Against, Fenwick Against or Shots Against as a denominator?

So – let’s dig in.

Defining Scoring Chances

First – let’s define what a ‘Scoring Chance’ (SC) is. @MannyElk has done a great job recently creating a Scoring Chance stat on his Corsica website, and all citations of ‘Scoring Chances’ here use his data and metric – so big ‘thank you’ to the hard work that he does. You can support Corsica here.

Manny goes into great detail on how he reached his metric here. In short, he built upon the War-On-Ice SC definition by putting shots into three danger ‘tiers’ (high, medium and low) – though Manny didn’t stick to the exact locations used by WOI. Instead, he focused on the likelihood of the shot to be a goal (based on a number of factors, like shot angle, rebounds, etc.), and worked backward into his ‘zones’ from there.

Below is Manny’s heat map of shot location by danger zone.

 corsica-heat-map

The next table is Manny’s summary of the Fenwick Shooting %, Shooting %, and percentage of all shots within each danger ‘tier’, or zone.

corsica-table

What is important about this table is the third column from the right. This ‘FSh%’ column summarizes just how dangerous each shot attempt is: low danger attempts have approximately a 2% chance of going in, medium danger have a ~6% chance, and high danger (i.e. Scoring Chances) have a ~16% chance of becoming goals. Notably, the medium tier was deliberately set to be quite close to the league-wide ‘average’ shot attempt Sh%, of 6.79%.

Here is Manny’s definition of a Scoring Chance:

“Scoring chances may be defined as unblocked shots belonging to the High-Danger zone – that is, whose xG is equal to or exceeds 0.09. For convenience, one can approximate that one goal is scored for each 6 scoring chances”. [As compared  to 1 in ~16 medium danger chances, and 1 in ~50 low danger chances].
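To show how that definition works in practice, here is a tiny Python sketch that flags Scoring Chances using the 0.09 xG threshold quoted above. The shot attempts are hypothetical, and treating every attempt (including blocked ones) as carrying an xG value is my simplifying assumption rather than a description of how Corsica stores its data.

```python
SCORING_CHANCE_XG = 0.09  # threshold from the definition quoted above

def is_scoring_chance(xg, blocked):
    """True for an unblocked attempt whose xG is at least the threshold."""
    return (not blocked) and xg >= SCORING_CHANCE_XG

# Hypothetical shot attempts against: (xG, was the attempt blocked?)
attempts = [(0.02, False), (0.09, False), (0.15, False), (0.20, True), (0.05, False)]

# Booleans sum as 0/1, so this counts the attempts that qualify.
sca = sum(is_scoring_chance(xg, blocked) for xg, blocked in attempts)
print(f"{sca} scoring chances out of {len(attempts)} attempts")
```

Note that the blocked attempt is excluded even though its xG is high, since the definition only covers unblocked shots.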

So to be clear – it isn’t quite as simple as ‘if a shot is in the mid-to-low slot, it is a Scoring Chance’ – which is closer to the WOI definition. However, as you can see from his heat map, the vast majority of SCs are originating from the dense yellow area in the mid-to-low slot – so we can consider SCs as largely coming from that location.

Mitigating Scoring Chances – Keeping Pucks Outside % (“KPO%”)

 Earlier I introduced Scott Cullen’s metric of (Scoring Chances / Corsi Against). Most players in the league come out in the 10-20% range of this number, meaning that 10-20% of their shot attempts against are ‘Scoring Chances’.

In order to make this metric somewhat more intuitive, I want to center it on the concept of ‘Keeping Pucks to the Outside’ – a simple, easily understood concept that is core to defensive-zone play. As such, I will make two changes to the stat:

  • Instead of showing the % of shot attempts that ARE scoring chances, I will show the metric as the % of attempts that were NOT scoring chances (simply by taking 1 – Scott’s metric).
  • As a result of this, I will give this stat a new name – “Keeping Pucks Outside %”, or KPO% – the percentage of shot attempts against that a skater prevents from being a Scoring Chance, or that he ‘keeps outside’.
    • (As a side note – I have deliberately tried to make this label clear and straightforward, for use by coaches or players who aren’t familiar with most analytics. For those who want a more formal name – you could also use ‘Scoring Chance Mitigation %’)

As a result, most players will instead be in the ~80-90% range – and should be aiming for as high of a % as possible.
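Written out as code, the stat is a one-liner. The sketch below is just the definition above (1 – SCA/CA, expressed as a percentage); the function name and the sample counts are my own illustration.

```python
def kpo_pct(sca, ca):
    """Keeping Pucks Outside %: share of shot attempts against that were NOT scoring chances."""
    if ca <= 0:
        raise ValueError("Corsi Against must be positive")
    # Equivalent to 100 * (1 - sca / ca)
    return 100.0 * (ca - sca) / ca

# Hypothetical skater: 150 scoring chances against out of 1000 shot attempts against.
print(f"KPO%: {kpo_pct(150, 1000):.1f}")
```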

KPO % – Repeatability

 Now, the most important question for this to be a relevant metric is – can skaters actually repeatedly ‘influence’ KPO%? To determine this, we will have to test past results against future results, to see how strong that relationship is.

To do so, I downloaded Corsica data for all Forwards and Defensemen who played from 2010-2016. 2010-2013 represents the ‘first half’ sample and 2013-2016 represents the ‘second half’, and players needed at least 1000 minutes in each. This resulted in a sample of 216 Forwards and 113 Defensemen, which I tested separately. Only 5v5 data was included.

The two charts below summarize the results.

defensemen-correlation

forward-corellation

As you can see, across both D and F there was a considerable relationship between past and future performance on the KPO% metric, at R^2 = 28.9% and 20.6%, respectively. This suggests that KPO% has solid predictive capability, supporting its use for player evaluation. Intuitively, it also makes sense that Defensemen would be able to more consistently influence this metric (shown in the higher R^2), as it is a larger part of their role.

It is worth noting that the charts above use SCA/CA to calculate KPO%, as CA had the strongest relationship tested. I also ran the results with SCA/Fenwick Against, and SCA/Shots Against, and the results are below:

rsq

On the defensive side, CA and FA are quite close, but after that there is a slight drop down to SA. I think it is also positive to see that Corsi Against has the strongest relationship, as that is when we are including blocked shots in the denominator. Given that blocking a potential Scoring Chance is a meaningful way for a skater to add defensive value, it would be logical to include that in the calculation.
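For reference, the three denominators differ only in which shot attempts they count: Shots Against are shots on goal, Fenwick Against adds missed shots, and Corsi Against adds blocked shots on top of that. The Python sketch below computes KPO% all three ways for one hypothetical skater; the counts are invented and the function name is mine.

```python
def kpo_variants(sca, shots_on_goal, missed, blocked):
    """KPO% with Corsi, Fenwick and Shots Against as the denominator."""
    sa = shots_on_goal        # shots on goal only
    fa = sa + missed          # Fenwick: all unblocked attempts
    ca = fa + blocked         # Corsi: all attempts, including blocked shots
    return {
        name: 100.0 * (denom - sca) / denom
        for name, denom in (("CA", ca), ("FA", fa), ("SA", sa))
    }

# Hypothetical skater: 150 SCA, 500 shots on goal, 250 missed, 250 blocked.
for name, value in kpo_variants(150, 500, 250, 250).items():
    print(f"KPO% using {name}: {value:.1f}")
```

Because the numerator stays the same while the denominator shrinks, the CA version always produces the highest KPO% and gives the skater credit for every blocked attempt.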

For the last few sections, I will quickly summarize how performance on KPO% tends to be distributed around the league, and if we can quantify how much ‘value’ it really contributes.

2015-2016 League-wide Performance

 In order for KPO% to have value, there needs to be a wide-enough distribution of results across the league in order for players to differentiate themselves. Below is a histogram of the distribution of defensemen on this metric, across the 2015-2016 season:

2015-2016-histogram

 (Note – I have omitted the forward chart as it follows the same general pattern)

Using the 2015-2016 season shows KPO% as following a relatively normal distribution, and with a reasonable variation of results, given the range of 8.7%. Given there is a moderate amount of variation across the league – how big of an impact does a change in KPO% have on expected goals against?

What is +/- 1% of KPO% actually worth?

Now – I want to try to understand how big of an impact the best players in the league can have on KPO% – two good examples from the sample were Marc-Edouard Vlasic and Roman Josi, scoring at 89.4% and 88.1%, respectively.

To answer this, I have done a very basic, ‘back of the envelope’ calculation for how many theoretical ‘goals’ a skater adds to his team over a season (at 5v5) if he were to have a KPO% of +1% or -1% from the league average.

goal-value

So, to walk through the high level math here:

  • The average defenseman from the original sample had 997 5v5 Corsi Against over a season
  • 14.8% of those are HD SCA, on average – or 147.6 Scoring Chances
  • Increasing a skater’s KPO% by +1% above the league average results in 137.6 HD SCA per season, or a reduction of 10 SCA
  • With 6.2 SCA per goal, that is 1.6 goals prevented
  • However, given these shot attempts are being substituted by lower quality chances, we need to add-back the value of those chances:
    • Manny’s table showed LD and MD chances each make up ~40% of all shots – or roughly 50/50 split of all non-HD shot attempts
    • Thus, for each 10 HD SC mitigated, there will be 5 LD and 5 MD added back, or 0.4 goals ‘substituted’
  • Thus – the net goals prevented from a skater improving his KPO% by 1% is 1.21 goals per season

The one big caveat: the KPO% I am using is derived with Corsi Against, while Manny has only been able to calculate the Fenwick Sh% of his Scoring Chances. As such, the number of chances per goal stats (6.2, 16.0, 51.3) are proxies – and we should consider this calculation to be illustrative of the ‘directional’ impact, rather than actual.
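The walk-through above can be sketched in a few lines of Python. This uses only the figures already quoted (997 Corsi Against per season, and 6.2, 16.0 and 51.3 chances per goal for the three tiers) plus the same 50/50 substitution assumption; the function and variable names are mine. Carried through without rounding at each step, it lands at about 1.2 goals per 1% of KPO%, in line with the estimate above.

```python
CA_PER_SEASON = 997  # average 5v5 Corsi Against for a defenseman, from the list above
CHANCES_PER_GOAL = {"HD": 6.2, "MD": 16.0, "LD": 51.3}  # proxy values, per the caveat above

def net_goals_prevented(kpo_change_pct, ca=CA_PER_SEASON):
    """Back-of-the-envelope goals prevented for a given change in KPO% (in points)."""
    hd_removed = ca * kpo_change_pct / 100.0            # scoring chances turned into lesser attempts
    goals_saved = hd_removed / CHANCES_PER_GOAL["HD"]
    # Assumption from the walk-through: the replaced attempts split 50/50 between MD and LD.
    goals_added_back = (
        (hd_removed / 2) / CHANCES_PER_GOAL["MD"]
        + (hd_removed / 2) / CHANCES_PER_GOAL["LD"]
    )
    return goals_saved - goals_added_back

net = net_goals_prevented(1.0)
print(f"Net goals prevented per +1% KPO%: {net:.2f}")
```

Since the calculation is linear, a skater 2% above average would be worth roughly twice as much, and a skater below average costs his team goals at the same rate.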

How big of an impact are 1.2 goals per 1%? With a range of 8.7% across the sample, 1.2 goals per 1% means the best player on KPO% is contributing ~10 more goals prevented over the season than the worst player. If we define ‘replacement level’ at approximately the bottom 20% of the league – then the top quartile of defenders in the league could add roughly 3.5-4 goals ‘above replacement’ on this metric. Given 6 goals are approximately equivalent to one win, the top 25% of the league is adding roughly 0.6-0.67 of a win for their teams – which is not immaterial in a league where every little edge counts.

For the sake of clarity, I am not arguing that KPO% is ‘more important’ than Corsi Against/60 – e.g., if a skater gives up 300 additional CA over a season, that will more than offset an improvement in KPO% of 1%. Rather, I am arguing that these two elements are important to consider in conjunction with one another – as having a poor CA/60 can be somewhat mitigated by a strong KPO%, just as a great CA/60 can be off-set by a terrible KPO%.

Along those lines – if we were to add this to the WOI Goals Above Replacement (GAR) calculation, KPO% is looking like it could be a close 2nd to shot rate stats for the highest-value way to measure a skater’s defensive contributions. Given Forwards have ~5 areas where they add goal-value to WOI GAR – versus ~2 for D-men (CF/CA) – simply adding another element of defensive contribution will help to off-set the F/D value imbalance in today’s metrics.

 Top/Bottom KPO% Defensemen

Before concluding, I wanted to share the top 12 and bottom 12 KPO% performing defensemen from the 2015-2016 season, for reference. In the table below, I have also added the column ‘KPO% Above Average’ – this is simply expressing a player’s KPO% minus the league average score. Thus – the top players will be a positive figure, distance above average, and the bottom players will be a negative figure, distance below average.

top-12

bot-12
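For anyone rebuilding these tables, the ‘KPO% Above Average’ column and the top/bottom ranking can be sketched as below. The player names and values are placeholders, and I am assuming the league average is a simple unweighted mean of the players in the sample, which may differ from how the tables above were built.

```python
def above_average(kpo_by_player):
    """Each player's KPO% minus the (unweighted) average of the sample."""
    league_avg = sum(kpo_by_player.values()) / len(kpo_by_player)
    return {name: kpo - league_avg for name, kpo in kpo_by_player.items()}

def top_and_bottom(kpo_by_player, n):
    """Return the n best and n worst players by KPO% Above Average, best first."""
    diffs = above_average(kpo_by_player)
    ranked = sorted(diffs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n], ranked[-n:]

# Placeholder values, not real player results.
sample = {
    "Player A": 89.0,
    "Player B": 87.0,
    "Player C": 85.0,
    "Player D": 83.0,
    "Player E": 81.0,
}

top, bottom = top_and_bottom(sample, 2)
for name, diff in top + bottom:
    print(f"{name}: {diff:+.1f}")
```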

As you can see, there is an interesting set of defensemen in each category. The top defensemen have some well-renowned players like Vlasic and Josi, mentioned earlier, as well as some not-necessarily-analytically-loved players like Shea Weber and Roman Polak.

My own hypothesis for why we get this result is that there may be a connection between a defenseman’s play-style/skill set, and his resulting shot rate/KPO% stats, in a sometimes off-setting fashion. For example, Polak and Weber’s ‘stay at home’ style may help them lock down the front of the net defensively, while it causes them to struggle on the shot rate side of the equation. On the other hand, some of the league’s more dynamic defensemen (not listed, but Doughty and Klingberg both come out at roughly 2% below average on KPO%) may have quite strong Corsi stats, but their play-style causes them to give up higher quality chances against as a result. Granted – this is just a hypothesis – only more study and time will tell.

Conclusion

Despite having gone all the way from introducing KPO% to taking a high-level estimate of its goal-value, I definitely see this analysis as exploratory, rather than ‘complete’. Hopefully this article encourages some others to dig into the KPO% metric (or other, similar ones) – allowing us to continue to learn more about how to measure individual-skater defensive contribution outside of simply shot rate stats.

Some future areas to build on this analysis include adding the impact on and value in special team (e.g. PK) situations, creating more detailed versions of the stat (e.g. KPO% relative to teammates, usage adjustments), or developing a more statistically rigorous calculation of its goal-value. Hopefully you have found this analysis to be interesting and thought provoking – or alternatively, that KPO% helps to decrease the number of On-ice Sv% debates in the world…