Faceoffs aren’t something I’ve thought much about in hockey, mostly because I’ve never had a good idea for how to quantify them. The first piece of research that I read on faceoffs in women’s hockey was this Big Data Cup project from Shayna Goldman, Mike Murphy, and Alyssa Longmuir. They found that winning a faceoff in the offensive zone frequently leads to a shot attempt, making faceoffs worth more in the women’s game than in the men’s.

The second time that I started thinking about faceoffs was (once again) thanks to Mike when he tweeted about Jillian Dempsey’s stellar 23-6 faceoff record against the Minnesota Whitecaps on November 7th, a 1-0 win for the Boston Pride.

With the recent development of the fastRhockey package to pull play-by-play data from the Premier Hockey Federation (PHF), I figured that exploring faceoffs through the lens of Jillian Dempsey’s skill would be an interesting test of the package.

The first place I started was with Raw Win-Loss records in the PHF; 23-6 seemed like a really good record, so I started by comparing Dempsey’s unique faceoff W/L combos against the unique records in the PHF. Turns out that Dempsey has been dominant relative to the league.

Above is a representation of every unique faceoff W-L record that has been recorded in the PHF (for games where we have faceoff play-by-play data); Dempsey’s games are in gold and the rest of the league’s are in grey.

In almost every single game that Dempsey has played, she’s won more faceoffs than she’s lost. Not only that, she’s done it with an efficiency that no one else in the league has matched. Her 23 faceoff wins tied the PHF record (Kendall Cornine on January 11th, 2020), but Dempsey did it in just 29 faceoff opportunities, where Cornine took 40 opportunities to accumulate 23 wins.
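To put those two records on the same scale, here is a quick sketch that converts each one to a win percentage. The numbers come straight from the paragraph above; the variable names are just mine for illustration.

```r
# Faceoff records quoted above: wins and total opportunities
records <- data.frame(
  player = c("Jillian Dempsey", "Kendall Cornine"),
  wins = c(23, 23),
  opportunities = c(29, 40),
  stringsAsFactors = FALSE
)

# Win percentage = wins / opportunities, rounded to one decimal place
records$win_pct <- round(100 * records$wins / records$opportunities, 1)
print(records)
```

Dempsey’s 23-6 works out to roughly a 79% win rate, against 57.5% for Cornine’s 23-17.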

Dempsey holds a 22-5 game, numerous games with 15+ wins, and an 18-2 game. PHF players, as a whole, have occasionally approached those numbers, but that’s comparing Dempsey alone to the entire league. This graphic alone showcases that Jillian Dempsey is in a league of her own when it comes to winning faceoffs in the PHF.


Elo Ratings

After visualizing Dempsey’s brilliance, I wanted to figure out a way to try and evaluate player skill on faceoffs beyond using raw W-L records.

To me, the hardest part of measuring this sort of skill is accounting for the strength of the opponent. While W-L% is helpful (because winning faceoffs is helpful), there’s clearly a difference between winning one against Dempsey or winning one against the worst faceoff player in the league.

Thinking through the problem this way led me to using Elo Ratings to measure faceoff skill since a faceoff is a zero-sum game where one player wins and the other loses.

I first became aware of Elo Ratings in the context of rating chess players or as a team-level rating. It turns out they’ve also been used to rate NHL skaters on faceoffs by multiple people (Tyrel Stokes, HockeyEloRatings, and more). (You can find a good Elo primer here). However, as with many things, no one has developed skater-level faceoff Elo ratings for women’s hockey. So, I did!

With Elo Ratings, everyone starts from the same point which, in this instance, is 1500. So, every time a PHF player takes their first faceoff, they begin as average, with an Elo score of 1500.

As you win (or lose) faceoffs against other players, your Elo rating begins to change. The idea is that if a player with a high Elo beats a player with a low Elo, neither score changes much, because that’s expected! However, if the reverse happens and an underdog beats a high-level player, there will be a much more drastic change in their post-faceoff Elo ratings.

The other key point is that the changes in Elo always sum to 0; when player 1 gains 10 points, player 2 loses 10 points, and vice versa.

The formula to calculate Elo is below:

\[ ExpWinPct_{Player} = \frac{Elo_{Start,Player}}{Elo_{Start,P1} + Elo_{Start,P2}} \]

\[ Elo_{Winner} = Elo_{Start,Winner} + K \times (1 - ExpWinPct_{Winner}) \]

\[ Elo_{Loser} = Elo_{Start,Loser} + K \times (0 - ExpWinPct_{Loser}) \]

One of the fun components of calculating Elo is that it has a built-in Expected Winning%, which is easy to calculate by taking the Elo of an individual player and comparing it to the total Elo of the players involved in the faceoff.

So, if Player A enters a faceoff with an 1800 Elo and Player B enters at 1200, we’d say that Player A has a 60% Expected Winning% (1800 / 3000), while Player B is at 40%. This results in a system that rewards upsets and only assigns a few points for winning faceoffs that you’re supposed to win.
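As a rough sketch of that update, the function below applies the share-of-total expected win % to the 1800 vs. 1200 example. The name `elo_update` and its arguments are hypothetical (this is not the helper used in the project), and K is set to 25, the value used for PHF faceoff Elo.

```r
# Hypothetical helper: updates two ratings after one faceoff.
# Expected win % is each player's share of the combined pre-faceoff Elo
# (the setup described in this post, not the standard logistic Elo curve).
elo_update <- function(winner_elo, loser_elo, k = 25) {
  total <- winner_elo + loser_elo
  exp_winner <- winner_elo / total
  exp_loser <- loser_elo / total
  c(
    winner = winner_elo + k * (1 - exp_winner),
    loser = loser_elo + k * (0 - exp_loser)
  )
}

# Favourite (1800) beats the underdog (1200): a small change
expected_result <- elo_update(winner_elo = 1800, loser_elo = 1200)
# Underdog (1200) beats the favourite (1800): a larger change
upset <- elo_update(winner_elo = 1200, loser_elo = 1800)

print(expected_result)  # winner 1810, loser 1190
print(upset)            # winner 1215, loser 1785
```

In both cases the two ratings still add up to 3000, which is the zero-sum property at work: the favourite gains 10 for a win but drops 15 for a loss.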

The magnitude of the change in Elo is determined by the K parameter. For PHF faceoff Elo, I chose to set this at 25. Why? Because I wanted to. High K values result in more volatility after an event, while lower K values make it harder to dig yourself out of a hole. There is definitely more that I could do to tune this K, but I felt that the gain from adjusting my K would be marginal at best.
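To get a feel for what K does, here is a small sketch of how many points a 1200-rated underdog would gain for beating an 1800-rated favourite under a few K values. Only K = 25 comes from this project; 10 and 50 are arbitrary comparison points.

```r
# Points gained by a 1200-rated underdog who beats an 1800-rated favourite,
# using the share-of-total expected win % from this post
k_values <- c(10, 25, 50)
underdog_exp <- 1200 / (1200 + 1800)  # 40% expected win %
gain <- k_values * (1 - underdog_exp)
names(gain) <- paste0("K=", k_values)
print(gain)  # 6, 15, and 30 points respectively
```

The gain scales linearly with K, so doubling K doubles how far a single faceoff can move a rating.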

It’s a simple system and equation, but one that took me a little while to figure out how to code. (I’m about to dive into the code that I wrote to produce the Elo ratings, so if you don’t want to listen to me ramble about it, you can skip to the results section. Though, I realize as I write this that I’ve been rambling about how Elo and the equations work so, if you’ve made it this far without skipping, you’re probably not skipping the code talk. Anyway, time to get back on topic.)

The first step was to take the raw PHF play-by-play data and turn it into a representation of the two players involved in a faceoff and who won. In addition, since the order in which you face opponents affects the resulting Elo ratings, it was important to keep track of the order in which each faceoff occurred.

Since I’ve saved all pre-2021-2022 PHF data into its own GitHub repository, separate from the fastRhockey package, I loaded the old data from there and used the scraper for the new data.

I re-shaped the play-by-play data to produce a few important columns. First, I combined both the winning and losing player into one column with the winner/loser ID in a second column. In addition, I added a faceoff_id column to provide a sequential ID for all faceoffs, and added faceoff_played, which is how many faceoffs that player has played by that point.

## # A tibble: 6 x 5
##   game_id faceoff_id result player             faceoff_played
##     <int>      <int> <chr>  <chr>                       <int>
## 1  268078          1 winner Sara Bustad                     1
## 2  268078          1 loser  Hanna Beattie                   1
## 3  268078          2 winner Becki Bowering                  1
## 4  268078          2 loser  Hanna Beattie                   2
## 5  268078          3 winner Becki Bowering                  2
## 6  268078          3 loser  Sarah Schwenzfeier              1
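A minimal sketch of how that reshaping could look is below. It assumes a raw table with one row per faceoff and hypothetical `winner` and `loser` columns (the real play-by-play column names may differ), and it is written in base R so it runs without extra packages, rather than mirroring the dplyr code used in the project.

```r
# Hypothetical raw data: one row per faceoff, in the order they occurred.
# The column names `winner` and `loser` are assumptions for illustration.
pbp <- data.frame(
  game_id = c(268078, 268078, 268078),
  winner = c("Sara Bustad", "Becki Bowering", "Becki Bowering"),
  loser = c("Hanna Beattie", "Hanna Beattie", "Sarah Schwenzfeier"),
  stringsAsFactors = FALSE
)

# Sequential ID for every faceoff
pbp$faceoff_id <- seq_len(nrow(pbp))

# Stack winners and losers into one `player` column with a `result` label
long <- rbind(
  data.frame(game_id = pbp$game_id, faceoff_id = pbp$faceoff_id,
             result = "winner", player = pbp$winner, stringsAsFactors = FALSE),
  data.frame(game_id = pbp$game_id, faceoff_id = pbp$faceoff_id,
             result = "loser", player = pbp$loser, stringsAsFactors = FALSE)
)

# Order by faceoff, with the winner's row first
long <- long[order(long$faceoff_id, long$result == "loser"), ]
rownames(long) <- NULL

# Running count of faceoffs taken by each player up to this point
long$faceoff_played <- ave(seq_along(long$player), long$player, FUN = seq_along)
print(long)
```

With these three faceoffs, the result has the same six rows and the same `faceoff_played` counts as the tibble shown above.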

From there, I opted to create a for loop to run through all the faceoffs in the pbp data to derive Elo Ratings. There may be a more efficient way to do this, but for me, this worked well.

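library(dplyr)

# list that stores the post-faceoff Elo ratings from each pass through the loop
lst <- list()

#### Looping through all the recorded faceoffs in the current season
for (i in 1:max(data$faceoff_id, na.rm = TRUE)) {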
  # selecting the next faceoff in the sequence
  # this will return a dataframe that is two rows long with the above columns
  fo <- data %>%
    dplyr::filter(faceoff_id == i)
  # once the dataframe has been selected, it splits into an if statement
  # the statement checks whether this is the very first faceoff in the data
  # (faceoff_id == 1), which is necessarily the "debut" faceoff for both players
  # since neither player has participated in a faceoff yet, it's easy:
  # 1500 is used as the default value for each and the calculation is run
  if (max(fo$faceoff_id, na.rm = TRUE) == 1) {
    # defaulting to 1500 as the starting elo when neither player has played a faceoff 
    fo <- fo %>%
      dplyr::mutate(prev_elo = ifelse(faceoff_played == 1, 1500, NA))
    
    tot <- fo %>%
      group_by(faceoff_id) %>%
      summarise(total_elo = sum(prev_elo))
    
    fo1 <- fo %>%
      left_join(tot, by = c("faceoff_id")) %>%
      # uses a function that I created to calculate the Elo Ratings using my set-up
      elo() %>%
      ungroup()
    # these values will get added into a list to provide starting values for a player's 
    # second and beyond faceoffs
  } else if (max(fo$faceoff_id, na.rm = TRUE) != 1) {
    # pull all the previous results for the instances where at 
    # least 1 player has had a previous faceoff
    # since they've had 1+ faceoff, they'll have an Elo rating
    prev_res <- dplyr::bind_rows(lst)
    
    fo1 <- fo %>%
      # add the starting Elo value to the player it belongs to
      left_join(prev_res %>%
                  ungroup() %>%
                  group_by(player) %>%
                  # only selecting the values for players in the current faceoff
                  dplyr::filter(player %in% fo$player) %>%
                  # only taking the most recent post-Elo rating for the players
                  # because there's no need to take an un-updated Elo rating
                  dplyr::filter(faceoff_played == 
                                  max(faceoff_played, na.rm = TRUE)) %>%
                  dplyr::rename("prev_elo" = "post_elo") %>%
                  dplyr::select(player, prev_elo),
                by = c("player")) %>%
      # since there are scenarios where one player in a combo has 1+ faceoff
      # and the other is making their "debut", there's a quick ifelse there to handle it
      dplyr::mutate(prev_elo = ifelse(faceoff_played == 1, 1500, prev_elo)) %>%
      # in retrospect, I didn't need the two separate portions of the loop
      # but since it's already written and works, I'm not changing it
      elo() %>%
      ungroup()
    
  }
  # store the results (and which faceoff # it was) in a list so that we can pull it later
  res <- fo1 %>%
    dplyr::select(player, post_elo, faceoff_played)
  
  lst[[i]] <- res 
  # message to stay up to date on where the loop is progress-wise because it's fun
  print(paste0("Just completed faceoff #", i, " out of ", max(data$faceoff_id, na.rm = TRUE), 
               ". Have completed ", round((i / max(data$faceoff_id, na.rm = TRUE)) * 100, 
                                         digits = 1), "% of the faceoffs."))
  
}

results <- dplyr::bind_rows(lst)
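One possible alternative to re-binding and filtering the results list on every pass is to keep each player's current rating in a named vector. The sketch below is not the code used for this project: `run_elo` is a hypothetical helper, it assumes a data frame with one row per faceoff (in order) and hypothetical `winner` and `loser` columns, and it uses the same share-of-total update with K = 25.

```r
# `run_elo` is a hypothetical helper, not the function used in the post.
# `faceoffs` needs one row per faceoff, in order, with `winner` and `loser` columns.
run_elo <- function(faceoffs, k = 25, start = 1500) {
  players <- unique(c(faceoffs$winner, faceoffs$loser))
  # everyone begins at the same starting rating
  ratings <- setNames(rep(start, length(players)), players)

  for (i in seq_len(nrow(faceoffs))) {
    w <- faceoffs$winner[i]
    l <- faceoffs$loser[i]
    # expected win % from the pre-faceoff ratings
    total <- ratings[[w]] + ratings[[l]]
    exp_w <- ratings[[w]] / total
    exp_l <- ratings[[l]] / total
    # update both players
    ratings[[w]] <- ratings[[w]] + k * (1 - exp_w)
    ratings[[l]] <- ratings[[l]] + k * (0 - exp_l)
  }
  sort(ratings, decreasing = TRUE)
}

faceoffs <- data.frame(
  winner = c("Sara Bustad", "Becki Bowering", "Becki Bowering"),
  loser = c("Hanna Beattie", "Hanna Beattie", "Sarah Schwenzfeier"),
  stringsAsFactors = FALSE
)
ratings <- run_elo(faceoffs)
print(round(ratings, 1))
```

This version only returns the final ratings. Tracking the rating after every faceoff, as the loop above does, would mean storing `ratings[[w]]` and `ratings[[l]]` at each step.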

First, I ran the loop here on all the 2019-2020 play-by-play data that is available, producing Elo numbers for that season. For the 2020-2021 and 2021-2022 seasons, I wanted to make each player’s starting point for the season their ending rating from the prior year, rather than start them at 1500 again. Doing so required a slight re-write of the above loop, but the idea is the same. The only difference is that for a player’s first faceoff of a new season, their default is their ending value from the last season (assuming they have recorded a faceoff).
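The carry-over rule can be sketched as a simple lookup with a fallback. Everything here is hypothetical: `starting_elo` is an illustrative helper, and the player names and ratings are made-up placeholders rather than real PHF results.

```r
# Hypothetical end-of-season ratings from the prior year (made-up values)
prior_season <- c("Player A" = 1620, "Player B" = 1455)

# Starting Elo for a new season: last season's final rating if the player
# has one, otherwise the default of 1500
starting_elo <- function(player, prior, default = 1500) {
  known <- player %in% names(prior)
  out <- rep(default, length(player))
  out[known] <- prior[player[known]]
  setNames(out, player)
}

starts <- starting_elo(c("Player A", "Player C"), prior_season)
print(starts)  # Player A keeps 1620; Player C has no history and starts at 1500
```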


Results

I’ll give a little overview of some results, but the best place to check out the up-to-date results would be here, on the R Shiny leaderboard that I threw together.

Something that I’m interested in following over the next few years is whether anyone can approach both the dominance and longevity that Dempsey has posted as a faceoff artist. From the games that we have play-by-play data on, the beginning of Dempsey’s run was a little bit bumpy, even trending below-average at times, but since about her 75th faceoff, she’s been nearly unbeatable on a consistent basis.

Mikyla Grant-Mentis and Shiann Darkangelo both began their careers more efficiently than Dempsey did, but have slowed down as they crossed the 100-career-faceoff threshold. The player who pops the most is a former teammate of Dempsey’s, Tereza Vanisova, who accumulated a 1952 Elo rating in just 116 PHF faceoffs. That’s insane efficiency, especially as a rookie in the PHF. Given that she’s only 25 and already this good, Vanisova would be a phenomenal bet to challenge for Dempsey’s crown, but we may not see that happen as she’s moved to the SDHL for 2021-2022.


Beyond these rolling graphs, I’ve included a table of Faceoff Elo Ratings for the first few weeks of the 2021-2022 PHF season. Dempsey’s gap over the rest of the field is absurd; she’s on her own scale, to be honest.

Many of the names at the top of the table have accumulated a ton of faceoff opportunities over their careers, but there are some relative newcomers making an appearance in the top 10.

Shiann Darkangelo, Mikyla Grant-Mentis, and Meghan Lorence are those notable names to me; it’ll be almost unfair if MGM becomes an excellent faceoff artist on top of everything else that she does (Mike said as much in a recent piece discussing PHF fantasy team-building trends).

(Again, I’d recommend using my R Shiny leaderboard to view the full results over multiple seasons, the PHF Elo Ratings, and a few other things I threw in.)

