Evaluating passing skill in hockey is hard, especially when most of the data that you have to work with is just box score stats. Sure, primary assists, secondary assists, and turnovers are a decent place to start, but that’s a specific subset of all passes that occur in a hockey game. What about the passes that take place over the rest of the game?

Well, the lack of data has stood in the way of that sort of analysis for women’s hockey in recent years. But earlier this year, for the Big Data Cup, Stathletes released a dataset for the 2021 NWHL season that contained detailed tracking data, including pass events and both passer/receiver locations. With this data now available for the NWHL, it was unsurprising that many people made evaluating passing and movements a focus of their Big Data Cup projects, a list of which you can find here.

There were quite a few projects that focused on evaluating pass types by pass cluster, but for this project I was interested in testing out a couple of methods for evaluating individual players and their passing skill. You can find the full results and leader board at the bottom of this article (though some of the columns may not make sense if you don’t read the full article!) and the code can be found on my GitHub page.

The first method I test is building a Completion Percentage Over Expected (CPOE) model for the NWHL data and then using a Bayesian estimation to look at uncertainty in CPOE. While CPOE is not a new idea, I was inspired to use the Bayesian process by the tweet/project that is linked below which looked at CPOE for NFL Draft Prospects.

The other method that I explored is using a linear mixed-effects model with a random effect term for each player to see how each player’s completion percentage varied. Much of this was inspired by Dani Treisman’s Big Data Cup project, which used mixed effects for measuring how skill impacts xG, as well as previous research surrounding fumbles in the NFL (Michael Lopez) and catcher framing in baseball (Jonathan Judge, Harry Pavlidis, and Dan Brooks).

Let’s jump into it!


Completion Percentage Over Expected (CPOE)

Every attempted pass will either be completed (1) or incomplete (0). However, every single pass has a different likelihood of being completed. CPOE looks to see who completes more (or fewer) passes than what they’re expected to based on how difficult the pass attempt was. If a player is routinely completing difficult passes, that’s a good indication that they are a very good passer.

I used a Generalized Additive Model (GAM) from the mgcv package in R to model Expected Completion Percentage (xCP) for this project, using the passer location, intended receiver location, length of the pass, and the direction of the pass as inputs. These inputs are mostly dependent on the offense, though xCP surely depends on the defensive alignment, we don’t have that data available, so it’s not considered.

One item of note: while I took a random half of all the direct passes in the dataset to train the model, I did run the model on the entire dataset in order to include as many players/passes as possible in my results. This approach does mean that over fitting is a potential concern with our model, given the limited sample of passes (about 8000 recorded passes). (I removed indirect passes from the NWHL dataset because rimming a pass along the board has a different goal than a pass directed at another player.)

Once we run our Completion Percentage model on the NWHL data, we can calculate the CPOE for each pass (and player). As discussed previously, every pass is either complete (1) or incomplete (0). Since the GAM that we ran uses a binomial distribution, the values returned from the model are probabilities, ranging from 0 to 1. To calculate CPOE, take the Completion Percentage (CP) of the pass (0 or 1) and subtract the Expected Completion Percentage (xCP) from it.

For example, take a complete pass from Kaleigh Fratkin to McKenna Brand, which has an xCP of 0.64. Since the pass was completed, it has a CP of 1.00. Given the xCP of 0.64, the CPOE of that pass is 0.36. Repeat this process (or rather, write the code for it) and you get CPOE for every player. (The full leaderboard for CPOE is at the bottom of this article.)

Great!

We’ve calculated which players have completed more (or less) passes than we would expect, but how can we account for the uncertainty that comes from working with small samples? That’s an important question because CPOE tends to converge to 0.00 in our sample (check out the CPOE by Pass Attempts graph at the end of the second section to see this trend).

It’s clear that we should be cautious in making sweeping claims about the passing skill of players with just a few passes (and I’d argue that all the players should be viewed with caution because even Kaleigh Fratkin’s 245 passes isn’t that many). Obviously we can mentally add that asterisk to these figures, but I’d be happier if we put a number (read: confidence interval) on these figures.

That’s where the linked tweet and Bayesian estimation comes in.

Taking each player individually, the model examines each player’s distribution of CPOE and returns their mean CPOE, as well as the figures for a 95% confidence interval of where that player’s “true” CPOE lies. This interval is represented on the graphic below for the 44 NWHL players who recorded at least 75 attempted passes in 2021.

We have a fairly even split of positive CPOE and negative CPOE here, with some on the much higher and much lower ends of the spectrum. What I want to draw attention to is how the length of the interval is varied for every single player. On one hand, you’ve got a player like Kaleigh Fratkin, who has a the smallest interval, and on the other there’s Kristin Lewicki with a giant interval.

The graph with all 44 names on it can be kind of tough to read, so we can visualize this on a smaller scale too. Here I’ve pulled four names: Kaleigh Fratkin, McKenna Brand, Mikyla Grant-Mentis, and Sarah-Eve Coutu Godbout to represent this.

While there is overlap between every single player’s distribution of CPOE, we see that Fratkin has the highest peak of CPOE, while Coutu Godbout’s distribution is flatter. This indicates that there’s less uncertainty in our estimation of Fratkin’s CPOE (0.036 per pass) while we don’t know as much about Coutu Godbout “true CPOE talent”, which is -0.095 per pass in our sample.

(I chose these four names because they had distributions that I was looking for, but can easily run this for really any combination players. If you want to see a different set of players, hit me up on Twitter @benhowell71!)

The discrepancies in those intervals and distributions is largely due to how many passes have been observed for each player. For instance, Fratkin’s 245 passes are the most in our dataset, while Coutu Godbout’s 75 passes are right at our arbitrary cutoff. However, it’s not a 1:1 relationship between passes and spread (check out the CI Lengths vs Pass Attempts graph at the end of the second section to see this trend). Another factor that plays a big part in determining that 95% confidence interval is a player’s consistency.

For instance, if one player oscillates between successfully converting low xCP passes and turning over high xCP passes, they’ll be more volatile (larger interval) than the player who completes everything above an xCP of 0.5 and fails to complete anything below 0.5.


Mixed Model to account for passer effects

The second model that I tested is a mixed effects model with a random effect intercept. The random effect in the formula (the (1 | Player) term) has the model evaluate each player against their own baseline average, rather than relative to the overall population.

The major components of my mixed effects model are the same as the GAM I used earlier, with the sole addition of the random effect parameter (1 | Player). The pass start and intended location, direction, and length all remain in the model.

Once the model has been run, we can pull out the intercepts for each unique player to see who increases the likelihood of a successful pass the most (a positive intercept = better passer by this measure). As with our CPOE measure, the mixed effects model allows us to create confidence intervals for the effect of each player.

Now that we’ve gone through the two methods that I used to try and measure passing skill in the NWHL, let’s compare them! (There’s going to be some words here, but you can scroll down to find the full results and leader board at the end of the article.)

There turned out to be a strong relationship (R of 0.83) between CPOE rank and Passer Effect rank. In most cases, being a good passer by either CPOE or by Passer Effect means that a player is probably a good passer in the other. However, there are some…interesting cases and discrepancies between the two, most notably with Jonna Curtis, a forward with the Minnesota Whitecaps who had 96 pass attempts. It’s an interesting discrepancy and one that may be worth exploring more in-depth (or it could be a small sample quirk).

In contrast to our CPOE metric, the Passer Effect intercept does not converge as drastically to zero. I did find it intriguing that everyone with more than 125 direct passes increased the chance of a successful pass; this could imply that the better a passer you are, the more passes that you make. However, this could also be a quirk of our small sample size.

As with CPOE, there is a strong relationship between attempting more passes and having a smaller confidence interval on a player’s intercept.


Leaderboard and Full Results

Below is a full table that compares CP, xCP, CPOE, and Passer Effect for the 2021 NWHL skaters who attempted at least 75 passes in the 2021 season + a rundown of some of the results I found interesting.

Kaleigh Fratkin (the 2021 Defender of the Year) who had 9 assists, fared much better in the Passer Effect metric (ranking 3rd) while ranking 11th in CPOE. Her xCP has on the higher side and, for how high her xCP is, her CPOE is really good! It’s just hard to improve on already high xCP passes. The high Passer Effect does indicate that her proclivity to have high xCP passes could be because she’s an excellent passer!

Mikyla Grant-Mentis (the 2021 MVP) was above average in both CPOE and Passer Effect, but didn’t pop at the top of either leader board. McKenna Brand ranked 2nd in both CPOE and Passer Effect, which meshes well with Treisman’s findings on Brand leading the NWHL in his play-making ability metric.

NWHL CPOE (Completion Percentage Over Expected) and Passer Effects

Minimum 75 passes in 2021 season

---
title: "Measuring Passing Skill in the NWHL"
author: "Ben Howell"
date: "6/5/2021"
output:
  html_document:
    highlighter: null
    theme: "flatly"
    code_download: TRUE
    toc: true
    toc_float: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, 
                      warning = FALSE, 
                      message = FALSE,
                      fig.align = "center")
```

```{r Data Manipulation}
library(tidyverse)
library(mgcv)
library(lme4)
library(reactable)
library(bayesboot)
library(htmlwidgets)
library(htmltools)
library(tweetrmd)
library(patchwork)
library(RWordPress)
library(knitr)

#I'm not running this analysis on the Olympic data from the Big Data Cup because I wanted to focus on running this analysis
#for NWHL players/data

df <- read.csv("nwhl_data.csv")

df <- df %>%
  mutate(X.Coordinate = X.Coordinate - 100,
         Y.Coordinate = Y.Coordinate - 42.5,
         X.Coordinate.2 = X.Coordinate.2 - 100,
         Y.Coordinate.2 = Y.Coordinate.2 - 42.5) %>%
  separate(Clock, into = c("Minutes", "Seconds"), sep = ":", remove = FALSE) %>%
  mutate(Minutes = as.numeric(Minutes),
         Seconds = as.numeric(Seconds),
         advantage = ifelse(Team == Home.Team, Home.Team.Skaters - Away.Team.Skaters,
                            ifelse(Team == Away.Team, Away.Team.Skaters - Home.Team.Skaters, NA)),
         order = row_number())

store <- df %>%
  filter(Event %in% c("Faceoff Win", "Takeaway", "Puck Recovery")) %>%
  mutate(play_event = row_number()) 
# begin of a sequence of events for a team
# ignores the potential for dump and chase w/ a Puck Recovery
# mostly bc I didn't want to think about that
# it won't really impact this passing analysis bc we wouldn't count a Dump and Chase as an intended pass

df <- left_join(df, store)

df <- df %>%
  fill(play_event)

rm(store)
```

Evaluating passing skill in hockey is hard, especially when most of the data that you have to work with is just box score stats. Sure, primary assists, secondary assists, and turnovers are a decent place to start, but that's a specific subset of all passes that occur in a hockey game. What about the passes that take place over the rest of the game?             

Well, the lack of data has stood in the way of that sort of analysis for women's hockey in recent years. But earlier this year, for the Big Data Cup, Stathletes released a dataset for the 2021 NWHL season that contained detailed tracking data, including pass events and both passer/receiver locations. With this data now available for the NWHL, it was unsurprising that many people made evaluating passing and movements a focus of their Big Data Cup projects, a list of which you can find [here](https://www.theicegarden.com/2021/4/15/22374981/a-directory-of-womens-hockey-projects-from-big-data-cup-2021-analytics-otthac-stathletes).                        

There were quite a few projects that focused on evaluating pass types by pass cluster, but for this project I was interested in testing out a couple of methods for evaluating individual players and their passing skill. You can find the full results and leader board at the bottom of this article (though some of the columns may not make sense if you don't read the full article!) and the code can be found on [my GitHub page](https://github.com/benhowell71/NWHL).              

The first method I test is building a Completion Percentage Over Expected (CPOE) model for the NWHL data and then using a Bayesian estimation to look at uncertainty in CPOE. While CPOE is not a new idea, I was inspired to use the Bayesian process by the tweet/project that is linked below which looked at CPOE for NFL Draft Prospects.

```{r}
tweet_embed("https://twitter.com/mfbanalytics/status/1398352234883076097?s=20", align = "center")
```

The other method that I explored is using a linear mixed-effects model with a random effect term for each player to see how each player's completion percentage varied. Much of this was inspired by [Dani Treisman's Big Data Cup project](https://github.com/dtreisman/BigDataCup2021), which used mixed effects for measuring how skill impacts xG, as well as previous research surrounding [fumbles in the NFL](https://github.com/statsbylopez/NFL_Fumbles) (Michael Lopez) and [catcher framing](https://www.baseballprospectus.com/news/article/25514/moving-beyond-wowy-a-mixed-approach-to-measuring-catcher-framing/) in baseball (Jonathan Judge, Harry Pavlidis, and Dan Brooks).

Let's jump into it!

***********************************************************

# Completion Percentage Over Expected (CPOE)

Every attempted pass will either be completed (1) or incomplete (0). However, every single pass has a different likelihood of being completed. CPOE looks to see who completes more (or fewer) passes than what they're expected to based on how difficult the pass attempt was. If a player is routinely completing difficult passes, that's a good indication that they are a very good passer.              

I used a Generalized Additive Model (GAM) from the `mgcv` package in R to model Expected Completion Percentage (xCP) for this project, using the passer location, intended receiver location, length of the pass, and the direction of the pass as inputs. These inputs are mostly dependent on the offense, though xCP surely depends on the defensive alignment, we don't have that data available, so it's not considered.                  

*One item of note: while I took a random half of all the direct passes in the dataset to train the model, I did run the model on the entire dataset in order to include as many players/passes as possible in my results. This approach does mean that over fitting is a potential concern with our model, given the limited sample of passes (about 8000 recorded passes). (I removed indirect passes from the NWHL dataset because rimming a pass along the board has a different goal than a pass directed at another player.)*                     

Once we run our Completion Percentage model on the NWHL data, we can calculate the CPOE for each pass (and player). As discussed previously, every pass is either complete (1) or incomplete (0). Since the GAM that we ran uses a binomial distribution, the values returned from the model are probabilities, ranging from 0 to 1. To calculate CPOE, take the Completion Percentage (CP) of the pass (0 or 1) and subtract the Expected Completion Percentage (xCP) from it.       

For example, take a complete pass from Kaleigh Fratkin to McKenna Brand, which has an xCP of 0.64. Since the pass was completed, it has a CP of 1.00. Given the xCP of 0.64, the CPOE of that pass is 0.36. Repeat this process (or rather, write the code for it) and you get CPOE for every player. (The full leaderboard for CPOE is at the bottom of this article.)

```{r}
# Minute <- 0:20
# min_of_game <- 20:0
# 
# time <- data.frame(Minute, min_of_game)
# code that I pulled from my BDC project when I copied some code over (it was used for time left in game type stuff
# we're not particularly concerned with it this time around

passes <- df %>%
  filter(Event %in% c("Play", "Incomplete Play") & Detail.1 == "Direct") %>%
  #filter out all direct pass events bc we can consider them an intended pass
  #I have a hard time coming up w/ a reason why indirect passes should be included a CPOE model since they're less of an intended pass 
  mutate(X.Coordinate = ifelse(X.Coordinate == X.Coordinate.2, X.Coordinate + 0.001, X.Coordinate),
         #the above line is basically so that we can come up w/ a direction/length of a pass that goes sideways
         Direction = (Y.Coordinate.2 - Y.Coordinate) / (X.Coordinate.2 - X.Coordinate),
         
         Length = sqrt(((X.Coordinate.2 - X.Coordinate) ^ 2) + ((Y.Coordinate.2 - Y.Coordinate) ^ 2)),
         success = ifelse(Event == "Play", 1, 
                          ifelse(Event  == "Incomplete Play", 0, NA)))

#mean(passes$success)
#should be about 0.69 (nice!)
```

```{r}
#separating the data into a train and test set
#although, since the dataset is so small, I take this random half of data to train the model
#then we run the data on the overall data set, which includes the training dataset
#the joys of limited data!
set.seed(565)
dt <- sort(sample(nrow(passes), nrow(passes)*.5))

train <- passes[dt, ]
test <- passes[-dt, ]
```

```{r}
pass_model <- gam(success ~ s(X.Coordinate, Y.Coordinate) + s(X.Coordinate.2, Y.Coordinate.2) + Direction + Length,
                  data = train, family = "binomial")
#running the model
#make sure that 'family = binomial' so that we return values between 0 and 1 as a percentage
#summary(pass_model)

passes$xCOMP <- predict.gam(pass_model, newdata = passes, type = "response")
#summary(passes$xCOMP)
#to check what the summary of the returned values are
```

```{r}
passes <- passes %>%
  mutate(difference = success - xCOMP)

lb <- passes %>%
  group_by(Player) %>%
  summarise(n = n(),
            CP = round(mean(success, na.rm = TRUE), digits = 2),
            xCP = round(mean(xCOMP), digits = 2),
            CPOE = round(mean(difference), digits = 3))

#mean(passes$xCOMP)
#the average xCP is 0.6971
#which is v v close to the raw CP%
#which is a good sign
```

```{r}
#weighted.mean(lb$xCP, w = lb$n)
```

```{r}
p1 <- lb %>%
  filter(n >= 25) %>%
  ggplot() +
  geom_point(aes(x = n, y = CPOE)) +
  labs(x = "Passes", title = "CPOE by Pass Attempts") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))

#hist(lb$CPOE)

graphic <- lb %>%
  filter(n >= 75)
```

```{r}
min <- min(graphic$CPOE)
max <- max(graphic$CPOE)

make_color_pal <- function(colors, bias = 1) {
  get_color <- colorRamp(colors, bias = bias)
  function(x) rgb(get_color(x), maxColorValue = 255)
}

good_color <- make_color_pal(c("#35b0ab", "#93d3ab", "#c9ecb4", "#f2fbd2", "#ffffff"), bias = 2)
velo_color <- make_color_pal(c("#B7E9F7", "#7AD7F0", "#FFFEFE", "#DB7A7B", "#BB0103"), bias = 2)

color <- make_color_pal(c("#c62f2d", "#cc7a64", "#f7cec3", "#f7e4df", "#FFFEFE", "#5be04f", "#388223"), bias = 2)

tbl <- graphic %>% 
  filter(n >= 75) %>% 
  arrange(desc(CPOE)) %>%
  reactable(
    pagination = FALSE,
    highlight = TRUE,
    striped = TRUE,
    defaultColDef = colDef(
    align = "center", minWidth = 25, headerClass = "header", headerStyle = list(fontWeight = 700)
  ),
  columnGroups = list(
    colGroup(name = "Results", columns = c("CP", "xCP", "CPOE"))
  ),
  theme = reactableTheme(
    headerStyle = list(
      "&:hover[aria-sort]" = list(background = "hsl(0, 0%, 96%)"),
      "&[aria-sort='ascending'], &[aria-sort='descending']" = list(background = "hsl(0, 0%, 96%)"),
      borderColor = "#555"
    )),
  columns = list(
    n = colDef(
      name = "Passes",
      class = "border-left"
    ),
    CPOE = colDef(
      name = "CPOE",
      style = function(value) {
        value
        normalized <- (value - min) / (max - min)
        #normalized <- (value - min(pitch_type$RV100)) / (max(pitch_type$RV100) - min(pitch_type$RV100))
        color <- color(normalized)
        list(background = color)
      }
    ),
    Player = colDef(
      name = "Player", width = 175, resizable = FALSE
    )
  )
)
# 
# div(class = "title",
#   h2("NWHL CPOE (Completion Percentage Over Expected)"),
#   tbl)
```

```{css}
.title {
  margin: 18px 0px;
  font-size: 16px;
}

.border-left {
  border-left: 2px solid #555;
}

.title h2 {
  font-size: 20px;
  font-weight: 600;
}

.subtitle {
  margin: 18px 0px;
  font-size: 3px;
}

.subtitle h2 {
  font-size: 14px;
  #font-weight: 100;
}

/* Align header text to the bottom */
.header,
.group-header {
  display: flex;
  flex-direction: column;
  justify-content: flex-end;
}

.header {
  border-bottom-color: #555;
  font-size: 13px;
  font-weight: 400;
  #text-transform: uppercase;
}

/* Highlight headers when sorting */
.header:hover,
.header[aria-sort="ascending"],
.header[aria-sort="descending"] {
  background-color: #eee;
}

.standings {
  font-family: Karla, "Helvetica Neue", Helvetica, Arial, sans-serif;
  font-size: 14px;
}
```

```{r}
# div(class = "title",
#   h2("NWHL CPOE (Completion Percentage Over Expected)"),
#   "Minimum 75 passes in 2021 season",
#   tbl)

# div(
#   class = "standings",
#   div(class = "title",
#     h2("NWHL CPOE (Completion Percentage Over Expected)")),
#   div(class = "subtitle",
#     h4("Minimum 75 passes in 2021 season")
#   ),
#   tbl
# )
```

Great!                 

We've calculated which players have completed more (or less) passes than we would expect, but how can we account for the uncertainty that comes from working with small samples? That's an important question because CPOE tends to converge to 0.00 in our sample (check out the *CPOE by Pass Attempts* graph at the end of the second section to see this trend).                 

It's clear that we should be cautious in making sweeping claims about the passing skill of players with just a few passes (and I'd argue that all the players should be viewed with caution because even Kaleigh Fratkin's 245 passes isn't *that* many). Obviously we can mentally add that asterisk to these figures, but I'd be happier if we put a number (read: confidence interval) on these figures.                 

That's where the linked tweet and Bayesian estimation comes in.                   

Taking each player individually, the model examines each player's distribution of CPOE and returns their mean CPOE, as well as the figures for a 95% confidence interval of where that player's "true" CPOE lies. This interval is represented on the graphic below for the 44 NWHL players who recorded at least 75 attempted passes in 2021.

```{r, fig.height=9, fig.width=9}
names <- graphic$Player

data <- passes %>%
  filter(Player %in% names)

lst <- list()

for (qb in names) {
  df_play <- data %>%
    filter(Player == qb)
  x <- bayesboot(as.vector(df_play$difference), mean)
  s <- summary(x)
  mean_cpoe <- s$value[1]
  lci <- s$value[5]
  uci <- s$value[9]
  df_bayes <- data.frame("mean_cpoe" = mean_cpoe, "LCI" = lci, "UCI" = uci)
  lst[[qb]] <- df_bayes
}

df_bayes <- dplyr::bind_rows(lst)
df_bayes$Player <- names
df_bayes <- df_bayes %>% 
  arrange(mean_cpoe)

df_bayes %>%
  ggplot(aes(x = factor(Player, level = Player), y = mean_cpoe)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange(aes(ymin = LCI, ymax = UCI, color = mean_cpoe), size = 1.6) +
  labs(x = "Player", title = "Bayesian Estimation for 2021 NWHL Passer CPOE",
       y = "Mean CPOE", subtitle = "CPOE: Completion Percentage Over Expected",
       caption = "Ben Howell | @benhowell71 | benhowell71.com",
       label = "Mean CPOE") +
  coord_flip() +
  scale_color_gradient2(low = ("blue"), mid = "grey", midpoint = 0, high = ("red"),
                        name = "Mean CPOE") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
        axis.title.y = element_blank(),
        plot.subtitle = element_text(hjust = 0.5, size = 12),
        plot.caption = element_text(hjust = 0.5, size = 9),
        axis.text.y = element_text(size = 13))
```

We have a fairly even split of positive CPOE and negative CPOE here, with some on the much higher and much lower ends of the spectrum. What I want to draw attention to is how the length of the interval is varied for every single player. On one hand, you've got a player like Kaleigh Fratkin, who has a the smallest interval, and on the other there's Kristin Lewicki with a giant interval.                       

The graph with all 44 names on it can be kind of tough to read, so we can visualize this on a smaller scale too. Here I've pulled four names: Kaleigh Fratkin, McKenna Brand, Mikyla Grant-Mentis, and Sarah-Eve Coutu Godbout to represent this.                     

While there is overlap between every single player's distribution of CPOE, we see that Fratkin has the highest peak of CPOE, while Coutu Godbout's distribution is flatter. This indicates that there's less uncertainty in our estimation of Fratkin's CPOE (0.036 per pass) while we don't know as much about Coutu Godbout "true CPOE talent", which is -0.095 per pass in our sample. 

(I chose these four names because they had distributions that I was looking for, but can easily run this for really any combination players. If you want to see a different set of players, hit me up on Twitter @benhowell71!)

```{r}
lst2 <- list()
hockey <- c("Kaleigh Fratkin", "McKenna Brand", "Mikyla Grant-Mentis", "Sarah-Eve Coutu Godbout")
#can change out the names here to look at the distributions for diferent players

for (qb in hockey) {
  df_play <- data %>% 
    filter(Player == qb)
  x <- bayesboot(as.vector(df_play$difference), mean)
  df2 <- data.frame('estimate' = x$V1, 'Player' = qb)
  lst2[[qb]] <- df2
}

df2 <- dplyr::bind_rows(lst2)

df2 %>%
  ggplot(aes(x = estimate)) +
  geom_density(aes(fill = Player), alpha = 0.6, outline.type = "both") +
  #geom_vline(aes(xintercept = mean(estimate), color = Pitcher))
  theme_minimal() +
  labs(x = "Estimated CPOE", title = "Bayesian Estimation of NWHL CPOE (2021)",
       caption = "Ben Howell | @benhowell71 | benhowell71.com", y = "",
       subtitle = "CPOE: Completion Percentage Over Expected") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  theme(panel.grid.major = element_blank(),
        plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
        #axis.title.y = element_blank(),
        legend.position = "bottom",
        plot.caption = element_text(hjust = 0.5, size = 8),
        plot.subtitle = element_text(hjust = 0.5, size = 10))
#worth noting that SECG has the widest spread among these players
#if we look at the 'graphic' data.frame she only has 75 observations which is tied for the fewest in this set
#(since we set 75 passes as the minimum)
#so naturally she should have the most uncertainty in her CPOE bc were estimating it from the smallest set
#Kaleigh Fratkin on the other hand easily has the most observed passes so she has the highest peak and some of the least uncertainty
#in her distribution
```

The discrepancies in those intervals and distributions is largely due to how many passes have been observed for each player. For instance, Fratkin's 245 passes are the most in our dataset, while Coutu Godbout's 75 passes are right at our arbitrary cutoff. However, it's not a 1:1 relationship between passes and spread (check out the *CI Lengths vs Pass Attempts* graph at the end of the second section to see this trend). Another factor that plays a big part in determining that 95% confidence interval is a player's consistency.                  

For instance, if one player oscillates between successfully converting low xCP passes and turning over high xCP passes, they'll be more volatile (larger interval) than the player who completes everything above an xCP of 0.5 and fails to complete anything below 0.5.

```{r}
all_data <- right_join(graphic, df_bayes, by = "Player")
all_data <- all_data %>%
  mutate(spread = UCI - LCI)

p2 <- all_data %>%
  ggplot() +
  geom_point(aes(x = n, y = spread)) +
  labs(x = "Passes", y = "Length of CI", title = "CI Length vs Pass Attempts") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))

layout <- (p1 + p2)
#layout
#there is a clear relationship bt having more passes and less uncertainty in the estimated CPOE
```

********************************************************

# Mixed Model to account for passer effects

The second model that I tested is a mixed effects model with a random effect intercept. The random effect in the formula (the (1 | Player) term) has the model evaluate each player against their own baseline average, rather than relative to the overall population.                    

The major components of my mixed effects model are the same as the GAM I used earlier, with the sole addition of the random effect parameter (1 | Player). The pass start and intended location, direction, and length all remain in the model.                      

Once the model has been run, we can pull out the intercepts for each unique player to see who increases the likelihood of a successful pass the most (a positive intercept = better passer by this measure). As with our CPOE measure, the mixed effects model allows us to create confidence intervals for the effect of each player.

```{r}
set.seed(333)
pass_effect <- glmer(success ~ (1 | Player) + X.Coordinate + Y.Coordinate + X.Coordinate.2 + Y.Coordinate.2 +
                      Direction + Length, data = passes, family = "binomial")
#summary(pass_effect)
```

```{r}
randoms <- ranef(pass_effect, condVar = TRUE)$Player
qq <- attr(ranef(pass_effect, condVar = TRUE)$Player, "postVar") 
rand.intercept <- randoms[, 1]

pass_randefs <- data.frame(re_Intercepts = randoms[, 1],
                     re_sd.interc = 2*sqrt(qq[,, 1:length(qq)]),
                     Names = rownames(randoms))

pass_randefs$Names <- factor(pass_randefs$Names, levels = pass_randefs$Names[order(pass_randefs$re_Intercepts)])
pass_randefs <- pass_randefs[order(pass_randefs$re_Intercepts),]

pass_randefs <- pass_randefs %>%
  mutate(re_min = re_Intercepts - re_sd.interc,
         re_max = re_Intercepts + re_sd.interc)
```

```{r}
min_pe <- pass_randefs %>%
  filter(Names %in% names)

all_data <- right_join(all_data, min_pe, by = c("Player" = "Names"))

p3 <- all_data %>%
  ggplot() +
  geom_point(aes(x = n, y = re_sd.interc)) +
  labs(x = "Passes", y = "Intercept CI", title = "Intercept CI vs Pass Attempts") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))
#again, strong relationship bt more observations and lower standard deviations for the effect of a passer

p4 <- all_data %>%
  ggplot() +
  geom_point(aes(x = n, y = re_Intercepts)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Passes", y = "Intercept", title = "Passer Effect vs Pass Attempts") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = "bold"))

l2 <- (p4 + p3)
```

```{r, fig.height=9, fig.width=9}
all_data %>%
  arrange(re_Intercepts) %>%
  ggplot(aes(x = factor(Player, level = Player), y = re_Intercepts)) +
  geom_pointrange(aes(ymin = re_min, ymax = re_max, color = re_Intercepts), size = 1.6) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  coord_flip() +
  labs(x = "Player", title = "Measuring the Effect of NWHL Passers on Pass Success",
       y = "Random Effect Intercept", subtitle = "Positive Intercept means Passer Increases Success%",
       caption = "Ben Howell | @benhowell71 | benhowell71.com",
       label = "Effect") +
  coord_flip() +
  scale_color_gradient2(low = ("blue"), mid = "grey", midpoint = 0, high = ("red"),
                        name = "Effect") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
        axis.title.y = element_blank(),
        plot.subtitle = element_text(hjust = 0.5, size = 12),
        plot.caption = element_text(hjust = 0.5, size = 9),
        axis.text.y = element_text(size = 13))
```

Now that we've gone through the two methods that I used to try and measure passing skill in the NWHL, let's compare them! (There's going to be some words here, but you can scroll down to find the full results and leader board at the end of the article.)                     

There turned out to be a strong relationship (R of 0.83) between CPOE rank and Passer Effect rank. In most cases, being a good passer by either CPOE or by Passer Effect means that a player is probably a good passer in the other. However, there are some...interesting cases and discrepancies between the two, most notably with Jonna Curtis, a forward with the Minnesota Whitecaps who had 96 pass attempts. It's an interesting discrepancy and one that may be worth exploring more in-depth (or it could be a small sample quirk). 

```{r}
all_data <- all_data %>%
  arrange(desc(mean_cpoe)) %>%
  mutate(mean_cpoe_rank = row_number()) %>%
  arrange(desc(re_Intercepts)) %>%
  mutate(intercept_rank = row_number())

val <- round(cor(all_data$mean_cpoe_rank, all_data$intercept_rank), digits = 2)
#strong relationship between mean CPOE and the passer effect

all_data %>%
  ggplot() +
  geom_point(aes(x = mean_cpoe_rank, y = intercept_rank, color = mean_cpoe), size = 4) +
  labs(x = "Mean CPOE Rank", title = "Comparing Mean CPOE and Passer Effect Ranks",
       y = "Random Effect Intercept Rank",
       caption = "Ben Howell | @benhowell71 | benhowell71.com") +
  geom_abline(slope = 1, linetype = "dashed") +
  geom_text(aes(x = 6, y = 35, label = paste0("Cor: ", val)), size = 5, fontface = "italic") +
  scale_color_gradient2(low = ("blue"), mid = "grey", midpoint = 0, high = ("red"),
                        name = "Mean CPOE") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 15, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5, size = 12),
        plot.caption = element_text(hjust = 0.5, size = 9))
```

In contrast to our CPOE metric, the Passer Effect intercept does not converge as drastically to zero. I did find it intriguing that everyone with more than 125 direct passes increased the chance of a successful pass; this could imply that the better a passer you are, the more passes that you make. However, this could also be a quirk of our small sample size.                  

As with CPOE, there is a strong relationship between attempting more passes and having a smaller confidence interval on a player's intercept.              

```{r}
test <- all_data %>% 
  filter(n >= 75)

min2 <- min(test$re_Intercepts)
max2 <- max(test$re_Intercepts)

tbl2 <- all_data %>% 
  filter(n >= 75) %>% 
  arrange(desc(CPOE)) %>%
  mutate(re_Intercepts = round(re_Intercepts, digits = 3)) %>%
  dplyr::select(Player, n, CP, xCP, CPOE, re_Intercepts) %>%
  reactable(
    pagination = FALSE,
    highlight = TRUE,
    striped = TRUE,
    defaultColDef = colDef(
    align = "center", minWidth = 25, headerClass = "header", headerStyle = list(fontWeight = 700)
  ),
  columnGroups = list(
    colGroup(name = "Results", columns = c("CP", "xCP", "CPOE", "re_Intercepts"))
  ),
  theme = reactableTheme(
    headerStyle = list(
      "&:hover[aria-sort]" = list(background = "hsl(0, 0%, 96%)"),
      "&[aria-sort='ascending'], &[aria-sort='descending']" = list(background = "hsl(0, 0%, 96%)"),
      borderColor = "#555"
    )),
  columns = list(
    n = colDef(
      name = "Passes",
      class = "border-left"
    ),
    re_Intercepts = colDef(
      name = "Passer Effect",
      style = function(value) {
        value
        normalized <- (value - min2) / (max2 - min2)
        #normalized <- (value - min(pitch_type$RV100)) / (max(pitch_type$RV100) - min(pitch_type$RV100))
        color <- color(normalized)
        list(background = color)
      }
    ),
    CPOE = colDef(
      name = "CPOE",
      style = function(value) {
        value
        normalized <- (value - min) / (max - min)
        #normalized <- (value - min(pitch_type$RV100)) / (max(pitch_type$RV100) - min(pitch_type$RV100))
        color <- color(normalized)
        list(background = color)
      }
    ),
    Player = colDef(
      name = "Player", width = 175, resizable = FALSE
    )
  )
)
```

```{r}
layout / l2
```

*******************************************

# Leaderboard and Full Results

Below is a full table that compares CP, xCP, CPOE, and Passer Effect for the 2021 NWHL skaters who attempted at least 75 passes in the 2021 season + a rundown of some of the results I found interesting.              

Kaleigh Fratkin (the 2021 Defender of the Year) who had 9 assists, fared much better in the Passer Effect metric (ranking 3rd) while ranking 11th in CPOE. Her xCP has on the higher side and, for how high her xCP is, her CPOE is really good! It's just hard to improve on already high xCP passes. The high Passer Effect does indicate that her proclivity to have high xCP passes could be because she's an excellent passer!                 

Mikyla Grant-Mentis (the 2021 MVP) was above average in both CPOE and Passer Effect, but didn't pop at the top of either leader board. McKenna Brand ranked 2nd in both CPOE and Passer Effect, which meshes well with Treisman's findings on Brand leading the NWHL in his play-making ability metric.

```{r}
div(
  class = "standings",
  div(class = "title",
    h2("NWHL CPOE (Completion Percentage Over Expected) and Passer Effects")),
  div(class = "subtitle",
    h4("Minimum 75 passes in 2021 season")
  ),
  tbl2
)
```

```{r}
# options(WordpressLogin = c("BenHowell" = 'Naomi123()'),
#         WordpressURL = 'http://benhowell71.com/xmlrpc.php')
# 
# knit2wp('nwhl_cpoe.Rmd', title = 'Measuring Passing Skill in the NWHL', publish = FALSE,
#         action = "newPost")
```

