Ben Howell http://benhowell71.com Student at the University of Texas Tue, 08 Nov 2022 21:24:01 +0000 en-US hourly 1 https://wordpress.org/?v=6.5.2 181209194 Game Score in the PHF: Individual Player Reports http://benhowell71.com/game-score-in-the-phf-individual-player-reports/?utm_source=rss&utm_medium=rss&utm_campaign=game-score-in-the-phf-individual-player-reports Tue, 08 Nov 2022 21:23:58 +0000 https://benhowell71.com/?p=614 With the 2022-2023 PHF season kicking off last weekend, I wanted to provide more analytical content about the PHF and its games/players. The first major step in that direction is my new (and pretty!) set of game score-by-player reports for every team (and player).

I posted the reports from Sunday’s games to my Twitter since I didn’t finish creating them until last night (Monday evening). Going forward, these reports will be a part of my automated reports/tweeting for each game, along with shot pressure charts for each team. This is going to make them incredibly easy to access and find and I look forward to them helping discussions around each game and its players.

While the charts are just gorgeous, I happen to think they’re also very informative, but it’s important that people interpret these in context. To that end, I’ve done a quick writeup of Game Score and where it stands right now, as well as the changes that I do plan on implementing moving forward. You can find that writeup below.

]]>
614
Measuring Faceoff Skill in the PHF: Elo Ratings http://benhowell71.com/phf-faceoff-elo/?utm_source=rss&utm_medium=rss&utm_campaign=phf-faceoff-elo Wed, 17 Nov 2021 05:34:41 +0000 http://benhowell71.com/?p=581 I come from a non-hockey background, so faceoffs are an aspect of hockey that I haven’t thought much about. However, a recent Mike Murphy (@DigDeepBSB) tweet inspired me to look at faceoffs a little more closely.

I don’t want to spoil my article on measuring faceoffs, but it turns out that Jillian Dempsey is really fricking good. You can check out the article below to read about my process and find the results from my methods. (You can also view the results in a more interactive visual format on this Shiny app).

You can share your thoughts with me on Twitter @benhowell71 and let me know if you have questions about the project or the 2022 projections.

]]>
581
Aging Curves in Women’s Hockey: NWHL Edition http://benhowell71.com/aging-curves-in-womens-hockey-nwhl-edition/?utm_source=rss&utm_medium=rss&utm_campaign=aging-curves-in-womens-hockey-nwhl-edition Sun, 18 Jul 2021 21:49:14 +0000 http://benhowell71.com/?p=571 At the 2021 WHKYHAC Conference, I presented research on exploring the aging curve in the NWHL. In addition, I put together a write-up of my analysis and work process, which you can find below. The code and data for this project can be found on my GitHub page.

If you’re just looking for the 2022 NWHL Game Score per Game projections that I talked about in my presentation, they are at the bottom of the write-up below.

You can share your thoughts with me on Twitter @benhowell71 and let me know if you have questions about the project or the 2022 projections.

]]>
571
Measuring Passing Skill in the NWHL http://benhowell71.com/measuring-passing-skill-in-the-nwhl/?utm_source=rss&utm_medium=rss&utm_campaign=measuring-passing-skill-in-the-nwhl Sat, 05 Jun 2021 21:41:15 +0000 http://benhowell71.com/?p=550 Using the data from the 2021 NWHL season that was made public for the Big Data Cup, I took a stab at quantifying passing skill in the NWHL using two types of measures. You can find the code for this project on my GitHub page and check out the article by clicking on the link below.

You can share your thoughts with me on Twitter @benhowell71 or you can reach out if you want to see different players featured in a CPOE distribution graphic!

]]>
550
Modeling Plate Discipline from the KBO to MLB http://benhowell71.com/modeling-plate-discipline-from-the-kbo-to-mlb/?utm_source=rss&utm_medium=rss&utm_campaign=modeling-plate-discipline-from-the-kbo-to-mlb Thu, 11 Mar 2021 03:00:30 +0000 http://benhowell71.com/?p=518 Plate discipline (BB%, K%, and BB/K) is one of the biggest concerns for hitters making the transition from the KBO to MLB. Using data charted for the KBO Wizard, I explored how to project plate discipline from the KBO to MLB using Swing%, Contact%, and similar approach metrics, with Kim Ha-seong and Na Sung-bum’s player profiles as case studies.

I presented this project and my findings at the 2021 SABR Analytics Conference on March 12th, 2021; you can view the video presentation on YouTube.

To my knowledge, this is the first such analysis to approach this problem using swing-level metrics, a departure from using KBO BB%/K% and applying a competition level adjustment as most projection systems do.

(Disclaimer: I will be joining the San Diego Padres as a Baseball Research and Development Intern this summer. The San Diego Padres signed Kim Ha-seong, who is discussed at length in this paper, to a 4-year deal this offseason. I would like to stress that the data and work contained in this project are mine and are in no way representative of the San Diego Padres or their thought process surrounding Kim Ha-seong.)

Modeling-Plate-Discipline-from-the-KBO-to-MLB

]]>
518
How Do We Get There: Quantifying Pass Types and Their Value http://benhowell71.com/big-data-cup-submission/?utm_source=rss&utm_medium=rss&utm_campaign=big-data-cup-submission Fri, 05 Mar 2021 04:38:33 +0000 http://benhowell71.com/?p=502 Below is my submission for the 2021 Big Data Cup, hosted by Stathletes. I focused on developing a framework for measuring the value of player shots and movements, including the actions that occur before a potential assist. I outline my process for developing Expected Attack Value (xAV) below and explore some trends among common pass types.

My submission placed second in the College Division of the Big Data Cup in 2021; you can find my video presentation on YouTube from when I presented it at the Ottawa Hockey Analytics Conference in March 2021.

You can find an accompanying R Shiny App (for an Interactive Movement% Plot and full xAV Leaderboard) here. The code and paper for this project are also located on my GitHub page.

Ben-Howell-Big-Data-Cup-Paper

]]>
502
WNBA Free Agency and Player Archetypes http://benhowell71.com/wnba-free-agency-and-player-archetypes/?utm_source=rss&utm_medium=rss&utm_campaign=wnba-free-agency-and-player-archetypes Wed, 27 Jan 2021 17:23:22 +0000 http://benhowell71.com/?p=459 WNBA Free Agency kicks fully into gear on February 1st when teams and players are officially able to sign deals. While there hasn’t been much action (yet), it sounds like things are brewing behind the scenes.

There are quite a few stars who received core designations (Liz Cambage, Nneka Ogwumike, and Natasha Howard), plus an interesting list of reserved and restricted free agents. But among the unrestricted free agents, numerous WNBA superstars are eligible to sign with any team who will have them.

Diana Taurasi, Sue Bird, Candace Parker, Alyssa Thomas, Emma Meesseman, Kayla McBride, Erica Wheeler. The list goes on and on, making it a distinct possibility that we see multiple seismic shifts in the WNBA landscape this offseason.

We know that the WNBA is dominated by tall, athletic post players, like Elena Delle Donne, Cambage, etc. The NBA has moved into an era of “position-less” basketball, an era where LeBron James is a point guard and PJ Tucker is a center (sometimes). This has necessitated a shift from the PG-SG-SF-PF-C lineup and requires that we group players by their skill and usage, rather than how tall they are.

The WNBA hasn’t seen a shift that drastic (yet), but they still need a diversified approach to the game. Running out an All-Star lineup of Liz Cambage, Candace Parker, Elena Delle Donne, Breanna Stewart, and A’ja Wilson probably isn’t the best possible WNBA lineup (actually, on second thought, it might be since each of those players is a freaking stud). You want to complement those kinds of players with your Taurasi and Bird types who can help space the floor.

But how do we know who fits into which role? Well, that’s why I created an app that allows you to compare players in 3D, as well as a clustering method I’m going to walk through in this article. I would be remiss if I didn’t mention that this project was heavily inspired by Todd Whitehead’s NBA lineup work and Alex Stern’s NBA Clustering project, which introduced me to some of these concepts.

You can find the WNBA Clustering App here.

The WNBA App draws on a variety of advanced stats, pulled from basketball-reference for the 2016-2020 seasons, which I used to create the app and these clusters. You can find a full breakdown of the stats available on the WNBA App. On the app, you can hover over the bubbles to see pertinent information about the player you are viewing.

The App contains an offensive and defensive leaderboard of these advanced stats plus one more tab, titled Similarity Scores. This is where you can see which seasons and players have similar play styles. Over the rest of this article, I’m going to break down the process of creating these scores and my clustering approach. You can find the code and data that I used for this project here.

I used two different clustering methods on this problem, which returned similar results. In the end, I opted for the Gaussian Mixture Model, which determines the optimal number of groups on its own, in this case, 6 different player archetypes. (When I first ran this model, I included Win Shares in the model, which is essentially a measure of how much a player contributes to winning. The mixture model returned three groups, a “good” group, an “average” group, and a “bad” group. That’s so helpful. Anyway, I removed those variables and got better results).

Results from my Mixture Model which returned six different clusters

Here are the most representative seasons by cluster (note, not the BEST seasons per cluster, but the ones that are most representative); some pretty clear patterns emerge.

  • Cluster 1: Elena Delle Donne 2019 (and 2017 and 2016), Breanna Stewart 2018, Brittney Griner 2019
  • Cluster 2: Karima Christmas-Kelly 2016 (and 2017), Tierra Ruffin-Pratt 2020, DeWanna Bonner 2016
  • Cluster 3: Kiah Stokes 2017 (and 2016), Rachel Hollivay 2016, and Alaina Coates 2018
  • Cluster 4: Shekinna Stricklen (2016-2020, literally all in the top 9, she IS Cluster 4), Sydney Wiese 2017, Jordan Hooper 2017
  • Cluster 5: Marie Gulich 2019, Amanda Zahui B. 2020, Bella Alarie 2020
  • Cluster 6: Courtney Vandersloot 2020 (and 2017-2019, literally the top 4 seasons), Sue Bird 2018, Lindsay Allen 2017
The average of each stat, shown for each cluster

What can this visualization (and these seasons) tell us about the cluster designations?

Cluster 1 is the superstar range. These are the players who play stellar defense, grab rebounds, and shoot incredibly efficiently within the paint, while also managing to distribute on offense. Some of the players here shoot the three-ball, but it’s not a requirement.

Cluster 2 is home to players who are about average. These are primarily wing players who can drive to the hoop (above-average Free-throw rate), but don’t do anything that pops off the page.

Cluster 3 is basically a low-budget superstar. They don’t create for others (Assist%) the way the stars do, but still provide decent defense, good rebounding, and paint play.

Cluster 4 players shoot a lot of three-pointers and are a nuisance on defense as a steals threat. That’s it.

Cluster 5 players are interesting, a combination of post play with a little bit of three-point range. Amanda Zahui B. in particular popped out to me, as the default app setting is 3PAr, BLKpct, and USGpct, putting Zahui as one of the only players who does a bit of all that.

Cluster 6, or as I like to call it, the Point God Cluster, is home to the best of the best floor generals in the WNBA. A decent amount of three-point shooting, rarely turns the ball over, and a TON of assists.

When you look at this page on the WNBA App, the lower the Uncertainty, the more that player fits in their designated cluster. Using this clustering system, there are a few particularly notable free agents (using their 2020 designation):

  • Both Cheyenne Parker and Candace Parker were Cluster 1 players in 2020 and could subsequently be in line for big pay-days.
  • Natalie Achonwa was a Cluster 3 player with the Fever. Could she make the jump to Cluster 1 in a new role?
  • Jasmine Thomas was solidly in Cluster 6 thanks to her 2020 season; maybe a rebuilding team makes her a focal point in their offense if she’s not retained by the Sun.
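
For readers curious how a cluster label and an Uncertainty value can come out of a mixture model, here is a minimal sketch. It assumes the common convention (used, for example, by the mclust R package) that uncertainty is one minus the largest cluster-membership probability; the app's exact definition isn't spelled out in this article, and the season names and probabilities below are made up for illustration.

```python
from typing import List, Tuple


def assign_cluster(posteriors: List[float]) -> Tuple[int, float]:
    """Return (cluster number, uncertainty) for one player-season.

    Assumption: uncertainty = 1 - largest membership probability.
    Cluster numbers are 1-based to match the Cluster 1-6 labels.
    """
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return best + 1, 1.0 - posteriors[best]


# Hypothetical membership probabilities across the six clusters.
seasons = {
    "Season A": [0.01, 0.01, 0.01, 0.01, 0.01, 0.95],  # clear-cut fit
    "Season B": [0.40, 0.05, 0.45, 0.04, 0.03, 0.03],  # borderline fit
}

for name, probs in seasons.items():
    cluster, uncertainty = assign_cluster(probs)
    print(f"{name}: Cluster {cluster}, uncertainty {uncertainty:.2f}")
```

Under this convention, a season split almost evenly between two clusters (like the made-up Season B) carries a high uncertainty even though it still receives a single label.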

This is intended to be an initial look into the world of classifying WNBA players by their tendencies. The WNBA App is an interactive way for fans and analysts to explore player archetypes in 3D to provide a well-rounded look at each player.

]]>
459
The Best Pitch in the KBO: Introducing KBO Run Values http://benhowell71.com/kbo-run-values/?utm_source=rss&utm_medium=rss&utm_campaign=kbo-run-values Mon, 18 Jan 2021 21:52:42 +0000 http://benhowell71.com/?p=437 Assigning a value to an individual pitch and its result is not a new concept. While we have stats like FIP and xwOBA that we can use to evaluate a pitcher as a whole, it’s important to know which pitches fare the best, whether it’s so a pitcher can optimize their arsenal or a hitter knows what to look for when they’re at the plate.

One way to do that is to look at a variety of stats, like CSW% or xwOBA. Yet those stats look at a subset of results, whether it be called or swinging strikes, or the result of a plate appearance. What they don’t account for is how much a result is worth in how it changes the structure of the at-bat.

Intuitively, we know that 0-2 counts are much better than 3-0 counts for a pitcher; after all, you’re only one pitch away from a strikeout. What Run Value tries to do is account for the change in count states, like going from 0-0 to 0-1 or 2-2 to 3-2, by how run potential changes from state to state, usually on a pitch type level.

Take this 99-mph Ahn Woo-jin fastball that lands out of the zone. That’s a tough pitch to hit velocity-wise, but since it’s outside of the zone, the likelihood of allowing a hitter-friendly outcome to Kim Jae-hwan rises (FWIW, Ahn’s fastball has the fifth-best RV/100 among fastballs). We want a way to measure how effective a pitch is, rather than a measure based on pitch characteristics like velocity, spin, and movement. With Run Value, this is a “bad” pitch because it takes the count from 2-2 to 3-2.

Ahn Woo-jin throwing 99 mph (160 km/h)

You can find the MLB leaders on Baseball Savant, where they define Run Value as “the run impact of an event based on the runners on base, outs, ball, and strike count”. This project was inspired by Ethan Moore’s quest to create an xRV metric for the 2020 season, which you can find here.

With the nearly 30,000 pitches that I charted from the 2020 KBO season, I created Run Values for the KBO and the pitches that I tracked and will be detailing that process and some of my findings in this article. You can find the code for this project here and can check out the leaderboard that I added to the KBO Wizard. Back in July, I wrote about some of the Nastiest Pitches in the KBO, but using Run Values, we can see which pitches were the most effective.

The wOBA weights referenced in this article are from Statiz, and you can find their version of a ‘GUTS!’ page here.

I have previously detailed my process of calculating wOBA and Estimated xwOBA (ExwOBA) here, and I base my Run Values per count on ExwOBA. The first step of calculating Run Value is figuring out how the average result (ExwOBA in this case) changes by count state in the KBO. From there, it’s a matter of translating the ExwOBA into a measure of runs; fortunately, we can translate wOBA into Weighted Runs Above Average (wRAA), a measure of how many runs a hitter contributed, with zero as the league average.

The equation is relatively simple, involving the wOBA value, league average wOBA, and the wOBA scale: wRAA = ((wOBA – league wOBA) / wOBA scale) × PA.

We apply the formula to our values and return the wRAA value for each count state (Statiz lists the KBO average wOBA as 0.347. However, the charted data skews toward the best pitchers in the KBO, resulting in an average of 0.339, which is the value used for the KBO average wOBA).
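
As a rough illustration of the formula above, here is a minimal Python sketch (not the code from the project). The league wOBA of 0.339 and the two ExwOBA values come from this article; the wOBA scale of 1.19 is a hypothetical stand-in, since the scale actually used isn't stated here, and PA is set to 1 so the result is per plate appearance.

```python
# League-average wOBA for the charted sample, as stated in the article.
LEAGUE_WOBA = 0.339
# Hypothetical wOBA scale: NOT from the article, used for illustration only.
WOBA_SCALE = 1.19


def wraa(woba: float, league_woba: float = LEAGUE_WOBA,
         woba_scale: float = WOBA_SCALE, pa: float = 1.0) -> float:
    """wRAA = ((wOBA - league wOBA) / wOBA scale) * PA."""
    return (woba - league_woba) / woba_scale * pa


# ExwOBA for two count states, taken from the article's count table.
exwoba_by_count = {"0-0": 0.377, "0-1": 0.367}

# Per-plate-appearance wRAA for each count state.
wraa_by_count = {count: wraa(value) for count, value in exwoba_by_count.items()}

for count, value in wraa_by_count.items():
    print(count, round(value, 3))
```

With the assumed scale, the 0-0 and 0-1 states land at roughly 0.032 and 0.024, in line with the figures quoted for those counts; a different scale would shift every value proportionally.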

The next step is figuring out how a ball or a strike changes wRAA in each count state. To do that, we take the wRAA for an 0-1 count (0.024) and subtract the wRAA for an 0-0 count (0.032). In this instance, the value of throwing a strike in an 0-0 count, leading to an 0-1 count, is -0.008. This makes sense; throwing a first-pitch strike is detrimental to a hitter (hence the negative value), but not by too much; the KBO has a high Ball in Play%, so first-pitch strikes aren’t as crucial as they may be in MLB. The value of throwing a strike or ball in each count is listed below.

Count   wOBA    ExwOBA   Value of Strike   Value of Ball
0-0     0.382   0.377    -0.008             0.018
0-1     0.346   0.367    -0.028             0.001
0-2     0.286   0.334    -0.288            -0.112
1-0     0.402   0.398    -0.026            -0.009
1-1     0.425   0.368    -0.142             0.024
1-2     0.204   0.204    -0.176             0.019
2-0     0.369   0.387     0.008             0.264
2-1     0.388   0.396    -0.147             0.161
2-2     0.222   0.226    -0.194             0.132
3-0     0.687   0.693    -0.096             0.035
3-1     0.575   0.582    -0.175             0.131
3-2     0.372   0.379    -0.327             0.306
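
The subtraction that produces each cell can be sketched in a few lines of Python. This is only an illustration: the two wRAA values are the ones quoted for 0-0 and 0-1 counts, the function names are hypothetical, and terminal states (walks, strikeouts) would need their own wRAA entries, which aren't listed in this article.

```python
from typing import Dict


def next_count(count: str, result: str) -> str:
    """Return the count after a ball or strike, or a terminal label."""
    balls, strikes = (int(part) for part in count.split("-"))
    if result == "ball":
        balls += 1
    elif result == "strike":
        strikes += 1
    else:
        raise ValueError(f"unknown result: {result!r}")
    if balls == 4:
        return "walk"
    if strikes == 3:
        return "strikeout"
    return f"{balls}-{strikes}"


def pitch_value(count: str, result: str, wraa_by_state: Dict[str, float]) -> float:
    """Value of a pitch = wRAA of the new state minus wRAA of the old state."""
    return wraa_by_state[next_count(count, result)] - wraa_by_state[count]


# wRAA for the two count states quoted in the article.
wraa_by_state = {"0-0": 0.032, "0-1": 0.024}

first_pitch_strike = pitch_value("0-0", "strike", wraa_by_state)
print(round(first_pitch_strike, 3))
```

Note that this sketch treats every strike the same way; a foul ball with two strikes, for example, would need separate handling.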

All but three of these 24 values make sense: throwing strikes is good for a pitcher (negative values), and throwing balls is bad (positive values). However, there are a few oddities, notably the value of throwing a strike in a 2-0 count, which is worth 0.008 wRAA, a bad result for a pitcher. The ExwOBA in a 2-0 count is slightly lower than it is in a 2-1 count, which is responsible for this discrepancy. Given that the K% in the KBO is significantly lower than in MLB, I’m not surprised that there’s very little difference between 2-0 and 2-1 counts.

Another strange result is that throwing a ball in an 0-2 count is a good result for a pitcher with a -0.112 value, though it pales in comparison to the -0.288 for throwing a strike (which leads to a strikeout). My thought here is that 0-2 counts increase the likelihood of a bad swing/batted ball, while a pitch out of the zone in that count is designed to get a chase swing. If the hitter doesn’t swing, a 1-2 count is still a tough situation.

There’s another strange result where a ball in a 1-0 count is worth -0.009, good for the pitcher. The ExwOBA in 2-0 counts is slightly lower than 1-0 counts, something that I chalk up to the “small” data set of 30,000 pitches that I charted.

On the whole, these results are similar to ones found by Dan Meyer. The KBO and MLB have different play styles (which I believe accounts for some of the stranger results). I proceeded with this project after exploring why those results occurred.

From there, it was a matter of re-structuring the results and joining them to my charted data. Once we did that, we were able to produce final Run Value results for pitchers and their pitch types in the KBO from the 2020 season. The leaders in Run Value per 100 pitches are shown below.

Player          Pitch       Usg%   Velocity   Run Value   RV/100
Aaron Brooks    Slider      23.4   86.7       -22.9       -6.7
Dan Straily     Slider      33.5   83.8       -34.3       -5.9
Jake Brigham    Slider      19.9   83.9       -13.7       -5.9
Chris Flexen    Curveball   12.5   76.1       -12.5       -5.9
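
To make the join-and-aggregate step concrete, here is a small Python sketch rather than the project's actual code. It assumes RV/100 is simply total Run Value per 100 pitches, which matches the Raúl Alcántara figures cited later in this article (-61 over 1417 pitches at -4.3); the per-pitch values come from three rows of the count table, while the pitcher name and pitch log are hypothetical.

```python
from collections import defaultdict

# (count, result) -> run value, copied from three rows of the count table.
PITCH_VALUES = {
    ("0-0", "strike"): -0.008, ("0-0", "ball"): 0.018,
    ("0-1", "strike"): -0.028, ("0-1", "ball"): 0.001,
    ("2-2", "strike"): -0.194, ("2-2", "ball"): 0.132,
}

# Hypothetical charted pitches: (pitcher, pitch type, count, result).
charted = [
    ("Pitcher A", "Slider", "0-0", "strike"),
    ("Pitcher A", "Slider", "0-1", "strike"),
    ("Pitcher A", "Slider", "2-2", "ball"),
    ("Pitcher A", "Fastball", "0-0", "ball"),
]


def rv_per_100(run_value: float, pitches: int) -> float:
    """Assumed definition: total Run Value scaled to 100 pitches."""
    return 100 * run_value / pitches


def summarize(pitches):
    """Join each pitch to its value, then total by pitcher and pitch type."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [run value, pitch count]
    for pitcher, pitch_type, count, result in pitches:
        entry = totals[(pitcher, pitch_type)]
        entry[0] += PITCH_VALUES[(count, result)]
        entry[1] += 1
    return {key: (rv, n, rv_per_100(rv, n)) for key, (rv, n) in totals.items()}


summary = summarize(charted)
for (pitcher, pitch_type), (rv, n, rate) in summary.items():
    print(pitcher, pitch_type, round(rv, 3), n, round(rate, 1))
```

The sketch only handles called balls and strikes; balls in play and other outcomes would need their own values before a real leaderboard could be built this way.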

Here is a representation of Run Value on every charted KBO pitch; it’s interesting to note that pitches on the corners of the zone fared the best consistently. After all, if you get a swing that far out, it’s bound to be a good result for a pitcher. The effect of called strikes and fouls (which were treated as strikes for this project) also helps to keep the strike zone a place worth attacking.


Here’s a breakdown of some of the top pitches in the KBO by RV/100.

Aaron Brooks’ slider leads the way by a large margin, with a -6.7 RV/100. It’s worthy of the title, with a 23.7% SwStr% (highest on an individual pitch in the KBO), a 0.224 ExwOBA, and a 61% GB%. Hitters stood no chance against it all year long, especially when he paired it with his changeup. It was most effective working down-and-away from RHH or on the edges of the zone against LHH.

Aaron Brooks sliders

Dan Straily’s slider narrowly beats out two other breaking pitches for the second spot, with a -5.9 RV/100. He rode the pitch to 205 strikeouts in 2020 (making him the world-leader in Ks in 2020), and you likely saw his highlights all over Twitter back in May. He threw it 34% of the time, netting a 20.6% SwStr%.

He threw it in all kinds of high-leverage situations, like 40% of the time in 0-2 counts. Throwing the pitch in those situations, and getting strikes, is why his slider grades out as the second-best pitch in the KBO. Against LHH, he worked his slider on the edges of the zone, while it acted as a whiff pitch down-and-away from RHH.

Dan Straily Sliders

Jake Brigham’s slider comes in third place here at -5.9 RV/100. Interestingly, his curveball is fifth at -5.5 RV/100. But, despite those two stellar pitches, Brigham wasn’t re-signed by the Kiwoom Heroes (though he has since joined the CPBL in Taiwan) due to a poor track record of health in 2020. Regardless, his breaking pitches amounted to 43% of his total pitches and were phenomenal offerings. I thought he should try to increase his usage of the pitches, and we’ll see if he makes any adjustments in the CPBL.

Jake Brigham slider/curveball mix

Fourth-place belongs to Chris Flexen and his curveball. The 26-year-old Flexen signed with the Seattle Mariners following his stellar 2020 season, and the curveball was part of the reason why; it sported a 22% SwStr%, and a 0.197 ExwOBA; that’s pretty good.

While his curveball velocity held at about 76 mph, the Mariners liked how the shape of his curveball changed in the KBO, becoming a pitch that may play better against MLB hitters than his original offering. The pitch was extremely effective down against LHH, while he showed the ability to throw it for whiffs or called strikes against RHH.

A fastball/curveball overlay from Flexen to showcase how well they work together

This stat is modeled on the Run Value and RV/100 hosted on Baseball Savant, though the two shouldn’t be compared to each other (naturally, given that they are for two separate leagues). In 2019 (using the full MLB season to comp to the full KBO season), the best RV/100 belonged to Felipe Vazquez’s slider at -6.2. The best cumulative total was Gerrit Cole’s fastball at -36 Run Value; in the KBO, the best (lowest) cumulative Run Value belonged to Raúl Alcántara’s fastball at -61, since he threw it 1417 times at a -4.3 RV/100 clip.

This is not a perfect stat, but another tool to help facilitate evaluating KBO pitches. While 0 RV/100 is an average pitch, many of the pitches in the KBO Wizard will register with an RV/100 better than 0 since I focused on charting the good pitchers, not the below-average pitchers. This stat is best used as a comparison to other pitchers and their pitches.

You can find a full leaderboard of Run Values and RV/100 on the KBO Wizard.

]]>
437
KBO End of Season Reports: Koo Chang-mo http://benhowell71.com/koo-chang-mo/?utm_source=rss&utm_medium=rss&utm_campaign=koo-chang-mo Mon, 04 Jan 2021 16:06:41 +0000 http://benhowell71.com/?p=430 With his slider and splitter combo, Koo Chang-mo electrified the world in the early parts of the KBO season. How could that skill and potential set Koo up for MLB success? Koo-Chang-mo-Report

Check out Koo Chang-mo’s player page on the KBO Wizard and enjoy this GIF of his slider and splitter combination.

]]>
430
KBO End of Season Reports: Ahn Woo-jin http://benhowell71.com/ahn-woo-jin/?utm_source=rss&utm_medium=rss&utm_campaign=ahn-woo-jin Sat, 02 Jan 2021 20:01:53 +0000 http://benhowell71.com/?p=426 Sporting an MLB-caliber fastball, Ahn Woo-jin is one of the best strikeout pitchers in the KBO. At 21 years old, he’s still developing his approach; here’s why MLB teams should be interested once he’s eligible to be posted. Ahn-Woo-jin

Check out Ahn Woo-jin’s profile page on the KBO Wizard and enjoy this GIF of him touching 99 mph on his fastball against Kim Jae-hwan:

]]>
426