Analyzing and Improving an MLB Pitcher's Decision Making and Execution with Machine Learning (2021-US-EPO-871)
A pitcher in Major League Baseball relies on a combination of strategy, deception, skill, and execution to be successful. Their sole job is to get an out for each batter they face.
To do so, they analyze the situation before each pitch and ask themselves questions like, “How many balls and strikes are there? Are any runners on base? What’s the score?” They also weigh decisions like, “What pitch should I throw, and where? How fast should I throw it?”
Based on the answers to these questions, the pitcher carefully decides which approach they think has the best chance of sending each batter back to the dugout. The release point of the ball, the spin rate, and the amount of break all contribute to how well they physically execute each pitch.
Most pitchers are likely aware of their biggest strengths, but do they have any hidden strengths that aren’t being used to their full potential? How do a pitcher’s actual success and potential success stack up against those of other pitchers?
These questions are answered using JMP 16 Pro’s new enhanced log and model screening features, JMP’s R functions to access data from Baseball Savant at MLB.com through Bill Petti’s “baseballr” package, and more.
Important Note
- In the abstract (and at 1:28 in the video), I mention using JMP's R functions to access the "baseballr" package. This data was originally accessed a couple of years ago under different versions of JMP and R, and unfortunately the two are not currently compatible. I have included two options below for accessing the data used in the presentation.
- Option 1: Copy the contents of "baseballr_2018_season_accumulation_script.txt" into an R script and run it in RStudio.
  - This option takes longer than Option 2, but it shows how the data is pulled in R via the "baseballr" package.
  - Requires R and RStudio to be installed: https://cran.r-project.org and https://www.rstudio.com/products/rstudio/download
  - The R script takes several minutes to run because each pull is limited in size, and it saves "MLB 2018 Regular Season.csv" to your current working directory in RStudio.
- Option 2: Download "MLB 2018 Regular Season.csv" from the following link:
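For Option 1, the several-minute run time follows from the per-pull size limit: a full season has to be requested in small date windows and the pieces stacked together. The sketch below is not the author's R script. It is a minimal Python illustration of the windowing step only, and the season start and end dates, the 7-day window length, and the `date_windows` helper are all assumptions made for illustration. Each window would then be handed to whichever "baseballr" pull function the R script uses.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Split the inclusive range [start, end] into consecutive windows
    of at most `days` days each (hypothetical helper)."""
    if days < 1:
        raise ValueError("days must be at least 1")
    windows = []
    cursor = start
    while cursor <= end:
        # The final window is clipped so it never runs past `end`.
        window_end = min(cursor + timedelta(days=days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


# Assumed bounds for the 2018 regular season; check them against
# the dates used in the actual R script.
SEASON_START = date(2018, 3, 29)
SEASON_END = date(2018, 10, 1)

windows = date_windows(SEASON_START, SEASON_END)
for first, last in windows[:3]:
    print(first.isoformat(), "to", last.isoformat())
print(len(windows), "pulls needed")
```

A shorter window means more requests but a smaller result per request. The right length depends on the size limit the data source actually enforces, which is not stated here.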
To Run the Enhanced Log Script
- Open "MLB 2018 Regular Season.csv" in JMP 16 Pro (this will take a few minutes).
- Run "enhanced_log_script.jsl" to perform the data manipulation and model screening actions explained in the presentation.