Staff
Did LeBron James step up his game in the playoffs?

The Golden State Warriors beat the Cleveland Cavaliers to win the NBA championship despite the best efforts of LeBron James. With the Cavaliers depleted by injuries (particularly to Kevin Love and Kyrie Irving), James was faced with carrying his team against a very talented and well-rounded Warriors team. And he was most certainly up for the challenge: LeBron had an amazing series, shouldering even more responsibility than usual and keeping the series competitive.

LeBron’s performance in the finals got me wondering: Can we pinpoint exactly when he started to increase his output? Did he step up his game for the finals in particular, or had he been ramping it up throughout the playoffs? Or maybe his performance in the finals was nothing unusual, although I seriously doubted that.

First things first. We should plot his data for the entire season. There are many ways to evaluate a basketball player’s impact on the court. But for our purposes, let’s just look at his points scored, rebounds and assists.

The data seem a little too noisy to say confidently where LeBron started to increase his output. His rebounds appear to start increasing around game number 75 (which happens to be the beginning of the playoffs), but it is hard to be sure by eye. So let’s see if we can use a statistical model to help us find the changepoints.

Finding the changepoints

One approach to finding changepoints in our response is to fit a model like

E(points in game 1) = $$\beta_0$$

E(points in game 2) = $$\beta_0 + \beta_1$$

E(points in game 3) = $$\beta_0 + \beta_1 + \beta_2$$

and so on. This model generalizes to:

E(points in game $$j$$) = E(points in game $$j-1$$) + $$\beta_{j-1}$$.

So whenever $$\beta_{j-1}$$ is nonzero, we know that the mean has shifted up or down at game $$j$$. We can use a variable selection technique to tell us exactly which of those parameters should be nonzero. If we use the Lasso for estimation and selection (available in the Generalized Regression platform in JMP Pro), this model is a special case of a model called the fused lasso.
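For readers who want to see the idea outside of JMP Pro, here is a minimal sketch in plain Python. It is an illustration under my own assumptions, not the Generalized Regression implementation: the solver is a simple cyclic coordinate descent, the function names are hypothetical, and the scores are made up rather than taken from LeBron's box scores. The starting level `beta[0]` is left unpenalized, and a nonzero `beta[k]` moves the mean starting at game `k + 1`.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso update step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_changepoints(y, penalty, sweeps=2000):
    """Lasso fit of mean[i] = beta[0] + beta[1] + ... + beta[i] (0-based i).

    Minimizes 0.5 * sum((y[i] - mean[i])**2) + penalty * sum(|beta[k]|, k >= 1)
    by cyclic coordinate descent. beta[0] is the unpenalized starting level.
    """
    n = len(y)
    beta = [0.0] * n
    resid = [float(v) for v in y]  # y[i] - mean[i]; all means start at zero
    for _ in range(sweeps):
        # Starting level: absorb the average residual.
        shift = sum(resid) / n
        beta[0] += shift
        resid = [r - shift for r in resid]
        for k in range(1, n):
            count = n - k  # beta[k] only affects entries k, ..., n-1
            rho = sum(resid[k:]) + count * beta[k]
            new = soft_threshold(rho, penalty) / count
            delta = new - beta[k]
            if delta != 0.0:
                for i in range(k, n):
                    resid[i] -= delta
                beta[k] = new
    return beta


def fitted_means(beta):
    """The running sum of the jumps is the fitted mean for each game."""
    means, level = [], 0.0
    for b in beta:
        level += b
        means.append(level)
    return means


def jump_games(beta, tol=1e-6):
    """1-based game numbers at which the fitted mean shifts."""
    return [k + 1 for k in range(1, len(beta)) if abs(beta[k]) > tol]


# Made-up scores: ten games at 10 points, then ten games at 20 points.
toy_scores = [10.0] * 10 + [20.0] * 10
beta = fit_changepoints(toy_scores, penalty=1.0)
print(jump_games(beta))
print(fitted_means(beta)[0], fitted_means(beta)[-1])
```

A larger `penalty` forces more of the jumps to zero and so reports fewer changepoints. It also pulls the fitted levels on either side of a jump slightly toward each other, which is the usual lasso shrinkage.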

And the model says...

Let’s take a look at the results of our fused lasso model for LeBron’s points, rebounds and assists. The prediction functions for these models give us a much clearer picture than the raw data did. LeBron’s points remained constant throughout the regular season, rose during the playoffs and peaked during the finals. His rebounds steadily increased over the regular season, but increased more dramatically throughout the playoffs. Likewise, his assists jumped up during the playoffs.

You want your superstars to respond on the biggest stage, and I feel like LeBron truly did that. Things looked bleak when both Kevin Love and Kyrie Irving got injured in the playoffs, but the remaining Cavaliers were up for the challenge. The Warriors were expected to run them off the court, but the Cavaliers were able to make it a competitive and entertaining series, thanks in large part to LeBron’s historic performance. And this is high praise considering that the Cavaliers took out my beloved Atlanta Hawks in the Eastern Conference Finals!

Reference

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1), 91-108.

Visitor

Leroy Thacker wrote:

I loved this example! Could you/would you 1) provide the data for the example and 2) walk someone with limited abilities through the actual analysis that produced the example. Thanks!

Staff

Clay Barker wrote:

Hi Leroy,

Glad you like it! As for your questions...

1) There are lots of sites where you can access NBA player data. Try using the "Internet Open" feature in JMP, which is very convenient for getting data from a website into JMP.

2) I think it might be best if I write a follow-up post describing how I did the analysis in JMP Pro, so stay tuned!

Thanks,

Clay

Visitor

Mike Monaco wrote:

Hi Clay,

Great article! Very concise, great analysis and you reminded me about the "internet open" feature which I haven't used yet.

Mike

Visitor

Leroy Thacker wrote:

Thanks for the information on the Internet Open feature... you learn something new every day! And I look forward to the follow-up post describing the analysis.

LRT

Visitor

mary wrote:

I still do not understand the changepoints. Can you explain them, Clay?

Staff

Clay Barker wrote:

Hi Mary,

Thanks for the question. In this setting, a changepoint is really just a point where the mean changes. So in the case of LeBron's assists, my model suggests that his mean number of assists is about 7 for the first 75 games of the season. Then the model suggests that at game 76 his mean number of assists jumps up to over 9 per game. So that makes game 76 a changepoint.
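To make that definition concrete, here is a small Python sketch (not the JMP Pro analysis itself) that scans a piecewise-constant fitted mean and reports the games where it changes. The fitted values are invented to mimic the assists example: 7.0 for games 1 through 75, then a hypothetical 9.2 afterward. The function name is also just for illustration.

```python
def find_changepoints(fitted_means, tol=1e-9):
    """Return 1-based game numbers where the fitted mean differs from the previous game."""
    pairs = zip(fitted_means, fitted_means[1:])
    return [
        game
        for game, (prev, curr) in enumerate(pairs, start=2)
        if abs(curr - prev) > tol
    ]


# Hypothetical fitted means: 7.0 assists for games 1-75, 9.2 from game 76 on.
fitted = [7.0] * 75 + [9.2] * 20
print(find_changepoints(fitted))  # [76]
```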

Thanks,

Clay

Visitor

Daniel wrote:

Hi Clay,

As you said, the data seemed a little too noisy... and your model (or at least the visualization of its outcome, which you presented with the chart lines) seems not to convey that noise or variation in the data...

Also, even in playoffs and finals some figures were worse than in the previous game and the chart of the model seems to only show flat or higher figures...

Am I missing anything?

Staff

Clay Barker wrote:

Hi Daniel,

That's right, I'm only graphing the fitted mean function for my model so it will not convey the variability in the data. I probably should have overlaid those functions on top of the original data so that the variability around the mean was more obvious.

The idea behind a changepoint model like the one I used is that the mean of a response is constant for a while until something happens and the mean shifts either up or down. Because there is still variation around the mean, there are individual games during the playoffs where he scored fewer points or had fewer rebounds. But he had still done enough in the playoffs to suggest that his mean points and rebounds had both shifted up.
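Here is a tiny Python illustration of that point. The numbers are hypothetical, not real box scores: the playoff average is clearly higher, yet several playoff games are worse than the game before, and one even falls below the regular-season average.

```python
from statistics import mean

# Hypothetical points per game, made up for illustration only.
regular_season = [25, 31, 22, 28, 24, 27, 23, 26]
playoffs = [30, 24, 36, 27, 33, 29, 38, 31]

regular_mean = mean(regular_season)
playoff_mean = mean(playoffs)

# Playoff games that fall below the regular-season average.
below_regular_mean = [p for p in playoffs if p < regular_mean]

# Playoff games with fewer points than the previous playoff game.
drops = sum(1 for prev, curr in zip(playoffs, playoffs[1:]) if curr < prev)

print(regular_mean, playoff_mean)  # 25.75 31
print(below_regular_mean)          # [24]
print(drops)                       # 4
```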

Thanks,

Clay