NOTE: This entry comes to the JMP Blog from our colleague Jerome Bryssinck of SAS Belgium. Jerome had seen Jeff Perkinson's examples of basketball analytics using JMP and created his own example using football (or soccer) data. In response to comments from readers, Jerome updated his model on May 26, and this blog post now reflects those changes.
THE QUESTION: Has the game been decided yet? (HTGBD)
This is the question that most people constantly ask themselves when they are watching a football game. It can take different forms depending on the circumstances. If you're lucky enough to support the winning team, you might ask yourself: "How secure is the lead?" And for the less fortunate among us: "Is there still a chance for my team to win?"
Graph1: Probability that the game has been decided as a function of the elapsed time and the goal difference.
Graph1 shows the probability that the game has been decided as a function of the elapsed time and the goal difference. You can change the elapsed time and the goal difference on the graph by clicking on a different value.
Some interpretation examples:
If Time=45 and Goal Difference=0: The game has been going on for 45 minutes, and the score is level. There is a 23% probability that the outcome of the game won't change. Because the teams are even (0 goal difference), this means there is a 23% probability that the game will end in a tie.
If Time=45 and Goal Difference=1: The game has been going on for 45 minutes, and one of the teams is leading by 1 goal. There is a 60% probability that the outcome of the game won't change. Here, this means that the leading team has a 60% probability of winning.
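To make the idea of "the outcome won't change" concrete, here is a minimal Python sketch of how a single match could be labeled at a given minute. It assumes that "decided" means the result at that minute (home win, away win or tie) is the same as the final result. The function names and the example match are hypothetical and are not taken from the original data.

```python
def outcome(home_goals, away_goals):
    """Return 'home', 'away' or 'draw' for a given score."""
    if home_goals > away_goals:
        return "home"
    if away_goals > home_goals:
        return "away"
    return "draw"


def state_at(minute, home_goal_minutes, away_goal_minutes):
    """Return (absolute goal difference, decided flag) at the given minute.

    Assumption: 'decided' means the result at this minute equals the
    final result of the match.
    """
    home_now = sum(1 for m in home_goal_minutes if m <= minute)
    away_now = sum(1 for m in away_goal_minutes if m <= minute)
    final = outcome(len(home_goal_minutes), len(away_goal_minutes))
    decided = outcome(home_now, away_now) == final
    return abs(home_now - away_now), decided


# Hypothetical match: home scores at 30' and 80', away scores at 60'.
home, away = [30, 80], [60]
print(state_at(45, home, away))  # (1, True): home leads 1-0 and goes on to win
print(state_at(70, home, away))  # (0, False): level at 1-1, but home wins in the end
```

Applying a function like this to every match at every minute would give one row per (match, minute), which is the kind of table a model of HTGBD could be built on.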
More Details about the Answer
The model used above has been built using data from the UK Premier League from 2002 to 2006. The type of model used is a regression model.
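The post does not say which kind of regression was used. For a yes/no response such as HTGBD, one common choice is logistic regression, so the form below is only an assumed illustration, with t the elapsed time in minutes, d the goal difference, and the coefficients β left unspecified:

```latex
P(\text{decided} \mid t, d) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 t + \beta_2 d)}}

% Log-odds of the two 45-minute readings quoted earlier (23% and 60%):
\ln\frac{0.60}{0.40} - \ln\frac{0.23}{0.77} \approx 0.405 - (-1.208) \approx 1.61
```

The second line is plain arithmetic on the two probabilities quoted in the interpretation examples. It shows how far apart they are on the log-odds scale and does not report a fitted coefficient.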
The following graphs help in understanding the underlying data.
Graph2: Has the Game Been Decided vs. Time
Graph2 shows the percentage of the games that have been decided as a function of the Elapsed Time. I must say that I wasn't surprised by this graph, which basically states that the HTGBD (Has The Game Been Decided) percentage rises with the Elapsed Time: the later in the game, the more likely it is that the outcome is settled.
<img width='400' height='291' style="border: 0px; padding-left: 5px; padding-right: 5px;" src="http://blogs.sas.com/jmp/uploads/Soccer1.gif" alt="Graph in JMP of Has the Game Been Decided vs. Time By Goal Difference" />
Graph3: Has the Game Been Decided vs. Time By Goal Difference
Graph3 shows the percentage of the games that have been decided as a function of the Elapsed Time, broken down by goal difference. According to this graph, the goal difference is an excellent predictor for the HTGBD.
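The percentages plotted in Graph2 and Graph3 can be thought of as simple group summaries. The Python sketch below shows one way such a summary might be computed from rows of (minute, goal difference, decided). The function name and the sample rows are made up for illustration and are not Premier League data.

```python
from collections import defaultdict


def percent_decided(observations):
    """Map (minute, goal_difference) -> percentage of rows flagged as decided."""
    totals = defaultdict(int)
    decided_counts = defaultdict(int)
    for minute, goal_diff, decided in observations:
        key = (minute, goal_diff)
        totals[key] += 1
        decided_counts[key] += int(decided)
    return {key: 100.0 * decided_counts[key] / totals[key] for key in totals}


# Made-up rows for illustration only (not Premier League data).
sample = [
    (45, 0, False),
    (45, 0, True),
    (45, 1, True),
    (45, 1, True),
    (45, 1, False),
    (45, 0, False),
]
table = percent_decided(sample)
for key in sorted(table):
    print(key, round(table[key], 1))
```

Grouping by minute alone, instead of by minute and goal difference, would give the single curve of Graph2 rather than the separate curves of Graph3.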