Monte Carlo simulation of 2012 presidential election
Nov 13, 2012 2:58 PM
There were probably more state and national polls of the 2012 election for President of the United States than of any previous election. Despite this, many pundits were shocked by the election result. Most national polls showed the two candidates' expected vote percentages within each other's margin of error. As a result, many commentators viewed the presidential election as "too close to call."
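The state-by-state polls told a very different story. Romney had incontestable leads in 23 states, for 191 electoral votes. Similarly, Obama had 17 states and the District of Columbia solidly in his column, for a total of 217 electoral votes (the 538 electoral votes minus Romney's 191 and the 130 at stake in the remaining states).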
The key to the election was how the candidates would divide the electoral votes of the remaining 10 states. These states -- Colorado, Florida, Iowa, Nevada, New Hampshire, North Carolina, Ohio, Pennsylvania, Virginia and Wisconsin -- in many cases were polled multiple times per day by different organizations in the week leading up to the election.
There was substantial variation across the polls within each state. In almost all the states, there was at least one poll showing Obama in the lead and another showing Romney in the lead. By "cherry picking" the poll that favored a particular candidate, one could persuade oneself that one's favored candidate would win. Another interpretation, held by many pundits, was that the race was too close to call.
Neither of the above approaches makes use of current statistical thinking, which advocates averaging poll results to get a more precise estimate of the true percentage of the voting population favoring a particular candidate. A single poll has uncertainty due to two things:
1) The number of people polled
2) The methodology used to sample
Generally, the variability of an average of many measurements is smaller than the variability of individual measurements themselves. Also, averaging will tend to reduce the effect that different sampling approaches have on the results of individual polls.
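To illustrate the first point, here is a minimal Python sketch that simulates polls of the same population and compares the spread of a single poll with the spread of an average of several polls. The true vote share, the sample size and the number of polls per average are made-up values chosen only for illustration, and the sketch models sampling error alone; it does not attempt to model differences in polling methodology.

```python
import math
import random
import statistics


def poll_standard_error(share, sample_size):
    """Textbook standard error of a proportion from one simple random sample."""
    return math.sqrt(share * (1 - share) / sample_size)


def simulate_poll(true_share, sample_size, rng):
    """Return the observed share in one simulated poll."""
    hits = sum(rng.random() < true_share for _ in range(sample_size))
    return hits / sample_size


def spread_single_vs_average(true_share=0.51, sample_size=800,
                             polls_per_average=10, trials=300, seed=1):
    """Compare the standard deviation of single polls and of poll averages.

    All default values are hypothetical, not taken from any real poll.
    """
    rng = random.Random(seed)
    singles, averages = [], []
    for _ in range(trials):
        polls = [simulate_poll(true_share, sample_size, rng)
                 for _ in range(polls_per_average)]
        singles.append(polls[0])                # one poll on its own
        averages.append(statistics.mean(polls))  # average of all polls
    return statistics.stdev(singles), statistics.stdev(averages)


if __name__ == "__main__":
    single_sd, average_sd = spread_single_vs_average()
    print(f"theoretical SE of one poll:  {poll_standard_error(0.51, 800):.4f}")
    print(f"simulated spread, one poll:  {single_sd:.4f}")
    print(f"simulated spread, average:   {average_sd:.4f}")
```

With independent polls of equal size, the spread of the average should shrink by roughly the square root of the number of polls averaged.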
By using poll averaging, Nate Silver of the FiveThirtyEight blog at The New York Times correctly predicted the outcome in all 10 so-called "battleground states." Of the 10, only North Carolina went to Romney, just as Silver predicted. One could argue that Silver called Florida a toss-up, but in his final map, Silver colored Florida a light blue (indicating a slight preference for Obama) rather than yellow, which would have indicated a toss-up.
I was interested in Monte Carlo simulations of the election using the poll-averaged probabilities in each state. Both Silver and Sam Wang at Princeton University produced graphs of such simulations. Here is Wang's graph at the Princeton Election Consortium website.
A qualitatively similar plot from Silver's blog, referenced above, is shown below.
I wrote a script in JMP to reproduce this graph. An image of the script appears below.
The script generates the plot below using the Graph Builder in JMP.
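For readers without JMP, the same kind of simulation can be sketched in Python. This is not a translation of the JMP script, only a minimal sketch under stated assumptions: the per-state win probabilities are hypothetical placeholders (not Silver's, Wang's or mine), each state is treated as an independent coin flip, and Obama's safe total is taken as whatever remains of the 538 electoral votes after Romney's 191 and the 10 battleground states. Only the electoral vote counts per state are actual 2012 values.

```python
import random
from collections import Counter

# Electoral votes of the 10 battleground states in 2012.
BATTLEGROUND_VOTES = {
    "Colorado": 9, "Florida": 29, "Iowa": 6, "Nevada": 6,
    "New Hampshire": 4, "North Carolina": 15, "Ohio": 18,
    "Pennsylvania": 20, "Virginia": 13, "Wisconsin": 10,
}

TOTAL_VOTES = 538
ROMNEY_SAFE = 191
# Safe Obama votes, derived as the remainder of the electoral college.
OBAMA_SAFE = TOTAL_VOTES - ROMNEY_SAFE - sum(BATTLEGROUND_VOTES.values())

# HYPOTHETICAL probabilities that Obama wins each state. These are
# placeholders for illustration, not real poll-averaged estimates.
ASSUMED_OBAMA_WIN_PROB = {
    "Colorado": 0.80, "Florida": 0.50, "Iowa": 0.85, "Nevada": 0.90,
    "New Hampshire": 0.85, "North Carolina": 0.25, "Ohio": 0.90,
    "Pennsylvania": 0.95, "Virginia": 0.80, "Wisconsin": 0.95,
}


def simulate_elections(n_runs=100_000, seed=2012):
    """Return {Obama electoral votes: fraction of simulated elections}."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        total = OBAMA_SAFE
        for state, votes in BATTLEGROUND_VOTES.items():
            # Independent draw per state (a simplifying assumption).
            if rng.random() < ASSUMED_OBAMA_WIN_PROB[state]:
                total += votes
        counts[total] += 1
    return {ev: n / n_runs for ev, n in sorted(counts.items())}


def probability_of_win(distribution, needed=270):
    """Fraction of simulated elections with at least `needed` votes."""
    return sum(p for ev, p in distribution.items() if ev >= needed)


if __name__ == "__main__":
    dist = simulate_elections()
    for ev, p in sorted(dist.items(), key=lambda kv: -kv[1])[:5]:
        print(f"{ev} electoral votes: {p:.1%}")
    print(f"P(Obama >= 270): {probability_of_win(dist):.1%}")
```

Plotting the returned distribution as vertical lines against the electoral vote count gives a graph of the same general shape; the heights of the lines depend entirely on the probabilities assumed for each state.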
Note that while the three graphs exhibit qualitative similarities, the y-axis values are different. This is due to different assumptions being made about how to model the uncertainty in the probability estimate for each state.
The actual electoral vote count for Obama was 332, which corresponds to the second-highest line in my plot, at a probability of just over 13 percent. The highest line (nearly 14 percent probability), at 303 electoral votes, corresponds to Obama winning all the same states except Florida, which has 29 electoral votes. Note the line at 347. That would have been the result had my home state of North Carolina also voted for Obama.
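The three totals mentioned above can be checked with a few lines of Python. This is only an arithmetic sketch: the function name `obama_total` is my own, and Obama's safe total is assumed to be the remainder of the 538 electoral votes after Romney's 191 and the 10 battleground states.

```python
# Electoral votes of the 10 battleground states in 2012.
BATTLEGROUND_VOTES = {
    "Colorado": 9, "Florida": 29, "Iowa": 6, "Nevada": 6,
    "New Hampshire": 4, "North Carolina": 15, "Ohio": 18,
    "Pennsylvania": 20, "Virginia": 13, "Wisconsin": 10,
}

# Assumed safe Obama total: 538 minus Romney's 191 minus the battlegrounds.
OBAMA_SAFE = 538 - 191 - sum(BATTLEGROUND_VOTES.values())


def obama_total(states_won):
    """Obama's electoral votes given the battleground states he wins."""
    return OBAMA_SAFE + sum(BATTLEGROUND_VOTES[s] for s in states_won)


all_states = set(BATTLEGROUND_VOTES)
actual = all_states - {"North Carolina"}

print(obama_total(actual))                # 332, the actual result
print(obama_total(actual - {"Florida"}))  # 303, same states without Florida
print(obama_total(all_states))            # 347, with North Carolina as well
```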
The number of simulated results below the black vertical reference line at 270 is very small. This suggests that the election was not all that close and certainly not too close to call.