Sep 3, 2016 11:00 AM
| Last Modified: Dec 6, 2016 9:57 AM
The Summer Games are over, and here's one thing that surprised me. I had assumed that since Rio is in the southern hemisphere, where it’s currently winter, the Games would be shifted a couple months, as they were for Sydney. I’ve since learned that Rio is very pleasant in the winter with highs typically in the 70s Fahrenheit, so no delay was necessary.
How did the Rio opening date compare to past Games? I set about to collect Summer Games dates and discovered a few interesting things in the process.
Getting the Data
I found a few sources of date information, and naturally they didn’t always agree with each other. One source of discrepancy is that there is more than one way to define the “opening” of the Games: The competitions sometimes start before the actual opening ceremony. In fact, this year soccer matches started two days before the opening ceremony.
I decided to collect dates for the opening and closing ceremonies, and started with dates from Wikipedia pages. The information there follows a regular pattern, and I was able to scan the dates with JSL (JMP Scripting Language) using regular expressions to tease out the date from inside the "td" tags:
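// For each row (one per Games year), fetch that year's Wikipedia page and pull out the ceremony dates.
For Each Row(
	url = "http://en.wikipedia.org/wiki/" || Char( :year ) || "_Summer_Olympics";
	text = Load Text File( url );
	// Regex Match returns a list: item 1 is the whole match, item 2 is the captured date text.
	// Try() falls back to an empty string when the pattern is not found.
	:Opening Date = Try( Regex Match( text, "Opening ceremony<.+?<td>(.+?)</td>" )[2], "" );
	:Closing Date = Try( Regex Match( text, "Closing ceremony<.+?<td>(.+?)</td>" )[2], "" );
	// Pause between requests to go easy on the server.
	Wait( 5 );
);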
That worked well, but there was one glitch: The pages didn’t agree on the date formats, as you can see from this snippet of the imported table:
Not terrible, but it did complicate my parsing a little. I thought about editing the Wikipedia pages myself to standardize the dates, but I got only as far as checking the Wikipedia style guide, where it turns out both formats are perfectly acceptable.
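For readers who want to try this kind of cleanup outside of JMP, here is a minimal sketch in Python (not the JSL I actually used). It assumes the two formats in play are month-first ("August 5, 2016") and day-first ("5 August 2016"), which are the two styles the Wikipedia style guide accepts; the function name and format list are mine, for illustration only.

```python
from datetime import date, datetime
from typing import Optional

# Assumed formats: month-first and day-first, with full English month names.
FORMATS = ("%B %d, %Y", "%d %B %Y")


def parse_ceremony_date(text: str) -> Optional[date]:
    """Try each assumed format in turn; return None if none of them fits."""
    cleaned = " ".join(text.split())  # collapse stray whitespace from the scrape
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # this format did not match, so try the next one
    return None


if __name__ == "__main__":
    for raw in ("August 5, 2016", "5 August 2016", ""):
        print(repr(raw), "->", parse_ceremony_date(raw))
```

Returning None for anything unrecognized, rather than raising an error, makes it easy to list the rows that still need a manual look.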
The next issue was that some of the early Summer Games durations were suspiciously long. The Wikipedia page for Paris 1924, for instance, has the opening ceremony date as May 4 and the closing on July 27. For those, I dug a little deeper. Fortunately, many of the official reports have been scanned and made available online. In the report for Paris 1924, I found this wonderful table of events.
If you look closely, you can see the cérémonie d’ouverture on July 5, along with competitions happening well in advance of that, including art competitions in March and April.
Pulling It Together
Back to my data quest ... I stuck with the opening ceremony dates to show the timing of the core of the Summer Games, even though they don't capture every event. Research led to a few other refinements, and I excluded the first three modern Games (Athens, Paris, and St. Louis) since they did not have a similar structure. Some sources say the London 1908 Games were the first truly modern Summer Games in structure.
Here is a chart of the durations of each Summer Games since 1908, based on the opening and closing ceremony dates.
The Rio dates are indeed typical (only a few days off from the median, in fact). We can see expectedly shifted dates for other southern hemisphere hosts (Sydney 2000 and Melbourne 1956), but I don’t know what accounts for the late starts for Tokyo 1964 and Mexico City 1968.
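The comparison itself is simple arithmetic. As a rough sketch in Python (again, not my JSL), here it is on just the five most recent Games rather than the full table back to 1908, so the median it produces is for this small sample only and is not the median in my chart. The helper names are hypothetical.

```python
from datetime import date
from statistics import median

# Opening and closing ceremony dates for the five most recent Summer Games.
GAMES = {
    "Sydney 2000": (date(2000, 9, 15), date(2000, 10, 1)),
    "Athens 2004": (date(2004, 8, 13), date(2004, 8, 29)),
    "Beijing 2008": (date(2008, 8, 8), date(2008, 8, 24)),
    "London 2012": (date(2012, 7, 27), date(2012, 8, 12)),
    "Rio 2016": (date(2016, 8, 5), date(2016, 8, 21)),
}


def day_of_year(d: date) -> int:
    """Position of the date within its year (January 1 is day 1)."""
    return d.timetuple().tm_yday


def duration_days(opening: date, closing: date) -> int:
    """Length of the Games in days, counting both ceremony days."""
    return (closing - opening).days + 1


opening_days = {name: day_of_year(o) for name, (o, _) in GAMES.items()}
median_opening = median(opening_days.values())
rio_offset = opening_days["Rio 2016"] - median_opening

if __name__ == "__main__":
    for name, (o, c) in GAMES.items():
        print(name, "opens on day", opening_days[name], "and runs", duration_days(o, c), "days")
    print("Rio offset from the sample median, in days:", rio_offset)
```

Using day of year puts every opening date on a common axis, which is what makes a shifted start like Sydney's stand out.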
The earlier Summer Games didn’t call out the opening and closing ceremonies as distinctly as we do today, so sometimes it took a little digging to identify the final “soirée” or “farewell banquet.” Along the way, I found a few gems. The official reports sometimes ran 1,000 pages and contained many photographs. The Stockholm 1912 report was particularly complete, including details on the stadium showers, turf construction and a photo of the royal box.
There is even a table showing the number of mailings sent by the Swedish Olympic Committee by month.
I just had to take a look at the data on a graph:
While you can see the increase in activity leading up to the actual Summer Games by looking at the numbers in the table, the pattern is much clearer in the chart: a steady build-up and then an explosion for the first six months of the year. It’s interesting that the drop-off precedes the Games by a month or so. Is that a reflection of slower international mail delivery in 1912?
The Antwerp 1920 Summer Games went bankrupt and didn’t produce an official report, but one was compiled decades later. This table has an outlier that has to be an error:
Gymnasts numbering 1,648 would give Belgium as many competitors as all the other nations combined. And elsewhere I saw that the maximum size for the gymnastics team was 60 athletes. Still, if it’s an error, it’s hard to believe it wasn’t questioned when making the totals in the bottom row. Any data sleuths out there to track down the true number?
I love this photo of a walking race from Stockholm 1912 and had to include it:
Notice the judge checking the form of the walkers! And I happened upon this related comment in the London 1908 report:
I hope you enjoyed the Summer Games as much as I did, and I'm still hoping for Ultimate Frisbee to make it into the competitions someday.