Feb 20, 2019 1:59 PM
| Last Modified: Feb 22, 2019 6:56 AM
“I agreed that what really matters is what you like, not what you are like…. Books, records, films – these things matter.” – Rob Gordon, John Cusack’s character in "High Fidelity"
I wouldn’t go so far as that, but I have always been a huge fan of movies (and I’d have to put "High Fidelity" in my top five). In the spring of 1996, as my friends and I were finishing up middle school, we went to see "Mission: Impossible" and were obsessed, seeing it multiple times. That summer, "Independence Day" was released, and my best friend’s dad took us to see it on opening day. We had a theater close enough to our house that we could easily be dropped off by our parents, or ride our bikes if it wasn’t too hot. Tickets were cheap, and I began a habit of saving the stub from every movie I went to see, a practice I would continue for the next 20 or so years.
In many ways, movies can catalog the history of your life, just as your books, music collection, concert tickets, or comics might. There is a story there of what your interests were at the time, who you saw them with, and what they meant to you. I thought it would be an interesting trip down memory lane to gather up these tickets, enter them into JMP, and explore the data. Plus, with paper tickets becoming more rare, it seemed a good time for this project.
I collected my movie ticket stubs for 23 years. I decided to enter the data from them into JMP.
It took a while just to enter the data from the ticket stubs. Only 10 years ago, they were printed on small pieces of regular paper (not the glossy cardstock used today), and so sometimes only part of the first word of the title was printed. When I couldn’t figure out a title, a quick Google search on the viewing date and that partial word usually turned up the movie.
Some of the older tickets didn’t show the rating, but I found this info online as well. I thought I’d add the price; we all know prices go up over time due to inflation, but it might be worth quantifying. And while I was looking up movies, I might as well record their release dates and add a column to calculate how many days after release I saw each one – a movie that I would fight the crowds to see opening night had to be important to me, more so as I got older and had less free time.
In some cases, including the release date was a little tricky. Some movies I saw were not in their original run – for example, for a while I watched martial arts movies, so I used the US release date not the original Chinese release date.
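The days-after-release column is simple date subtraction. A minimal sketch in Python rather than JMP, assuming each ticket is stored with a release date and a viewing date; the function name, field names, titles, and dates here are hypothetical and not taken from the actual data set:

```python
from datetime import date


def days_after_release(viewing_date: date, release_date: date) -> int:
    """Whole days between a movie's release and the day it was seen."""
    return (viewing_date - release_date).days


# Hypothetical rows for illustration only, not real ticket stubs.
tickets = [
    {"title": "Example Movie A", "release": date(2001, 5, 18), "viewed": date(2001, 5, 18)},
    {"title": "Example Movie B", "release": date(2001, 6, 1), "viewed": date(2001, 6, 15)},
]

# Add the derived column to every row.
for ticket in tickets:
    ticket["days_after_release"] = days_after_release(ticket["viewed"], ticket["release"])
```

An opening-day viewing comes out as 0. For a second-run or re-released movie, the result depends entirely on which release date is entered, which is why the choice between the US date and the original date matters.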
Genre can also be a little difficult. Is "Avengers" an action movie, or do superhero movies deserve to be their own genre? Do you lump all comedies together, or break out romantic comedies? Once you split out a sub-genre, are you unfairly skewing the statistics? I tended to see these as significant subgenres, but when multiple genres were listed in my online searches, I just went with the first one listed. I think the collection is pretty accurate, although I know there are gaps. I distinctly remember seeing the first "Iron Man" movie in theaters with my dad, but I have no entry for that.
A fairly good condition old ticket stub, the earliest in my data, for "Mission: Impossible."
The first thing I wanted to see was a history of the movies I saw. I used a Transform column to plot the viewing date as a month/year and added some reference lines and ranges for some major events in my life. I saw roughly 40 movies over the four years I was in high school (the green region), 72 over the four years I was in college (the red region), and one movie during the two years I was in grad school. Maybe surprising, but not when you consider the two reference lines in 2007 and 2010 – these are when my sons were born. I didn't have a lot of time for movies when I was in school and had a newborn. Certainly the data shows that over time I watched fewer movies in the theater, but what else can we tell?
Repeat viewings are significant. Many people rarely go out to the movies, let alone pay to see the same movie more than once. The data shows that I have actually seen 20 movies multiple times, the most being "X-Files: Fight the Future" (my favorite TV show in high school and into college) and "Avengers: Infinity War" (the culmination of 10 years of the Marvel Cinematic Universe) – I saw both three times.
Sometimes I went outside my normal types of movies to take my mom to see something (she tends to like historical, Oscar-worthy movies like "Seabiscuit," "Titanic," "War Horse"), or to take out a date ("10 Things I Hate About You," "Moulin Rouge," "Varsity Blues"). Occasionally, I saw things out of boredom in college, something random like "Bulletproof Monk," which incidentally has the dubious distinction of being the worst movie I paid to see in theaters (if I ignore "Spider-Man 3" for the abomination that it is).
I am a huge comic book fan, so it is not surprising that many of my repeat viewings are superhero movies. I tend to see them first with my friends, and then again with my kids if they are appropriate. With the interactive content published on JMP Public shown above, I can filter the content based on the type of discount. So if you select the "Missing" category (meaning that no discount of any kind was used), you can see all the movies I paid full price for. These account for many of the red cells and several purple cells, indicating that I saw a movie once on the release date, and then went back later to see it again at a matinee with the kids or at Blue Ridge Cinema, which for a while showed second-run movies for $1.50. Given that, I would expect that on average I would see my favorite genres a shorter time after release than others. Below I've plotted the mean days after release by genre, both as bars and as box plots with a data filter by movie title.
But wait, the data shows there are several genres that I get to the theater more quickly to see than superhero or action movies. Musicals, for example, show a value of 13 days after release, but there was only a single movie in that genre. Romantic comedies show a mean of roughly 24 days, compared with about 28 for adventure and 26 for superhero movies. That can't be right! For superhero movies, there are several outliers. I saw "Avengers: Age of Ultron" twice; the second time, with my kids, was 126 days after it was released (despite seeing it opening day the first time), and it was a similar story for "Avengers: Infinity War." I wasn't that familiar with the characters in the other outlying movies, so I waited a while to see them; those are true indicators of how long I waited to see the movie. In the adventure genre, I saw "Star Wars Episode II" and "Lord of the Rings: The Two Towers" twice (the second viewings were 176 and 80 days after their initial release, respectively), throwing off the mean. You can hold down the Shift key, select these four points and, at the top menu button in JMP Public, choose "Exclude and Hide Selected Rows." These values are excluded from the analysis, and the report is regenerated. Now the adventure genre shows a mean of 14.6 days and superhero shows 21.6 days, compared with 24.3 for romantic comedies. The box plots show that the medians are two days for adventure, nine for superhero movies and 16 for romantic comedies, so it is important to consider how much the viewing times vary.
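To see why a handful of late repeat viewings can flip the ranking of genre means while barely moving the medians, here is a minimal sketch in Python. It is not how JMP computes its report; `summarize_by_genre`, the row layout, and all of the numbers except the 126-day repeat viewing mentioned above are hypothetical and chosen only for illustration:

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_genre(rows, excluded=frozenset()):
    """Mean and median days after release per genre, skipping excluded row indexes."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        if index in excluded:
            continue  # comparable in spirit to excluding and hiding a row
        groups[row["genre"]].append(row["days"])
    return {
        genre: {"n": len(days), "mean": mean(days), "median": median(days)}
        for genre, days in groups.items()
    }


# Hypothetical viewings; only the 126-day value echoes the text above.
rows = [
    {"genre": "Superhero", "days": 0},
    {"genre": "Superhero", "days": 2},
    {"genre": "Superhero", "days": 4},
    {"genre": "Superhero", "days": 126},  # late repeat viewing with the kids
    {"genre": "Romantic Comedy", "days": 10},
    {"genre": "Romantic Comedy", "days": 20},
    {"genre": "Romantic Comedy", "days": 30},
]

before = summarize_by_genre(rows)
after = summarize_by_genre(rows, excluded={3})  # drop the repeat viewing
```

In this toy data, the superhero mean is 33 days with the repeat viewing and 2 days without it, while the median only moves from 3 to 2. The romantic comedy figures stay at 20 either way, so the single outlier alone decides which genre appears to be seen sooner on average.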
As I came to the end of entering my data, my ticket stubs began to dwindle. I still go to the movies fairly often, but more and more tickets are purchased online, either through a third party that emails you a ticket, or through apps provided by the theater chains, where you can purchase tickets and pick your seats, and the theater just scans a QR code. I went back through my email and these apps and entered as many tickets as I could find (in the data set, there is a column for whether the ticket is digital or not). But sometimes I go to the movies with a group of friends, and someone else buys the tickets. There is no longer a ticket stub I can ask for after I pay them back. My ticket collection is at an end, and so I am glad to be able to capture it in this way.
Go over to JMP Public and try out the local data filters. You can explore the data to draw your own conclusions. Please take a look at the plots I've shared, and let me know what you think.