During my last trip home from a JMP customer visit, my flight was delayed getting out of Raleigh, North Carolina, due to mechanical problems. This caused me to miss my connection in Philadelphia, and I ended up with a six-hour layover awaiting the next flight home to Rochester, New York, on a Friday.
As I sat patiently in the US Air lounge, I couldn’t help but notice the headline in a newspaper that read, “Airline on-time performance improves in April.” In my usual data-driven fashion, I decided to see if I could get some historical data to check out this claim. Sure enough, after some searching I found data at the Bureau of Transportation Statistics website that allowed me to pursue an analysis. After downloading the on-time airline data from 1995 through 2010 and performing the usual data cleansing, I chose to examine a control chart of the performance:
Sure enough, the Control Chart of the data shows some interesting patterns. At first glance, one notices that the low outliers occur frequently in December, which is not surprising because of the holidays and the potential for bad winter weather. In addition, it also shows that after 9/11/01 there was a significant mean shift upward in performance that lasted through 2003.
The performance then steadily deteriorated downward from 2003 through 2008, and it does appear that in late 2008 through April 2010 there may have been a performance improvement. The numbers displayed in the Control Chart show the alarms that indicate an out-of-control situation. Of course, I believe all travelers would love to see out-of-control alarms above the 3 sigma UCL (Upper Control Limit) as there was in November of 2009.
Well, I didn’t stop there. I also decided that an analysis of the on-time performance by month might be an interesting exercise. So I added a categorical column for month and ran a Fit Y by X where Y was the on-time performance and X was month, and I observed the following analysis:

I am not surprised to see that the months of December and January were not stellar-performing months, and I do clearly see the outlier in September 2001. But I do find it curious that the month of June fell in the same category as December and January, and it made me think that perhaps the summer months may have lower staffing due to vacations.
An additional piece of data that might be useful is the number of scheduled flights; if there were fewer flights scheduled, perhaps the on-time performance would be improved. A Control Chart of that data is below:
Could it be that the on-time performance is somehow related to the scheduled number of flights and that the most recent improvement may be due to that factor? I decided to model the on-time performance versus three factors: year, month and number of flights. I obtained the following output in the Prediction Profiler. It appears that the number of scheduled flights also plays into the on-time performance metric:

One final note: All of this analysis assumes that the flight time from destination to destination, which would be the specification to which on-time performance would be measured, was constant throughout this period. Perhaps that is the next piece of data that would be required to help crystallize this claim. I’ve uploaded the data set to the JMP File Exchange for all you fliers out there in case you would like to play with it. Let me know if you make any interesting discoveries.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.