NOTE: Presentation Slides and JMP Data Table are attached at the bottom of this page.
Seven years' worth of Federal Aviation Administration (FAA) Air Route Traffic Control Center (ARTCC) data (FY06-FY12) were downloaded from the FAA Air Traffic Activity System (ATADS) website. One goal was to develop models that predict the amount of air traffic at any of the 21 ARTCCs on any given day of the year, barring the impact of severe weather.
The data were split into three random subsets. The largest portion, 60%, was used to train the prediction models. Another 20% was used to validate the models; in effect, it was used to "tune" the model parameters. The remaining 20% played no part in model development and was used to test how accurately the various models predicted.
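The 60/20/20 split described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in JMP: the function name `split_indices`, the seed, and the row count (21 centers x 7 years x 365 days, ignoring leap days and missing records) are hypothetical.

```python
import random

def split_indices(n_rows, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition row indices 0..n_rows-1 into train/validation/test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    n_train = round(fractions[0] * n_rows)
    n_valid = round(fractions[1] * n_rows)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]     # whatever is left over
    return train, valid, test

# Assumed row count: 21 centers x 7 years x 365 days (leap days ignored).
n_rows = 21 * 7 * 365
train, valid, test = split_indices(n_rows)
print(len(train), len(valid), len(test))
```

Under this assumed row count, the test subset comes out just under 11,000 rows, which is in line with the size quoted for the test data set.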
Data were modeled with and without Federal Holidays as a factor. It became obvious that "holiday" travel data were often outliers for models that did not include this factor. Three types of models were used: partition (decision tree), neural net, and second-order polynomial models. Below is a plot of Actual vs. Predicted air traffic totals for Thanksgiving Day in the test subset for a second-order polynomial model. One can see in the plot on the right (slope nearly 0.6) that the actual total number of flights falls well below the predictions of the model that does NOT include Federal Holiday as a factor. In the plot on the left, nearly all of the actual vs. predicted flight totals in the test data set fall on a 45-degree line (slope nearly 1), indicating much higher accuracy.
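To make the slope reading concrete, the sketch below fits a least-squares line of actual totals against predicted totals. The helper name `fit_slope` is hypothetical and the numbers are made up for illustration; they are not taken from the FAA data. A slope near 1 with an intercept near 0 means predictions track the actual values, while a slope near 0.6 means the actual totals run at roughly 60% of what the model predicted.

```python
def fit_slope(predicted, actual):
    """Least-squares slope and intercept of actual regressed on predicted."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, actual))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up values for illustration only; NOT from the FAA data set.
predicted = [2000.0, 4000.0, 6000.0, 8000.0]
actual_on_line = list(predicted)                      # actual equals predicted
actual_overpredicted = [0.6 * p for p in predicted]   # actual is 60% of predicted

slope_good, intercept_good = fit_slope(predicted, actual_on_line)
slope_poor, intercept_poor = fit_slope(predicted, actual_overpredicted)
print(slope_good, slope_poor)
```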
Below is JMP's Prediction Profiler for two models predicting Total air traffic for ZDC (the ARTCC for Washington, DC) on Thanksgiving Day in FY 2010. This day was not used in creating either model; it is part of the 20% (nearly 11,000 data points) in the test data set. The actual total number of flights was 4265. The top model includes Federal Holiday as a factor and predicts 4330 (high by 65, or 1.5%). The bottom model does not include Federal Holiday as a factor and predicts 7167 (high by 2902, or 68%).
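The error figures quoted above follow from simple arithmetic on the actual and predicted totals. A small check, using only the three numbers given in the text (the function name `prediction_error` is a hypothetical helper):

```python
def prediction_error(actual, predicted):
    """Return (absolute error, percent error relative to the actual value)."""
    error = predicted - actual
    return error, 100.0 * error / actual

actual = 4265  # actual ZDC total for Thanksgiving Day, FY 2010, from the text
results = {}
for label, predicted in [("with Federal Holiday", 4330),
                         ("without Federal Holiday", 7167)]:
    error, pct = prediction_error(actual, predicted)
    results[label] = (error, pct)
    print(f"{label}: high by {error} ({pct:.1f}%)")
# with Federal Holiday: high by 65 (1.5%)
# without Federal Holiday: high by 2902 (68.0%)
```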