
Corona Virus Risk Analysis: Statistical Analysis Should Be As Simple As Possible, But No Simpler (2020-US-45MP-598)

Level: Intermediate


Roland Jones, Senior Reliability Engineer, Amazon Lab126
Larry George, Engineer who does statistics, Independent Consultant
Charles Chen SAE MBB, Quality Manager, Applied Materials
Mason Chen, Student, Stanford University OHS

Patrick Giuliano, Senior Quality Engineer, Abbott Structural Heart


The novel coronavirus pandemic is undoubtedly the most significant global health challenge of our time. Infection and mortality data from the pandemic are an excellent example of real-world, imperfect data from a system with feedback: as society changes its behavior to limit the outbreak, the system's own parameters change. With a tool as powerful as JMP, it is tempting to throw the data into the tool and let it do the work. However, knowing what is physically happening during the outbreak lets us see which features of the data come from its imperfections, and so avoid the expense and complication of over-analyzing them. Understanding the physical system also lets us select an appropriate data representation, which yields a surprisingly simple way (OLS linear regression in the ‘Fit Y by X’ platform) to predict the spread of the disease with reasonable accuracy. Similarly, we can split the data into phases, and provide context for those phases, by plotting Fitted Quantiles versus Time in Fit Y by X from Nonparametric density plots.

More complex analysis is required to tease out aspects of the outbreak beyond its spread, answering questions like "How long will I live if I get sick?" and "How long will I be sick if I don’t die?". For this analysis, actuarial rate estimates provide the transition probabilities for a Markov chain approximation to SIR models, covering the Susceptible to Removed (quarantine, shelter, etc.), Infected to Death, and Infected to Cured transitions. Survival Function models drive logistics, resource allocation, and age-related demographic changes.

Predicting disease progression is surprisingly simple; answering questions about the nature of the outbreak is considerably more complex. In both cases we make the analysis as simple as possible, but no simpler.
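The Markov chain idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' JMP analysis. The state labels, the added Susceptible to Infected transition, and every probability below are hypothetical placeholders chosen only to make the example run, and holding the probabilities constant from day to day is a simplifying assumption that a real outbreak would violate.

```python
# Discrete-time Markov chain sketch of a compartment model.
# All probabilities are hypothetical placeholders, not estimates from the talk.
STATES = ["S", "I", "R", "D", "C"]  # Susceptible, Infected, Removed, Dead, Cured

# Daily transition probabilities; each row sums to 1.
# The S -> I entry is an assumption added so that anyone can become infected.
P = {
    "S": {"S": 0.97, "I": 0.02, "R": 0.01},
    "I": {"I": 0.90, "D": 0.01, "C": 0.09},
    "R": {"R": 1.0},  # absorbing in this sketch
    "D": {"D": 1.0},  # absorbing
    "C": {"C": 1.0},  # absorbing
}


def step(dist):
    """Advance the population distribution over states by one day."""
    new = {s: 0.0 for s in STATES}
    for src, mass in dist.items():
        for dst, p in P[src].items():
            new[dst] += mass * p
    return new


def simulate(days):
    """Return the daily distributions, starting with everyone susceptible."""
    dist = {s: 0.0 for s in STATES}
    dist["S"] = 1.0
    history = [dist]
    for _ in range(days):
        dist = step(dist)
        history.append(dist)
    return history


def infected_outcomes():
    """Eventual fate of an infected person under constant daily probabilities."""
    p_die = P["I"]["D"]
    p_cure = P["I"]["C"]
    p_leave = p_die + p_cure
    return {
        "p_die": p_die / p_leave,          # chance the illness ends in death
        "p_cure": p_cure / p_leave,        # chance the illness ends in cure
        "mean_days_infected": 1.0 / p_leave,  # mean of a geometric waiting time
    }


if __name__ == "__main__":
    final = simulate(60)[-1]
    print({s: round(v, 4) for s, v in final.items()})
    print(infected_outcomes())
```

With constant daily probabilities, the time spent infected is geometrically distributed, so its mean is the reciprocal of the daily probability of leaving the Infected state. Replacing the placeholder rows of `P` with age-specific actuarial rate estimates would be the natural next step, at the cost of a larger state space.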


Nice work, @Roly! I look forward to watching the presentation video in October! -@PatrickGiuliano