Data Mining for Vaccine Manufacturing
Julia O'Neill, MS, Principal Engineer – Merck & Co., Inc.
Vaccine manufacturing is a complex biological process composed of many steps carried out over months or even years. Hundreds of raw material characteristics and process variables are monitored for every lot. Although the final vaccine product is well characterized and controlled, identifying the root causes of variation in the intermediate bulk material is extremely challenging. Teams of engineers, statisticians and scientists have begun to apply and develop data mining techniques to overcome these challenges. CUSUM sequence plots, partial least squares (PLS) regression and random forests have proven extremely valuable in recent projects. These data mining methods have set a new standard for vaccine root cause investigations within Merck. Their effectiveness will be illustrated with a case study.
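As a rough illustration of the first technique named above, a CUSUM sequence plot tracks the running sum of each lot's deviation from the overall mean, so a sustained shift in the process shows up as a change in slope. The sketch below is a minimal version of that calculation, not the method used in the case study; the function name `cusum` and the lot values are hypothetical and invented purely for illustration.

```python
from itertools import accumulate
from statistics import mean


def cusum(values):
    """Return the cumulative sum of deviations from the overall mean."""
    center = mean(values)
    return list(accumulate(v - center for v in values))


# Hypothetical results for eight consecutive lots (invented data, not from any real process).
lot_results = [10.0, 10.0, 10.0, 10.0, 12.0, 12.0, 12.0, 12.0]

path = cusum(lot_results)

# The CUSUM path falls while lots run below the overall mean and rises once they
# run above it, so the turning point marks the last lot before the shift.
turning_lot = min(range(len(path)), key=lambda i: path[i])

print(path)
print("last lot before the shift (0-based index):", turning_lot)
```

Plotting `path` against lot sequence gives the CUSUM sequence plot. In a root cause investigation, the lot at which the slope changes would then be compared against records of raw material or process changes made around the same time.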