Missing genuine effects is bad but identifying false effects can be worse
Feb 28, 2017 7:36 AM
| Last Modified: Mar 28, 2017 12:54 PM
Robert Anderson JMP Senior Systems Engineer
Scientists and engineers need to find the best possible model for their process or product and to correctly identify which factors are genuinely important and which are not. Often the greatest concern is that an important or vital factor will be missed. However, a more insidious and potentially worse problem is that statistical modelling methods frequently identify factors that are statistically significant but not genuinely active. Valuable scientific and engineering resources are then squandered on further investigation of these false effects, and it may take considerable time before the flawed model is recognized. The holdback validation methodology in JMP Pro provides a powerful way of suppressing this overfitting problem, even with relatively small data sets. Using simulated data sets, this paper will demonstrate how frequently and easily non-genuine effects are detected and how holdback validation can effectively suppress the problem. However, holdback validation is not a magic bullet, and some examples of when it doesn’t work well will also be shown.
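To give a feel for how easily false effects arise, here is a minimal Python sketch. It is not the simulation used in the paper. It assumes 20 runs, 20 candidate factors and a response that is pure noise, so no factor is genuinely active. Each factor is tested on its own at the 5% level, and all function and constant names are hypothetical. Because the tests are independent under these assumptions, the chance of at least one false effect is 1 - 0.95^20, or roughly 0.64.

```python
import math
import random

N_RUNS = 20            # assumed number of experimental runs
T_CRIT_DF18 = 2.101    # two-sided 5% critical value of Student's t, 18 df (N_RUNS - 2)


def pearson_r(x, y):
    """Sample correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_significant(x, y):
    """t-test of the correlation at the 5% level (valid for N_RUNS = 20 only)."""
    r = pearson_r(x, y)
    t = r * math.sqrt((N_RUNS - 2) / (1.0 - r * r))
    return abs(t) > T_CRIT_DF18


def false_effect_rate(n_factors=20, n_sims=1000, seed=1):
    """Fraction of pure-noise data sets with at least one 'significant' factor."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # The response is noise, so every significant factor is a false effect.
        y = [rng.gauss(0, 1) for _ in range(N_RUNS)]
        factors = [[rng.gauss(0, 1) for _ in range(N_RUNS)]
                   for _ in range(n_factors)]
        if any(is_significant(x, y) for x in factors):
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("Share of noise-only data sets with a false effect:",
          false_effect_rate())
```

Calling `false_effect_rate(n_factors=1)` instead should return a value near the nominal 5%, which shows that the inflation comes from screening many candidate factors rather than from the individual test.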