Suppose I fit the model using only an indicator variable for whether DEBTINC is missing. The RSquare in this one-regressor model is better (0.25), and the fit uses all 5,960 rows, regardless of whether DEBTINC is missing. So missingness can be predictive and informative, and it can even outperform regular regressors.
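To make the one-regressor idea concrete, here is a minimal Python sketch rather than JMP itself. The toy values, the response name `y`, and the helper names `missing_indicator` and `simple_ols` are all made up for illustration; they are not the data or results from the example above. The sketch builds a 0/1 "is missing" column and fits a simple least-squares line to it, so every row takes part in the fit.

```python
from math import isnan


def missing_indicator(values):
    """Return 1 where a value is missing (None or NaN), else 0."""
    return [
        1 if v is None or (isinstance(v, float) and isnan(v)) else 0
        for v in values
    ]


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, r_square)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - sse / sst


# Hypothetical toy data: None marks a missing DEBTINC value.
debtinc = [35.0, None, 41.2, None, 28.9, 33.1, None, 39.4]
y = [0, 1, 0, 1, 0, 0, 0, 1]  # hypothetical 0/1 response

is_missing = missing_indicator(debtinc)
b0, b1, r_square = simple_ols(is_missing, y)
n_used = len(is_missing)  # no row is dropped, missing or not
```

With a 0/1 regressor, `b0` is the mean response where DEBTINC is present and `b0 + b1` is the mean response where it is missing, so the model compares the two groups directly.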
The new idea is not only to create the missing-value indicator variables but also to keep the original regressors useful, without discarding rows that have missing data.
In JMP Pro 11, the Fit Model launch window includes a new red triangle menu item called “Informative Missing.” Although the feature is only in JMP Pro, you can, with some effort, achieve the same goal by adding formula columns to the data table, as described in this article.
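The formula-column approach can be sketched outside JMP as well. The Python below is an illustration under a stated assumption: that a continuous regressor is coded by replacing each missing value with the mean of the nonmissing values and pairing it with a 0/1 missing indicator. Whether this matches JMP Pro's internal coding in every detail is not confirmed here, and the function name `informative_missing_columns` and the toy values are hypothetical.

```python
def informative_missing_columns(values):
    """Return (filled, indicator) columns for one continuous regressor.

    Assumed coding: a missing value (None) is replaced by the mean of the
    nonmissing values, and the indicator is 1 exactly where the original
    value was missing.
    """
    observed = [v for v in values if v is not None]
    mean_observed = sum(observed) / len(observed)
    filled = [mean_observed if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


# Hypothetical toy column: None marks a missing value.
debtinc = [30.0, None, 40.0, None, 50.0]
debtinc_filled, debtinc_is_missing = informative_missing_columns(debtinc)
```

Both derived columns would then enter the model in place of the original column. Because the indicator absorbs the effect of missingness, the particular fill value matters less than it would in plain mean imputation, and no rows are excluded.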
|Model|Number of Observations Used|
|Old full model with row-wise missing exclusion||
|One-variable model with DEBTINC is Missing|5,960|
|New full model with Informative Missing|5,960|