##### Prediction formula null values

Feb 7, 2018 11:27 AM
(1523 views)

Hello,

After saving my prediction formula from GenReg, I'm seeing about 80 percent of my dataset has null values (.) in the predicted values column. I suspect it has something to do with several "levels removed" in some of my significant predictors. Any advice on how to handle this or how to pivot from this in order to arrive at a higher yield (of predicted values) would be appreciated. Thank you :)

- Tags:
- regression

4 REPLIES

Bump to see if anyone can respond. Thanks!

@samalar,

Can you share a reproducible example, so people can step through your workflow?

You don't have to share any confidential data - you can either anonymize your data or use the sample data sets in JMP if possible.

Best

Uday

Feb 20, 2018 5:12 AM
(1409 views)
| Posted in reply to message from uday_guntupalli 02/20/2018 07:44 AM

Thanks for taking a look. GenReg is modeling 6 predictors to estimate Total_time. Under Effect Tests (see screenshot below), var1 is highly significant but has 4 levels removed; count1 is fine because it is a continuous variable. Var2, Var3, and Var4 each have several levels removed. The screenshot of the prediction formula for Total_time shows that about 90% of the calculated values are null. I understand that I will have to reconfigure the categorical variables. Can you help explain how to handle "levels removed"? Is this why the model applies to only 10% of the data?

A categorical factor with *k* levels will require *k-1* parameter estimates. You have many levels for each of your categorical variables. That translates into a model with many parameters to estimate. You don't have enough data to estimate your model. There are only 98 observations in the training set and 43 in the validation set. There is no way to validate your model since you don't have observations with each level of all of those categories. JMP does its best to provide a fit to the data, but you need more data. You should rethink what model you wish to fit and the format of your data.
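
To make the *k-1* arithmetic concrete, here is a rough Python sketch (not JSL, and not what JMP does internally) that counts main-effect parameters and compares the total with the 98 training observations. The level counts in `levels` are hypothetical, since the thread does not state them, and `count_parameters` is just an illustrative helper name.

```python
def count_parameters(levels_per_factor, n_continuous=0, intercept=True):
    """Count main-effect parameters: k-1 per categorical factor,
    one per continuous predictor, plus an optional intercept."""
    categorical = sum(k - 1 for k in levels_per_factor.values())
    return categorical + n_continuous + (1 if intercept else 0)


# Hypothetical level counts: the thread does not give the real ones.
levels = {"var1": 12, "Var2": 25, "Var3": 30, "Var4": 18}

n_train = 98  # training-set size quoted in the reply above
p = count_parameters(levels, n_continuous=1)  # count1 is the continuous predictor

print(f"parameters to estimate: {p}")
print(f"training observations:  {n_train}")
print(f"observations per parameter: {n_train / p:.2f}")
```

With level counts anywhere near these, the model would need almost as many parameters as there are training rows, which is why collapsing rare levels or collecting more data comes before anything else.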

Dan Obermiller