
Nominal Logistic Regression question re: the Save Probability Formula

I am trying to develop a model to predict whether a tree will live or die over a 25-year period, given some environmental attributes associated with each tree. I ran the Nominal Logistic Regression model, which showed Prob>ChiSq < 0.0001 for each attribute, which would seem to suggest that the environmental attributes could be good predictors. Wanting to see how good the predictions were, I selected the Save Probability Formula option. This added several columns to the table, including a final column (Most Likely) that gives a prediction of Live or Dead. The predictions were terrible. The model predicted that a total of 140 trees would be alive after 25 years, when in fact the number was about 1000. Not only that, but many of the 140 trees predicted to be alive had in fact died. So, my question is: why were the predictions not even close to the actual Live/Dead data?
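One possible contributor, offered as an assumption rather than a diagnosis of this data set: a Most Likely column labels each row with whichever outcome has the higher fitted probability, so a tree is only called Live when its fitted probability of Live exceeds 0.5. If most trees have a fitted probability of Live below 0.5, very few are labeled Live even though the probabilities themselves can add up to a much larger expected count. The Python sketch below illustrates the gap with made-up probabilities; the function names and the numbers are hypothetical and do not come from the poster's table or from JMP.

```python
def most_likely(p_live):
    """Label a tree with the outcome that has the higher fitted probability."""
    return "Live" if p_live > 0.5 else "Dead"


def summarize(p_live_values):
    """Return (count labeled Live, expected number alive) for fitted probabilities."""
    labels = [most_likely(p) for p in p_live_values]
    predicted_live = labels.count("Live")
    # The expected number of live trees is the sum of the fitted probabilities.
    expected_live = sum(p_live_values)
    return predicted_live, expected_live


# Hypothetical fitted probabilities of "Live" for ten trees (illustration only).
probs = [0.30] * 8 + [0.60] * 2

predicted_live, expected_live = summarize(probs)
print("Labeled Live by the most-likely rule:", predicted_live)
print("Expected number alive (sum of probabilities):", round(expected_live, 1))
```

With these made-up values only two trees are labeled Live, while the probabilities sum to about 3.6. Comparing the sum of the saved probability column against the actual Live count is one way to check whether the hard Live/Dead labels, rather than the fitted probabilities, account for the shortfall.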

12 REPLIES

Re: Nominal Logistic Regression question re: the Save Probability Formula

Actually, you CAN plot more than one group with Life Distribution.

 

[Attached image: survival.JPG]

MarkAD
Level III

Re: Nominal Logistic Regression question re: the Save Probability Formula

Yes, I did know this, but it's not the typical stepwise depiction of survival curves. Also, this option provides the mean survival time for each category. Thanks for all your attention and advice!

 

[Attached image: MarkAD_0-1612545223080.png]

 

statman
Super User

Re: Nominal Logistic Regression question re: the Save Probability Formula

I'm a bit confused. Where did you get the data to create the model? The reason most models don't actually predict well is that the data used to create the model were NOT REPRESENTATIVE of the future conditions. This is why planning the data collection is far more important than the data analysis. I will say, though, that this is a difficult situation. Many things can change in 25 years that could not be, or were not, anticipated.

"All models are wrong, some are useful" G.E.P. Box