I am trying to develop a model to predict whether a tree will live or die over a 25-year period, given some environmental attributes associated with each tree. I ran a Nominal Logistic Regression model, which showed that Prob>ChiSq for each attribute was < 0.0001. That would seem to suggest that the environmental attributes are good predictors. To see how good the predictions actually were, I selected the Save Probability Formula option. This added several columns to the table, including a final column (Most Likely) that gives the predicted outcome for each tree, Live or Dead. The predictions were terrible. The model predicted that a total of 140 trees would be alive after 25 years, when the actual number was about 1000. Not only that, but many of the 140 trees predicted to be alive had in fact died. So my question is: why were the predictions not even close to the actual Live/Dead data?
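To make the comparison concrete, here is a minimal Python sketch of the kind of tally I am describing. It assumes that the Most Likely column simply picks whichever outcome has the higher saved probability, which for two outcomes is the same as a 0.5 cutoff; that is my understanding, not something I have confirmed. The function names, the `threshold` parameter, and the numbers are all made up for illustration and are not my data.

```python
from collections import Counter


def most_likely(prob_live, threshold=0.5):
    """Label a tree 'Live' when its predicted probability of living
    reaches the threshold (assumed cutoff: 0.5), otherwise 'Dead'."""
    return "Live" if prob_live >= threshold else "Dead"


def cross_tab(actual, prob_live, threshold=0.5):
    """Count how often each (actual, predicted) pair occurs."""
    predicted = [most_likely(p, threshold) for p in prob_live]
    return Counter(zip(actual, predicted))


# Hypothetical values for illustration only.
actual = ["Live", "Live", "Dead", "Dead", "Dead", "Live"]
prob_live = [0.62, 0.41, 0.55, 0.08, 0.30, 0.47]

table = cross_tab(actual, prob_live)
for (a, p), n in sorted(table.items()):
    print(f"actual={a:4s} predicted={p:4s} count={n}")
```

In my real table, the "actual Live, predicted Dead" cell is very large and the "actual Dead, predicted Live" cell is not small either, which is what I mean by the predictions being far off.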