eldad_galili
Level III

nominal logistic fit

Hi, in a nominal logistic fit I get a significant model, and all parameters are significant by the effect likelihood ratio tests, but none of the parameter estimates is significant. How can that be?


7 REPLIES

Re: nominal logistic fit (Accepted Solution)

The cause is collinearity among the independent variables.

The overall test only says that the model including all the variables fits better than an intercept-only model (just the mean of the response).

The test for significance of the individual parameters is conditioned on all the other parameters in the model.

If there is substantial correlation between two independent variables, both important in explaining the response, then neither may be significant once the other is in the model.

Collinearity can be more complicated than just two variables being correlated. It can happen when one variable is correlated with a linear combination of several other variables, which is common when one has many independent variables.
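To see the mechanism outside JMP, here is a minimal sketch in Python with statsmodels on made-up simulated data (not your data): two nearly identical predictors both drive a binary response, so the overall likelihood ratio test is highly significant, yet the individual Wald tests on the estimates need not be.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 is nearly a copy of x1
p = 1 / (1 + np.exp(-(x1 + x2)))           # both truly affect the response
y = rng.binomial(1, p)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.Logit(y, X).fit(disp=0)

print(fit.llr_pvalue)   # overall likelihood ratio test: highly significant
print(fit.pvalues[1:])  # per-coefficient Wald tests: typically not significant

Because x1 and x2 carry almost the same information, the model cannot tell which one deserves the credit, so each coefficient's standard error is inflated even though the pair together predicts well.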

The general solution is to identify the subsets of variables that are highly correlated with each other and then fit the model using a subset that is not too correlated.

Not being a JMP expert (more of a SAS user), I will leave it to others to give you more specific advice on how to identify the highly correlated subsets.

Some ideas that come to mind: look at pairwise correlations among the predictors (a sketch of this follows below), or use the Partition platform to pick candidate independent variables.

If your goal is prediction, then using recursive partitioning is fine. If your goal is inference, using recursive partitioning to select variables will bias the p-values downward and make any inference suspect.
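For readers working outside JMP, here is a minimal Python sketch of that screening step, using pandas correlations plus statsmodels variance inflation factors on made-up data (the column names x1, x2, x3 are placeholders for your own predictors):

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
df = pd.DataFrame({
    "x1": x1,
    "x2": x1 + rng.normal(scale=0.1, size=100),  # nearly collinear with x1
    "x3": rng.normal(size=100),                  # unrelated to the others
})

print(df.corr())  # pairwise correlations: x1 vs x2 is close to 1

X = sm.add_constant(df)  # VIFs are computed from a model with an intercept
vifs = [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])]
print(dict(zip(df.columns, vifs)))  # VIF above roughly 5-10 is a common warning sign

Pairwise correlations only catch two-variable problems; the VIF also flags the linear-combination case described above.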

eldad_galili
Level III

Re: nominal logistic fit

Thank you very much for the reply. I will check whether there is correlation between my parameters.

susan_walsh1
Staff (Retired)

Re: nominal logistic fit

Are your parameter estimates labeled "unstable"? If so, you might want to take a look at this JMP note: 36686 - When I run nominal or ordinal logistic regression in JMP®, I receive parameter estimates lab...

TCM
Level IV

Follow up --> Re: nominal logistic fit

Just want to clarify the response quoted below, from the note linked above. The original question relates to unstable parameter estimates in nominal logistic regression.

     "This is a common issue. It is caused by some parameters of the model becoming theoretically infinite. This can happen when the model perfectly predicts the response or if there are more parameters in the model than can be estimated by the data (that is, with sparse data, where "sparse" means that there are few or no repeats of each setting of the covariates). One solution is to reduce the number of variables and/or change continuous variables to categorical. There is no way to know which variable to eliminate or categorize because all are involved simultaneously. The resulting model is usually good at classifying observations, but inferences about the parameters should be avoided."

 

Do I take it that even with unstable, biased, or zeroed parameter estimates, the model may still be used, if only for classifying?

Re: Follow up-->Re: nominal logistic fit

These warnings and indicators suggest a poor model. I would not use it to predict outcomes (i.e., classify). I would correct the underlying problems and verify the model assumptions first.

Peter_Bartell
Level VIII

Re: nominal logistic fit

Another potential approach, if multicollinearity among your predictor variables is an issue and you are running JMP Pro: the Generalized Regression platform's Lasso and Elastic Net estimation methods are viable modeling options as well.
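For anyone without JMP Pro, the same idea is available in open-source tools; here is a minimal sketch with scikit-learn's elastic net penalized logistic regression on made-up collinear data:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
X = np.column_stack([x1, x1 + rng.normal(scale=0.05, size=300)])  # collinear pair
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + X[:, 1]))))

Xs = StandardScaler().fit_transform(X)  # penalties assume comparable scales
model = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000
).fit(Xs, y)
print(model.coef_)  # shrunken, more stable coefficient estimates

The penalty shrinks the correlated coefficients toward each other (or toward zero), trading a little bias for much lower variance, which is exactly why regularization helps under multicollinearity.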

eldad_galili
Level III

Re: nominal logistic fit

Thanks, but I am not using the Pro version; it costs too much.