Solved: Re: Follow up-->Re: nominal logistic fit - JMP User Community

Level III

Jun 2, 2016 04:25 AM 13585 views

Hi, in a nominal logistic fit I get a significant model, and all effects are significant by the effect likelihood ratio tests (LRT), but none of the individual parameter estimates is significant. How can that be?

1 ACCEPTED SOLUTION

Level I

Jun 2, 2016 01:34 PM (15308 views) | Posted in reply to message from eldad_galili 06-02-2016

The cause is collinearity among the independent variables.

The overall test says only that the model containing all the variables fits better than a model with no predictors (the analogue of using just the mean of the response).

The test for significance of the individual parameters is conditioned on all the other parameters in the model.

If there is substantial correlation between two independent variables, both important in explaining the response, then neither may appear significant while the other is in the model, because each adds little beyond what the other already explains.

Collinearity can be more complicated than just two variables being correlated. It can happen when one variable is correlated with a linear combination of several other variables, which is common when there are many independent variables.
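As a rough illustration of that last point, here is a minimal Python sketch (outside JMP, with made-up data) in which x3 is nearly the sum of x1 and x2. It assumes the standard result that the variance inflation factor (VIF) of each predictor is the matching diagonal element of the inverse correlation matrix. The variable names and the noise level are hypothetical, chosen only for the demonstration.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment the matrix with the identity: [m | I]
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF of each column: the diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inv = invert(corr)
    return [inv[j][j] for j in range(len(columns))]

# Made-up predictors: x3 is almost exactly x1 + x2 (hypothetical noise level 0.1).
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(500)]
x2 = [rng.gauss(0, 1) for _ in range(500)]
x3 = [a + b + rng.gauss(0, 0.1) for a, b in zip(x1, x2)]

print("pairwise r(x1, x3):", round(pearson(x1, x3), 2))
print("VIFs:", [round(v, 1) for v in vifs([x1, x2, x3])])
```

In this setup no single pairwise correlation looks alarming, yet the VIFs come out very large, which is the situation a pairwise check alone can miss. A common rule of thumb treats a VIF above about 10 as a warning sign, though the cutoff is a convention rather than a test.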

The solution in general is to identify the subsets of variables that are highly correlated with one another and then use a subset that is not too strongly correlated.

I am more of a SAS user than a JMP expert, so I will leave it to others to give more specific advice on how to identify the highly correlated subsets in JMP.

One idea that comes to mind is to look at pairwise correlations. Another is to use the Partition platform to pick candidate independent variables.
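The pairwise-correlation idea can be sketched in a few lines of Python. This is only an illustration outside JMP: the column names, the data values, and the 0.8 cutoff are all hypothetical, and the helper `flag_correlated_pairs` is introduced here for the example.

```python
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def flag_correlated_pairs(data, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(data.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical predictors: weight_lb is weight_kg in other units, up to rounding.
data = {
    "height_cm": [150, 160, 165, 170, 175, 180, 185, 190],
    "weight_kg": [65, 58, 80, 62, 90, 70, 85, 75],
    "weight_lb": [143, 128, 176, 137, 198, 154, 188, 165],
}

for name_a, name_b, r in flag_correlated_pairs(data):
    print(f"{name_a} ~ {name_b}: r = {r}")
```

Only the two weight columns should be flagged here, so one of them would be a candidate to drop before refitting. A pairwise screen like this will not catch a variable that depends on a combination of several others.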

If your goal is prediction, then using recursive partitioning is fine. If your goal is inference, using recursive partitioning to select variables will bias the p-values downward and make any inference suspect.

7 REPLIES

Level III

Jun 5, 2016 07:56 AM (13401 views) | Posted in reply to message from robertlucas1972 06-02-2016

Thank you very much for the reply. I will check whether my predictors are correlated.

Staff (Retired)

Jun 3, 2016 08:43 AM (13401 views) | Posted in reply to message from eldad_galili 06-02-2016

Are your parameter estimates labeled "unstable"? If so, you might want to take a look at JMP Note 36686: "When I run nominal or ordinal logistic regression in JMP®, I receive parameter estimates lab..."

Level IV

Nov 19, 2020 01:17 AM (3211 views) | Posted in reply to message from susan_walsh1 06-03-2016

I just want to clarify the response in the linked note, quoted below. The original question relates to unstable parameter estimates in nominal logistic regression.

"This is a common issue. It is caused by some parameters of the model becoming theoretically infinite. This can happen when the model perfectly predicts the response or if there are more parameters in the model than can be estimated by the data (that is, with sparse data, where "sparse" means that there are few or no repeats of each setting of the covariates). One solution is to reduce the number of variables and/or change continuous variables to categorical. There is no way to know which variable to eliminate or categorize because all are involved simultaneously. The resulting model is usually good at classifying observations, but inferences about the parameters should be avoided."

Do I take it that even with unstable, biased, or zeroed parameter estimates, the model may still be used, if only for classification?
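To make the "theoretically infinite" parameters in the quoted note concrete, here is a small Python sketch. It assumes a one-predictor logistic model with no intercept and two tiny made-up data sets, one perfectly separated and one overlapping. The data, the grid of slopes, and the function name are hypothetical, chosen for illustration only.

```python
import math

def log_likelihood(beta, xs, ys):
    """Log-likelihood of the model P(y=1 | x) = 1 / (1 + exp(-beta * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # log P(y | x) is -log(1 + exp(-z)) when y == 1 and -log(1 + exp(z)) when y == 0
        s = z if y == 1 else -z
        total += -math.log1p(math.exp(-s))
    return total

xs = [-2, -1, 1, 2]
ys_separated = [0, 0, 1, 1]  # the sign of x predicts y perfectly
ys_overlap = [0, 1, 0, 1]    # the classes overlap

betas = [0.1, 0.5, 1, 5, 10, 50]
for beta in betas:
    print(
        f"beta={beta:>4}: separated={log_likelihood(beta, xs, ys_separated):.6f}  "
        f"overlap={log_likelihood(beta, xs, ys_overlap):.6f}"
    )
```

With the separated data the log-likelihood keeps rising toward zero as the slope grows, so no finite estimate maximizes it. With the overlapping data it peaks at a moderate slope and then falls. That is why a perfectly predicting model can classify the fitting data well while its parameter estimates and their tests are not trustworthy.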

Staff

Nov 19, 2020 08:50 AM (3206 views) | Posted in reply to message from TCM 11-19-2020

These warnings and indicators suggest a poor model. I would not use it to predict outcomes (i.e., classify). I would correct the poor conditions and verify the model assumptions first.

Level VIII

Jun 3, 2016 01:40 PM (13401 views) | Posted in reply to message from eldad_galili 06-02-2016

If multicollinearity among your predictor variables is an issue and you are running JMP Pro, another potential approach is Generalized Regression: its Lasso and Elastic Net options are viable modeling choices as well.

Level III

Jun 5, 2016 08:01 AM (13401 views) | Posted in reply to message from Peter_Bartell 06-03-2016

Thanks, but I am not using the Pro version; it costs too much.