frits_quadt
Community Trekker

Stepwise Logistic Regression

I ran a stepwise logistic regression in JMP 13.2.1 using the P-value Threshold stopping rule (Prob to Enter = 0.2, Prob to Leave = 0.1) with the Mixed direction. The same analysis on the same dataset in SAS 9.4 (Selection=stepwise, slentry=0.2, slstay=0.1) selected a different set of variables.

 

I expected to get at least the same selection of variables. Has anyone seen something similar, or does anyone know the reason for the difference?

Accepted Solutions

Re: Stepwise Logistic Regression

You did not describe your proposed terms or the specific differences, so I will have to guess. Do you have cross terms or power terms in your model? That is one situation that might lead to different results. I have not used the SAS stepwise procedure in a long time, so I can't remember how it treats hierarchical terms. JMP by default uses the Combine option for Rules:

 

Calculates p-values for two separate tests when considering entry for a term that has precedents. The first p-value, p1, is calculated by grouping the term with its precedent terms and testing the group’s significance probability for entry as a joint F test. The second p-value, p2, is the result of testing the term’s significance probability for entry after the precedent terms have already entered into the model. The final significance probability for entry for the term that has precedents is max(p1, p2).
Tip: The Combine rule avoids including non-significant interaction terms, whose precedent terms may have particularly strong effects. In this scenario, the strong main effects may make the group’s significance probability for entry, p1, very small. However, the second test finds that the interaction by itself is not significant. As a result, p2 is large and is used as the final significance probability for entry.
Caution: The degrees of freedom value for a term that has precedents depends on which of the two significance probabilities for entry is larger. The test used for the final significance probability for entry determines the degrees of freedom, nDF, in the Current Estimates table. Therefore, if p1 is used, nDF will be the number of terms in the group for the joint test, and if p2 is used, nDF will be equal to 1.
The Combine option is the default rule.
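
To make the quoted rule concrete, here is a minimal Python sketch of how the final entry p-value and nDF would follow from p1 and p2. It is not JMP code: the names `EntryTest` and `combine_entry` are made up for illustration, p1 and p2 are assumed to have been computed already, and the behavior on an exact tie (p1 equal to p2) is my assumption because the documentation does not say.

```python
from dataclasses import dataclass


@dataclass
class EntryTest:
    """Final significance probability for entry and its degrees of freedom."""
    p_value: float
    ndf: int


def combine_entry(p1: float, p2: float, group_size: int) -> EntryTest:
    """Illustrative version of the Combine rule for a term with precedents.

    p1: p-value of the joint test of the term grouped with its precedents.
    p2: p-value of the term alone, after its precedents have entered.
    group_size: number of terms in the group used for the joint test.
    """
    if p1 >= p2:  # tie handling is an assumption, not documented behavior
        # The joint test decides, so nDF is the size of the group.
        return EntryTest(p_value=p1, ndf=group_size)
    # The single-term test decides, so nDF is 1.
    return EntryTest(p_value=p2, ndf=1)


# Strong main effects (tiny p1) but a weak interaction (large p2).
result = combine_entry(p1=0.001, p2=0.45, group_size=3)
print(result)
```

With these hypothetical numbers the final p-value is 0.45 with nDF = 1, which is above a Prob to Enter of 0.2, so the interaction would not qualify for entry even though the joint test looks highly significant.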
 

This option can dramatically change the selected model because of the way it computes the F ratio and degrees of freedom for each term.

Learn it once, use it forever!
frits_quadt
Community Trekker

Re: Stepwise Logistic Regression

Thanks Mark,

I have had a quick look and it seems that, at least for this data set, JMP gives the same selection as SAS if the rule is set to Restrict.
I will do some further checks as I have more datasets and SAS results from this client.