
Re: Cannot enter a parameter for JMP stepwise regression

ei8450

New Contributor

Joined:

Dec 5, 2017

 

I'm running a stepwise regression to study the effect of several parameters on salary. So:

The dependent variable is: salary

The independent variables are: gender, yrs_in_rank, URM_NonURM, Professor, AssociateProfessor, Assistant Professor, Instructor, and Lecturer (Professor, Associate Professor, Assistant Professor, Instructor, and Lecturer are the five ranks in the study pool; they are recoded as dummy variables from a single source column).

 

When I try to enter the above parameters, all of them are entered except Lecturer. In fact, I can only enter four of the five ranks, and I don't know why. I suspect it is due to collinearity, but how could JMP know that these variables are collinear? Is there a way to enter all five ranks into the analysis?

 

Thank you!

[Attached screenshot: SASJMP_Stepwise.png]

5 REPLIES
dale_lehman

Community Trekker

Joined:

Jan 29, 2015

Solution

This is a common issue.  JMP always leaves one category out of the model - it has to, or the model will be over-parameterized (in lay terms, if we know an individual is not in the first 4 classes, they must be in the 5th, so that information would be superfluous - in more mathematical terms, the five dummy columns add up to the intercept column, so the design matrix would not have full rank and X'X would be singular and not invertible).  If you want to see the full set of categories, click on the red arrow at the regression output and ask for the Expanded Estimates.  The missing category will have a coefficient that, when added to all the other category coefficients, sums to zero.
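To make the singularity and the sum-to-zero property concrete, here is a minimal sketch in Python with NumPy (not JMP). The salaries are made-up toy numbers, and the +1/0/-1 coding below is only an assumed stand-in for the sum-to-zero behavior described above, not a reproduction of JMP's internals.

```python
import numpy as np

# Hypothetical toy data (not from this thread): two people per rank,
# salaries in thousands.
ranks = ["Prof", "Assoc", "Asst", "Inst", "Lect"]
rank_idx = np.repeat(np.arange(5), 2)            # 0,0,1,1,2,2,3,3,4,4
salary = np.array([102, 98, 82, 78, 71, 69, 52, 48, 41, 39], dtype=float)

dummies = np.eye(5)[rank_idx]                    # one 0/1 column per rank
intercept = np.ones((10, 1))

# Intercept plus all five dummies: the dummy columns add up to the
# intercept column, so one column is redundant.
X_all_five = np.hstack([intercept, dummies])     # 6 columns
X_four = np.hstack([intercept, dummies[:, :4]])  # 5 columns

rank_all_five = np.linalg.matrix_rank(X_all_five)  # 5 < 6: X'X is singular
rank_four = np.linalg.matrix_rank(X_four)          # 5 == 5: full column rank

# Sum-to-zero (effect) coding: +1 for the row's own rank, -1 in every
# column for the omitted rank (Lect), 0 otherwise.
effect = dummies[:, :4] - dummies[:, [4]]
X_eff = np.hstack([intercept, effect])
beta, *_ = np.linalg.lstsq(X_eff, salary, rcond=None)

# The omitted level's coefficient is minus the sum of the other four,
# so all five rank coefficients add up to zero.
lect_effect = -beta[1:].sum()

print("columns:", X_all_five.shape[1], "rank:", rank_all_five)
print("intercept (mean of the rank means):", round(beta[0], 6))
for name, b in zip(ranks[:4], beta[1:]):
    print(name, round(b, 6))
print("Lect (recovered):", round(lect_effect, 6))
```

With these toy numbers the intercept is the average of the five rank means, and each rank coefficient is that rank's mean minus the average, so the recovered Lect coefficient is negative because Lect has the lowest mean here.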

ei8450

New Contributor

Joined:

Dec 5, 2017

 

Thank you, dale_lehman, for your very clear explanation. I understand and agree with what you said about why the 5th variable cannot be entered: statistically, it doesn't make sense to include the 5th variable if it can be expressed as a linear combination of the other 4. I'm still having trouble explaining the effect of the 5th variable, which in this case is the lecturer rank. Can you give me some suggestions on how to interpret these regression results, especially the effect of the lecturer rank? Any advice, comments, or suggestions would be much appreciated.

Also, I tried Expanded Estimates. It shows expanded results for the selected parameters, but the 5th variable's estimate is still missing.

[Attached screenshot: SASJMP_Stepwise2.PNG]

dale_lehman

Community Trekker

Joined:

Jan 29, 2015

Solution

Try putting in one variable for rank instead of dummy variables for each rank.  With one column for rank (with values Prof, Assoc, Asst, Inst, Lect) and Expanded Estimates you will get what you want.  Putting in separate dummy variables will only permit you to enter 4 of the 5 ranks, and the effect of the Lecturer rank is absorbed into the intercept of the regression model.  Each of the other rank coefficients is then interpreted as the effect of that rank being present rather than absent, which means relative to Lecturer: the way you have it formulated, Lecturer is the baseline that the other ranks are compared with.
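The baseline interpretation can be checked with a small sketch in Python with NumPy (not JMP). The data are hypothetical toy salaries, and the model contains only the rank dummies, so the other predictors from the original question (gender, yrs_in_rank, URM_NonURM) are deliberately left out.

```python
import numpy as np

# Hypothetical toy data (not from this thread): two people per rank,
# salaries in thousands.
ranks = ["Prof", "Assoc", "Asst", "Inst", "Lect"]
rank_idx = np.repeat(np.arange(5), 2)            # 0,0,1,1,2,2,3,3,4,4
salary = np.array([102, 98, 82, 78, 71, 69, 52, 48, 41, 39], dtype=float)

# 0/1 dummies for the first four ranks only; Lect has no column,
# which makes it the baseline.
dummies = np.eye(5)[rank_idx][:, :4]
X = np.hstack([np.ones((10, 1)), dummies])

beta, *_ = np.linalg.lstsq(X, salary, rcond=None)

# Raw group means, for comparison with the fitted coefficients.
group_means = np.array([salary[rank_idx == k].mean() for k in range(5)])
lect_mean = group_means[4]

print("intercept:", round(beta[0], 6), "| Lect mean:", lect_mean)
for name, b, m in zip(ranks[:4], beta[1:], group_means[:4]):
    # Each coefficient equals that rank's mean minus the Lect mean.
    print(name, "coef:", round(b, 6), "| mean - Lect mean:", m - lect_mean)
```

In this dummy-only model the intercept equals the Lect mean and each coefficient is the gap between that rank and Lect. With additional predictors in the model, the same reading applies with the other predictors held fixed.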

ei8450

New Contributor

Joined:

Dec 5, 2017

Thanks a lot, dale_lehman. Now I understand and know how to do it.

I really appreciate it! Thanks again.

freshcalendars

Community Member

Joined:

Dec 7, 2017

Thank you so much, I fixed my problem.