
JMP 8: performing multiple linear regression

Hi!

I'm requesting your help with a statistical problem. I have a quantitative response variable that I would like to model. For this I have 21 explanatory variables: 7 are qualitative and 14 are quantitative. I would like to build a model with JMP 8 to see which variables have a strong effect on my response variable. My problem is that I have to recode my qualitative variables in order to put them into my model, but I don't know how to do it. I tried to code them by right-clicking on my list of columns and selecting Coding, but then I could only convert each of them into a single number. I should be coding them with several columns, shouldn't I?

Thank you for your help, and by the way, sorry for my English!

See ya!

1 ACCEPTED SOLUTION


For a qualitative variable with n modalities, you get n-1 parameter estimates. That can be very confusing!
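
To make the n-1 count concrete, here is a minimal Python sketch (not JMP code). It assumes sum-to-zero ("effect") coding, which is how JMP documents the coding of nominal effects, and a model containing only this one factor. The data, the level names A/B/C and the helper `effect_code` are hypothetical, chosen only for illustration.

```python
from statistics import mean

def effect_code(levels, value):
    """Return the n-1 effect-coded columns for one observation.

    The last level has no column of its own: it is coded -1 in every column.
    """
    if value == levels[-1]:
        return [-1] * (len(levels) - 1)
    return [1 if value == lvl else 0 for lvl in levels[:-1]]

# Hypothetical data: a response measured at three levels of one factor.
data = {"A": [10.0, 12.0], "B": [15.0, 17.0], "C": [20.0, 22.0]}
levels = sorted(data)

# With a single factor, the fitted value of each level is its own mean,
# so the intercept is the average of the level means.
level_means = {lvl: mean(vals) for lvl, vals in data.items()}
intercept = mean(list(level_means.values()))

# One estimate per level except the last: n levels -> n-1 estimates.
estimates = {lvl: level_means[lvl] - intercept for lvl in levels[:-1]}

# The effect of the omitted level follows from the sum-to-zero constraint.
implied_last = -sum(estimates.values())

print(effect_code(levels, "A"), effect_code(levels, "C"))
print(estimates, implied_last)
```

Under these assumptions the last level is not missing from the model: its effect is simply minus the sum of the reported estimates, so a separate row for it would carry no new information.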

If you want to know whether the variable "as a whole" has a significant influence on your response, look at the Effect Tests output - there you have only one row of output and one F test for the qualitative variable. If the F test is not significant, you can remove the variable. If the F test is significant, it means that at least two of the modalities give significantly different responses. To find out which ones differ, look at the Effect Details output, find the qualitative variable in question, click the red triangle next to its name and select LSMeans Tukey HSD - that will give you a list showing which are different.
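
As a rough illustration of what "one F test for the whole variable" means, the Python sketch below computes the F ratio for the simplest case, a model with a single qualitative variable (one-way ANOVA). In a model with several terms, the Effect Tests row is a partial F test that compares the model with and without that variable, which this sketch does not cover. The function `one_way_f` and the data are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way layout.

    `groups` is a list of lists, one list of responses per modality.
    """
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation explained by the modalities (between-group sum of squares).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation inside each modality (within-group sum of squares).
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical responses for three modalities of one qualitative variable.
groups = [[10.0, 12.0], [15.0, 17.0], [20.0, 22.0]]
f_ratio, df_between, df_within = one_way_f(groups)
print(f_ratio, df_between, df_within)
```

Note that the numerator has n-1 degrees of freedom for n modalities: the n-1 parameter estimates are tested together in a single F ratio, which is why the variable gets just one row.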


Regarding R2 - you shouldn't optimize the model using R2, as that leads to overfitting: R2 can only go down (or stay the same) when you remove a term, so it always favors the largest model. Instead, use the Effect Tests and remove the terms that are not significant! This is the classical approach. There are other methods for finding the best model, like cross-validation (only in JMP Pro) and Mallows' Cp (under Stepwise) - look them up in the Help.
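
A small worked example of why raw R2 is a poor target, using adjusted R2 as one common penalized alternative (it is not one of the methods named above, just an easy formula to show). The R2 values and sample size are made up for illustration, and `adjusted_r2` is a hypothetical helper.

```python
def adjusted_r2(r2, n_obs, n_terms):
    """Adjusted R2 for a model with an intercept and n_terms estimated slopes.

    A qualitative variable with n modalities counts as n-1 terms.
    """
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)

# Hypothetical case: adding a 4th term nudges R2 from 0.900 up to 0.902.
n_obs = 20
before = adjusted_r2(0.900, n_obs, n_terms=3)
after = adjusted_r2(0.902, n_obs, n_terms=4)

print(f"adjusted R2 with 3 terms: {before:.4f}")
print(f"adjusted R2 with 4 terms: {after:.4f}")
print("extra term worth keeping by this criterion:", after > before)
```

In this made-up case raw R2 rises with the extra term while adjusted R2 falls, which is the overfitting pattern described above.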

BR, Marianne

3 REPLIES
MTOF (Community Trekker, joined Jun 29, 2011)

Hi.

Are you sure you need to recode the qualitative variables? Do you mean into 0-1 variables?

In Fit Model you can fit MLR models with quantitative as well as qualitative predictors without coding.

BR, Marianne

Thank you for your answer.

I thought that I had to recode the qualitative variables into dummy variables (0, 1, -1... that sort of thing). Apparently it's not necessary. That's a good thing. I'm having a lot of trouble interpreting the results, because for each qualitative variable with n modalities, only n-1 lines appear in the analysis for that variable. That is weird, I don't get it.

For one qualitative variable with n modalities, the analysis shows:

qualitative variable {modality1&modality2&modality3&modality4}

qualitative variable {modality1&modality2&modality3}

qualitative variable {modality1&modality2}

qualitative variable {modality1}

and for each line there are the sum of squares, the F test...

I know the basics of this analysis: I have to put into my model some explanatory variables that are supposed to explain the variation of my response variable. Then I need to adjust the model by adding or removing variables depending on how the R square changes. The basic aim is to get the highest R square possible. But in this model, qualitative variables don't work the same way as quantitative variables, and that bothers me a lot...
