tranquilo123
Level I

Help with multiple regression

 

 

8 REPLIES
dale_lehman
Level VII

Re: Help with multiple regression

You have a lot of numbers here and a lot of questions - many of these require a basic understanding of what a multiple regression analysis is and how to interpret coefficients, fit, etc. But I'll focus on one big issue.

Your results indicate a number of significant factors that influence satisfaction (though they might be related, so you should check the VIF for multicollinearity) and overall a model that is significantly better than random (indicated by the many low p-values). But the overall fit (indicated by the R-square) is quite poor. I am guessing that this is due to your response variable, satisfaction_level, being coded as continuous. Since its mean value is 0.61, I suspect this variable is really ordinal - something like "rate your satisfaction level on a scale of 0-4". While it is possible to interpret this on a continuous scale (1 being better than 2, etc.), that is somewhat dangerous to do (the "distances" from 1 to 2 and from 2 to 3 are not really "equal" in any meaningful sense). So, you might want to consider modeling this with the Y variable changed to ordinal.

I would also check whether your independent variables are really measuring different things, since things like job evaluations, promotions, experience, etc. are often closely related to each other.
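For readers who want to see what the VIF check computes outside JMP, here is a minimal Python sketch. It regresses each predictor on the others and reports 1 / (1 - R^2). The column names and values below are made up for illustration and are not the poster's data; the small least-squares helper is just one possible way to get R^2 without extra libraries.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, xs):
    """R^2 from an ordinary least-squares fit of y on the columns in xs, with an intercept."""
    n = len(y)
    cols = [[1.0] * n] + xs
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def vif(predictors):
    """Variance inflation factor for each predictor: 1 / (1 - R^2 on the other predictors)."""
    result = {}
    for name, col in predictors.items():
        others = [c for other, c in predictors.items() if other != name]
        result[name] = 1.0 / (1.0 - r_squared(col, others))
    return result


# Hypothetical predictor columns (illustrative values only).
predictors = {
    "evaluation": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "projects":   [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],  # tracks "evaluation" closely
    "promoted":   [3.0, 3.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0],  # unrelated to the other two
}

for name, value in vif(predictors).items():
    print(f"{name}: VIF = {value:.2f}")
```

A VIF near 1 means the predictor shares little information with the others; larger values mean its coefficient is estimated less precisely because of overlap with the other predictors.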

tranquilo123
Level I

Re: Help with multiple regression

Thank you for your answer. The response variable is a continuous variable and has a range between 0,0 - 1,0 where 0,0 is not-satisfied and 1,0 is satisfied. 

dale_lehman
Level VII

Re: Help with multiple regression

You should not be modeling the response variable as continuous.  It is a nominal variable with two levels (0 and 1).  As such, just change its type to nominal and the Fit Model platform will do a logistic regression.  In general, I think you will find similar qualitative results (as to which variables are significant factors) though the quantitative results will certainly differ.
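As a rough picture of what a logistic regression on a two-level response does, here is a minimal Python sketch. It is not what the Fit Model platform runs internally; it fits a single-predictor model by plain gradient ascent on the log-likelihood, and the data, learning rate, and step count are all made up for illustration.

```python
import math


def sigmoid(z):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted probability
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical data: a standardized predictor and a 0/1 (not satisfied / satisfied) response.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 1, 0, 0, 1, 1, 0, 1]

b0, b1 = fit_logistic(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"P(satisfied | x = -2) = {sigmoid(b0 + b1 * -2.0):.3f}")
print(f"P(satisfied | x = +2) = {sigmoid(b0 + b1 * 2.0):.3f}")
```

The coefficients are on the log-odds scale, which is why the quantitative results differ from a linear regression even when the same factors come out as significant.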

VilijaOke
Level I

Please: Help with multiple regression

Hello,

I would like to ask how to adjust for a parameter in logistic regression. I do medical research and am analysing the outcome "heart attack" (yes/no) to see whether parameters of inflammation have an impact on it, but I have to adjust for age, since it is a known risk factor for heart attack. Question: where should I enter the adjustment for "age"?

Thank you!

Re: Help with multiple regression

If your response is really two levels (not satisfied, satisfied), then use a nominal modeling type and treat it as a categorical response. This will change your linear regression to logistic regression.

If your response is really continuous (satisfaction from 0 to 1), then linear regression will have difficulty. Regression assumes that the response is unbounded, with a range of negative infinity to positive infinity. You can use a transformation of the response that is built into the Fit Model launch dialog, which should remedy the disparity. The Logit transform is Log( satisfaction / (1-satisfaction) ). (Note that Log here is the natural logarithm.)

Simply select the response in the Y role and then click the red triangle near the bottom center for Transforms and select Logit. It works like this:

[Image: Capture.PNG]

This example has a response Y that is continuous but bounded between 0 and 1, like your response. The predictor X is normally distributed. Now set up the Fit Model launch as I instructed:

[Image: Capture.PNG]

I am using a linear model of second order but that fact is not important. This approach works with any linear predictor such as your model. Click Run:

[Image: Capture.PNG]

(Note that I first changed the Emphasis setting to Minimal Report.)

You can see that this transformation helps the regression deal with the lower and upper bounds of satisfaction.
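The same idea can be sketched outside JMP. The Python below is only an illustration with made-up data: it applies the Logit transform described above, fits a straight line on the transformed scale, and back-transforms the predictions. The helper names (logit, inv_logit, fit_line) are my own. One caveat worth stating: the transform has no finite value when the response is exactly 0 or exactly 1.

```python
import math


def logit(p):
    """Natural log of p / (1 - p); only defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Back-transform from the logit scale to the 0-1 scale."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    """Simple least-squares line: returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: predictor x and a response bounded between 0 and 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0.08, 0.20, 0.50, 0.75, 0.95]

# Fit on the raw scale and on the logit scale.
raw_b0, raw_b1 = fit_line(xs, ys)
lg_b0, lg_b1 = fit_line(xs, [logit(y) for y in ys])

# Predict beyond the observed range of x.
x_new = 4.0
raw_pred = raw_b0 + raw_b1 * x_new
logit_pred = inv_logit(lg_b0 + lg_b1 * x_new)
print(f"raw-scale prediction at x = {x_new}: {raw_pred:.3f}")
print(f"logit-scale prediction at x = {x_new}: {logit_pred:.3f}")
```

The raw-scale line can predict values outside the 0-1 range, while the back-transformed logit prediction cannot, which is the point of the transformation.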

dale_lehman
Level VII

Re: Help with multiple regression

Mark's answer is very good - I hadn't even thought that the satisfaction level might be a continuous variable ranging from 0 to 1 (that's not a scale I've often seen). So, if it is indeed continuous, then regression models can be run. I'd be very careful interpreting the coefficients in such a model - it is tempting to describe them as causal, and that may be incorrect. For example, a coefficient may describe the increase in satisfaction associated with each year of experience. It would be tempting, if this is positive, to interpret it as how much each year of experience contributes to job satisfaction. But it would be natural for people with more experience to be more satisfied with their job simply because dissatisfied people tend to leave (a type of survival bias). It is not clear what finding a significant association really means. Similar issues arise for the other independent variables.

Re: Help with multiple regression

Have you seen Help > Books > Fitting Linear Models?

tranquilo123
Level I

Re: Help with multiple regression

Indicator Function Parameterization

Term                          Estimate     Std Error   t Ratio   Prob>|t|
Intercept                     0,6116566    0,001994    306,75    <,0001*
Std last_evaluation           0,0455447    0,00211     21,58     <,0001*
Std number_project            -0,047008    0,002134    -22,03    <,0001*
Std time_spend_company        -0,022359    0,002022    -11,06    <,0001*
promotion_last_5years[1-0]    0,0553393    0,013707    4,04      <,0001*

 

Please help me interpret this :)
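One way to read the estimates above is as a prediction equation. The Python sketch below does only that arithmetic, and it rests on two assumptions that the thread does not confirm: that the "Std" terms are standardized predictors (mean 0, standard deviation 1), and that promotion_last_5years[1-0] is the difference between promoted (1) and not promoted (0). The function and argument names are my own.

```python
# Estimates copied from the table above (decimal commas written as points).
COEF = {
    "intercept": 0.6116566,
    "std_last_evaluation": 0.0455447,
    "std_number_project": -0.047008,
    "std_time_spend_company": -0.022359,
    "promotion_last_5years": 0.0553393,
}


def predict(z_eval, z_proj, z_time, promoted):
    """Predicted satisfaction_level.

    z_eval, z_proj, z_time: predictor values in standard-deviation units
    (assumption: the "Std" terms are standardized).
    promoted: 1 if promoted in the last 5 years, else 0 (assumption about the [1-0] coding).
    """
    return (COEF["intercept"]
            + COEF["std_last_evaluation"] * z_eval
            + COEF["std_number_project"] * z_proj
            + COEF["std_time_spend_company"] * z_time
            + COEF["promotion_last_5years"] * promoted)


baseline = predict(0, 0, 0, 0)             # average on every predictor, not promoted
higher_evaluation = predict(1, 0, 0, 0)    # evaluation one standard deviation higher
promoted_employee = predict(0, 0, 0, 1)    # same, but promoted

print(f"baseline: {baseline:.4f}")
print(f"one SD higher evaluation: {higher_evaluation:.4f}")
print(f"promoted: {promoted_employee:.4f}")
```

Under those assumptions, each estimate is the change in predicted satisfaction associated with a one standard deviation change in that predictor (or with a promotion), holding the others fixed. As noted earlier in the thread, this describes an association, not necessarily a cause.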