AnnaPaula
Level III

Normality assumption in Fit Least Squares

Hi, 

 

I have a simple question that I was wondering if people with more experience could shed some light on.

I know that OLS does not necessarily require the normality assumption, but normality of the residuals leads to unbiased estimates with minimum variance.

 

I tried to fit a model on a continuous response with three independent variables using Least Squares in the Fit Model platform. This model had an adjusted Rsquare of about 0.45, and the residual plot exhibited a clear pattern.

Next, I checked the distribution of the continuous response and saw that it was far from normal.

Using Generalized Regression (Lasso) and Normal response distribution, the model still led to a Generalized Rsquare of about 0.45. However, when I switched to an exponential response distribution, the Rsquare increased to 0.84.

 

If normality of the response is not needed for Least Squares, what could explain such a difference?

 

Thank you!

 

 

 

1 ACCEPTED SOLUTION

Re: Normality assumption in Fit Least Squares

Ordinary least squares regression is often treated as a special case with regard to the distribution of the response. As @Thierry_S explained, we usually address the residuals (estimated errors), assuming that the errors are normally distributed. You could also describe it as a conditional distribution: Y ~ Normal( mean = predicted Y = f(X), variance ). So the mean of Y varies with X as described by the linear regression model, but the variance of Y is constant, or independent of the response. In other words, the response exhibits a normal distribution of errors for any given mean response (predicted).

 

I would not say that "OLS does not necessarily require normality."

 

The normality assumption and OLS lead to a poor model if the conditional distribution is not normal. So the exponential distribution appears to be a better model for the variance than the normal distribution in your case.
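To make the point about the conditional distribution concrete, here is a minimal Python sketch on simulated data. It is not JMP's Generalized Regression and it does not compute JMP's Generalized RSquare; the data, the coefficients 0.5 and 1.0, and the helper name ls_coeffs are all assumptions made for illustration. The response is drawn from an exponential distribution whose mean depends on x, and the maximized log-likelihood of a normal (OLS) model is compared with that of an exponential regression with a log link.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable
n = 2000

# Hypothetical data: Y given x is exponential with mean exp(0.5 + 1.0 * x).
x = [random.uniform(0.0, 2.0) for _ in range(n)]
y = [random.expovariate(1.0 / math.exp(0.5 + 1.0 * xi)) for xi in x]


def ls_coeffs(xs, zs):
    """Least-squares intercept and slope of zs on xs (one predictor)."""
    m = len(xs)
    sx, sz = sum(xs), sum(zs)
    sxx = sum(v * v for v in xs)
    sxz = sum(v * w for v, w in zip(xs, zs))
    slope = (m * sxz - sx * sz) / (m * sxx - sx * sx)
    return (sz - slope * sx) / m, slope


# Normal model: OLS fit, then the Gaussian log-likelihood at the MLE.
a_norm, b_norm = ls_coeffs(x, y)
sigma2 = sum((yi - a_norm - b_norm * xi) ** 2 for xi, yi in zip(x, y)) / n
loglik_normal = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

# Exponential model with log link, fitted by Fisher scoring.
# For this model the expected information is X'X, so each update is
# a least-squares fit of the score residual (y / mu - 1) on x.
a, b = math.log(sum(y) / n), 0.0
for _ in range(25):
    z = [yi / math.exp(a + b * xi) - 1.0 for xi, yi in zip(x, y)]
    step_a, step_b = ls_coeffs(x, z)
    a, b = a + step_a, b + step_b

loglik_exp = sum(-(a + b * xi) - yi / math.exp(a + b * xi)
                 for xi, yi in zip(x, y))

print(f"exponential fit: intercept={a:.2f}, slope={b:.2f}")
print(f"log-likelihood, normal model:      {loglik_normal:.1f}")
print(f"log-likelihood, exponential model: {loglik_exp:.1f}")
```

Because both log-likelihoods are densities of the same untransformed response, they can be compared directly. In this simulation the exponential model should come out higher, which is the same kind of gap the question describes between the Normal and Exponential response distributions.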


4 REPLIES
AnnaPaula
Level III

Re: Normality assumption in Fit Least Squares

By the way, I found this material (https://community.jmp.com/kvoqx44227/attachments/kvoqx44227/discovery-2019-content/28/1/glmTalkTucso...), where it says that assuming the errors (and response) are normal makes life a lot easier. However, this link (https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-regression/simple-linear-regression-as...) discusses normality of the errors only.

So I am not sure whether normality of the response is assumed in JMP.

If yes, why would normality of the response "make life a lot easier"? 

 

Thierry_S
Super User

Re: Normality assumption in Fit Least Squares

Hi Anna,
The actual assumption for the Fit Least Squares platform is that the model residuals are normally distributed; there is no assumption that the response or the variables are normally distributed. I'm not sure what the phrase "make life a lot easier" refers to in terms of statistics.
Best,
TS
Thierry R. Sornasse
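The distinction between the distribution of the response and the distribution of the residuals can be checked with a small simulation. This is only a sketch in plain Python, not JMP output: the data are simulated, and the coefficients 2.0 and 3.0, the sample size, and the skewness helper are assumptions made for illustration. The predictor is strongly right-skewed and the errors are normal, so the raw response looks far from normal while the residuals from the least-squares fit do not.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)


random.seed(0)  # fixed seed so the sketch is repeatable
n = 5000

# Hypothetical data: a right-skewed predictor and normally distributed errors.
x = [random.expovariate(1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# Simple least-squares fit of y on x (closed form for one predictor).
x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

skew_y = skewness(y)
skew_resid = skewness(residuals)
print(f"skewness of response:  {skew_y:.2f}")
print(f"skewness of residuals: {skew_resid:.2f}")
```

A skewed histogram of the response on its own therefore says little about whether the least-squares assumption holds; the residual plots are the place to look.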

AnnaPaula
Level III

Re: Normality assumption in Fit Least Squares

Thank you @Thierry_S and @Mark_Bailey!

I thought the assumption of normal residuals would lead to unbiased estimates and narrower confidence intervals for the estimates. I was not sure it was a hard requirement for the proper fit of the model. This makes sense.