Discussions

AnnaPaula · Apr 4, 2021 05:28 PM

Hi,

I have a simple question that I was wondering if people with more experience could shed some light on.

I know that OLS does not necessarily require a normality assumption, but normality of the residuals leads to unbiased estimates with minimum variance.

I tried to fit a model on a continuous response with three independent variables using Least Squares in the Fit Model platform. This model had an adjusted R-square of about 0.45, and the residual plot exhibited a clear pattern.

Next, I checked the distribution of the continuous response and saw that it was far from normal.

Using Generalized Regression (Lasso) and Normal response distribution, the model still led to a Generalized Rsquare of about 0.45. However, when I switched to an exponential response distribution, the Rsquare increased to 0.84.

If normality of the response is not needed in Least Squares, what could explain such a difference?

Thank you!

AnnaPaula · Apr 4, 2021 05:33 PM

By the way, I found this material (https://community.jmp.com/kvoqx44227/attachments/kvoqx44227/discovery-2019-content/28/1/glmTalkTucso...), which says that assuming the errors (and response) are normal makes life a lot easier. However, this page (https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-regression/simple-linear-regression-as...) discusses normality of the errors only.

So I am not sure if normality of the response is assumed in JMP?

If yes, why would normality of the response "make life a lot easier"?

Thierry_S · Apr 5, 2021 01:17 AM

Hi Anna,
The actual assumption for the Fit Least Squares platform is that the model residuals are normally distributed; there is no assumption that the response or the predictor variables are normally distributed. I'm not sure what the phrase "make life a lot easier" refers to in terms of statistics.
Best,
TS

Thierry R. Sornasse
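
To illustrate the distinction above, here is a minimal Python sketch (not JMP; the data are simulated and the coefficients 2.0 and 3.0, the sample size, and all function names are assumptions chosen for illustration). The predictor is strongly skewed, so the response itself is far from normal, yet the residuals from the least squares fit are close to symmetric because the errors were generated as normal.

```python
import random
import statistics


def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data: a right-skewed predictor and normally distributed errors.
x = [rng.expovariate(1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

response_skew = skewness(y)
residual_skew = skewness(residuals)

print(f"slope estimate:        {slope:.3f}")
print(f"skewness of response:  {response_skew:.3f}")
print(f"skewness of residuals: {residual_skew:.3f}")
```

In this setup a histogram of the response alone would look clearly non-normal, while the residuals would not, which is why the residuals (and not the raw response) are the thing to inspect.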

Mark_Bailey · Apr 5, 2021 12:53 PM

Ordinary least squares regression is often treated as a special case with regard to the distribution of the response. As @Thierry_S explained, we usually address the residuals (estimated errors), assuming that the errors are normally distributed. You could also describe it as a conditional distribution: Y ~ Normal( mean = predicted Y = f(X), variance ). So the mean of Y varies with X as described by the linear regression model, but the variance of Y is constant, that is, independent of the mean response. In other words, the response exhibits a normal distribution of errors for any given mean response (predicted).

I would not say that "OLS does not necessarily require normality."

The normality assumption and OLS lead to a poor model if the conditional distribution is not normal. So the exponential distribution appears to be a better model for the variance than the normal distribution in your case.
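
As a rough illustration of the point about the conditional distribution, the Python sketch below (not JMP; the two group means, the group labels, and the sample sizes are hypothetical) simulates a response whose conditional distribution is exponential, then compares the maximized log-likelihood of a normal model with constant variance against an exponential model with the same fitted means. It compares log-likelihoods directly, which is a simpler summary than JMP's Generalized RSquare and is only meant to show the direction of the effect.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical design: one two-level factor; the response is exponential
# within each level, so its spread grows with its mean.
true_means = {"low": 1.0, "high": 5.0}
data = {
    level: [rng.expovariate(1.0 / mean) for _ in range(2000)]
    for level, mean in true_means.items()
}

# Both models use the group averages as fitted means (the MLE in each case).
fitted_means = {level: statistics.fmean(ys) for level, ys in data.items()}
n = sum(len(ys) for ys in data.values())

# Normal model: Y ~ Normal(mean = group mean, one common variance).
sse = sum(
    (y - fitted_means[level]) ** 2 for level, ys in data.items() for y in ys
)
sigma2 = sse / n  # maximum likelihood estimate of the common variance
loglik_normal = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

# Exponential model: density (1/mu) * exp(-y/mu), so variance = mu**2.
loglik_exponential = sum(
    -math.log(fitted_means[level]) - y / fitted_means[level]
    for level, ys in data.items()
    for y in ys
)

print(f"normal log-likelihood:      {loglik_normal:.1f}")
print(f"exponential log-likelihood: {loglik_exponential:.1f}")
```

With data like these, the exponential model attains the higher log-likelihood even though both models use identical fitted means; the gain comes entirely from describing the spread and shape of the response around those means more faithfully.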

AnnaPaula · Apr 8, 2021 02:22 AM

Thank you @Thierry_S and @Mark_Bailey!

I thought the assumption of normal residuals would lead to unbiased estimates and narrower confidence intervals for the estimates. I was not sure it was a hard requirement for a proper fit of the model. This makes sense.


Normality assumption in Fit Least Squares
