
Lack of Fit with a Fit Y by X, and testing Residuals for normality?

Hi,
I want to do a simple regression of an organism's consumption rate (Y) on its mass (X).
I have four species, so I am making four separate plots of mass (X) vs. consumption rate (Y) by selecting "By" Species.

However, when I do this, my Lack of Fit shows up as significant.

When I transform X and Y with "Logit" and use a Polynomial fit, my lack of fit is no longer significant.

However, if I look at the distribution of my residuals, they are not normal.

I thought you always had to have normally distributed residuals, or else something is wrong with your model. Is that correct?

Also, why does my lack of fit show up at all if I only have one X category? I do have repeated X values, so maybe that is why.

How do I fix this, and do I need to look at the residuals after I fit the line?
Thank you (Dec. 22, 2009). Please respond ASAP.
2 REPLIES
rchertzy (Community Trekker, joined Jun 23, 2011)

First, recognize that transforming the response also transforms the error distribution.
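To make that concrete, here is a small sketch in Python rather than JMP. It assumes a natural-log transform as a stand-in, which is not the "Logit" transform from the question, and the numbers are made up. The point it illustrates: an error that is symmetric on the transformed scale is lopsided once you go back to the original scale, so the error distribution changes shape along with the response.

```python
import math

def back_transformed_gaps(log_center, log_error):
    """Return how far the lower and upper error bounds sit from the center
    after undoing a natural-log transform.

    log_center and log_error are hypothetical values on the log scale.
    """
    center = math.exp(log_center)
    lower_gap = center - math.exp(log_center - log_error)
    upper_gap = math.exp(log_center + log_error) - center
    return lower_gap, upper_gap

# Same +/- 0.5 on the log scale, unequal distances on the original scale.
lower, upper = back_transformed_gaps(log_center=2.0, log_error=0.5)
print(f"below center: {lower:.3f}, above center: {upper:.3f}")
```

The upper gap comes out larger than the lower gap, which is why residuals that look fine on one scale can look skewed on the other.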
Next, in your output window, click on the ? (question mark) at the top, then click on the Lack of Fit section. That calls up the Help window for that part of the output. The Help windows have a lot of good information on the displays and the underlying statistics. The Help for Lack of Fit starts by noting that you need multiple observations at each X level (which you have), because the replicates are what provide the pure-error estimate the test compares against. It also says that if you only have a few replicates, the lack of fit test is not very useful.
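If it helps to see why the replicates matter, here is a rough sketch of the textbook lack-of-fit decomposition for a straight-line fit, written in Python rather than JMP. It is not JMP's own code, the function name and the data are hypothetical, and it only returns the F ratio and degrees of freedom (no p-value). The residual sum of squares is split into pure error (scatter among points that share the same X) and lack of fit (how far the group means sit from the line).

```python
from collections import defaultdict

def lack_of_fit(x, y):
    """Lack-of-fit F ratio for a least-squares straight line.

    Needs repeated x values: without replicates there is no pure-error
    estimate and the test cannot be formed.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Total residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: scatter of replicates around their own group mean.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pure = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ss_pure += sum((yi - group_mean) ** 2 for yi in ys)

    df_pure = n - len(groups)   # replicates beyond one per x level
    df_lof = len(groups) - 2    # x levels minus the two line parameters
    if df_pure <= 0 or df_lof <= 0 or ss_pure == 0:
        raise ValueError("need replicated x values and more than two x levels")

    ss_lof = sse - ss_pure
    f_ratio = (ss_lof / df_lof) / (ss_pure / df_pure)
    return {"f_ratio": f_ratio, "df_lof": df_lof, "df_pure": df_pure}

# Made-up data: two replicates at each of four x levels, curved trend.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [0.9, 1.1, 3.9, 4.1, 8.9, 9.1, 15.9, 16.1]
print(lack_of_fit(x, y))
```

With only two replicates per level, as in this toy data, the pure-error estimate rests on very few degrees of freedom, which is the reason the Help text warns that the test is not very useful with few replicates.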
statman (Community Trekker, joined Jun 23, 2011)

Some comments:

First, why are you transforming? I will quote George Box: "Transformation is only to be used to simplify the model." In other words, don't transform to make the statistics better. If you expected a linear relationship, why isn't it linear? What hypotheses do you have to explain the data? (Of course, with the new hypotheses you will need another data set, but that is iteration.)

So, regarding residuals, just think about it: the model should be wrong equally on both sides (+/-), and the residuals should have a mean of 0 and be normally distributed (NID(0, sigma squared)). If you don't meet this, it is likely your equation is poor.
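As a rough way to check those three things outside JMP, here is a minimal Python sketch. The function names and data are hypothetical, and these are informal diagnostics rather than a formal normality test. It fits a straight line, then reports the residual mean, the skewness (near 0 if the model is wrong equally on both sides), and the correlation between the sorted residuals and normal scores (near 1 if a normal quantile plot would look straight).

```python
from statistics import NormalDist

def fit_line(x, y):
    """Ordinary least-squares intercept and slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_diagnostics(x, y):
    """Informal checks of the residuals from a straight-line fit."""
    intercept, slope = fit_line(x, y)
    res = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    n = len(res)
    mean_res = sum(res) / n
    var = sum((r - mean_res) ** 2 for r in res) / n
    skew = (sum((r - mean_res) ** 3 for r in res) / n) / var ** 1.5

    # Normal scores at plotting positions (i + 0.5) / n, paired with
    # the residuals sorted from smallest to largest.
    scores = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    ordered = sorted(res)
    mean_s = sum(scores) / n
    cov = sum((s - mean_s) * (r - mean_res) for s, r in zip(scores, ordered))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_r = sum((r - mean_res) ** 2 for r in ordered)
    qq_corr = cov / (var_s * var_r) ** 0.5

    return {"mean": mean_res, "skew": skew, "qq_corr": qq_corr}

# Made-up data, roughly linear with small scatter.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.3, 13.8, 16.1, 17.9, 20.2]
print(residual_diagnostics(x, y))
```

Note that the residual mean is 0 by construction for any least-squares line with an intercept, so the skewness and the normal-score correlation are the informative parts here.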