Lack of Fit with a Fit Y by X, and testing Residuals for normality?

Dec 21, 2009 11:04 PM

Hi,

I want to do a simple regression of an organism's consumption rate (Y) on its mass (X).

- I have four species, so I am making four separate plots of consumption rate (Y) against mass (X) by selecting "By" Species.

However, when I do this, my Lack of Fit shows up as significant.

When I transform X and Y with "Logit" and use a Polynomial fit, my lack of fit is no longer significant.

However, if I look at the distribution of my residuals, they are not normal.

I thought the residuals always had to be normally distributed, or else something is wrong with the model. Is that correct?

Also, why does Lack of Fit show up at all if I only have one X variable? I do have repeated X values, so maybe that is why.

How do I fix this, and do I need to look at the residuals after I fit the line?

Thank you. (Dec. 22, 2009) Please respond ASAP.

2 REPLIES

First, recognize that transforming the response also transforms the error distribution.
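To make that concrete, here is a minimal Python sketch. It assumes a natural log transform of the response purely for illustration (not the "Logit" option mentioned in the question), and the function name and numbers are hypothetical. It takes an error that is symmetric on the transformed scale and shows how far it reaches on the original scale.

```python
import math

def back_transformed_errors(fitted_log_value, log_scale_error):
    """Show how a symmetric +/- error on the log scale looks on the original scale.

    Assumption: the response was transformed with the natural log.
    Returns (distance below the centre, distance above the centre).
    """
    centre = math.exp(fitted_log_value)
    upper = math.exp(fitted_log_value + log_scale_error) - centre
    lower = centre - math.exp(fitted_log_value - log_scale_error)
    return lower, upper

# Hypothetical numbers: fitted value 2.0 on the log scale, error of +/- 0.5.
lower, upper = back_transformed_errors(fitted_log_value=2.0, log_scale_error=0.5)
print(f"below: {lower:.3f}, above: {upper:.3f}")
```

The error reaches further above the centre than below it, so residuals that look symmetric after the transformation are skewed on the original scale, and the reverse also holds.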

Next, in your output window, click on the ? (question mark) at the top then click on the Lack of Fit section. That calls up the Help window for that part of the output. Help windows have a lot of good information on the displays and the underlying statistics. The Help for Lack of Fit starts by noting you need multiple observations at each x level (which you have). It also says that if you only have a few replicates, the lack of fit test is not very useful.
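For anyone curious why replicates matter, the usual textbook lack-of-fit calculation for a straight line can be sketched in plain Python. This is not JMP's code, and the function name and data are made up for illustration. The idea is to split the residual sum of squares into pure error (scatter among replicates at the same x) and lack of fit (everything else).

```python
from collections import defaultdict

def lack_of_fit(x, y):
    """Lack-of-fit F statistic for a straight-line fit y = b0 + b1*x.

    Needs replicate observations (some x repeated) and more than
    two distinct x values. Returns (F, df_lack_of_fit, df_pure_error).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Total residual sum of squares around the fitted line.
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: scatter of y around its own mean at each distinct x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pure = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ss_pure += sum((yi - group_mean) ** 2 for yi in ys)

    # Lack of fit is the residual variation that pure error cannot explain.
    ss_lof = ss_resid - ss_pure
    df_lof = len(groups) - 2   # distinct x levels minus 2 fitted parameters
    df_pure = n - len(groups)  # observations minus distinct x levels
    if df_lof <= 0 or df_pure <= 0:
        raise ValueError("need replicates and more than two distinct x values")

    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
    return f_stat, df_lof, df_pure

# Made-up, clearly curved data with two replicates at each x.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.0, 1.2, 3.9, 4.1, 9.1, 8.9, 16.2, 15.8]
f_stat, df_lof, df_pure = lack_of_fit(x, y)
print(f_stat, df_lof, df_pure)
```

The p-value comes from an F distribution with (df_lof, df_pure) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df_lof, df_pure)` gives it. With only a few replicates df_pure is small, the pure-error estimate is noisy, and the test has little power, which is the point the Help text makes.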

Some comments:

First, why are you transforming? I will quote George Box: "Transformation is only to be used to simplify the model." In other words, don't transform just to make the statistics look better. If you expected a linear relationship, why isn't it linear? What hypotheses do you have to explain the data? (Of course, with new hypotheses you will need another data set, but that is iteration.)

Regarding residuals, just think about it: the model should be wrong equally on both sides (+/-), so the residuals should have a mean of 0 and be normally and independently distributed, NID(0, sigma squared). If they are not, it is likely your equation is poor.
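One way to look at this outside JMP is a normal quantile check on the saved residuals. The sketch below is a rough illustration using only the Python standard library; the function name, the (i + 0.5)/n plotting positions, and the example residuals are my own assumptions, and a formal test such as Shapiro-Wilk would be the usual follow-up.

```python
from statistics import NormalDist, mean

def normal_qq(residuals):
    """Pair sorted residuals with standard normal quantiles.

    Returns (pairs, r), where pairs are (theoretical, observed) points for
    a normal quantile plot and r is their correlation. An r close to 1 is
    consistent with normality; a clearly lower r suggests skew or outliers.
    """
    n = len(residuals)
    observed = sorted(residuals)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n are one common convention (an assumption here).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]

    mean_t, mean_o = mean(theoretical), mean(observed)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(theoretical, observed))
    var_t = sum((t - mean_t) ** 2 for t in theoretical)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r = cov / (var_t * var_o) ** 0.5
    return list(zip(theoretical, observed)), r

# Hypothetical residuals with one large positive value (right-skewed).
pairs, r = normal_qq([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0])
print(round(r, 3))
```

Plotting the pairs gives the same picture as a normal quantile plot: points near a straight line, balanced on both sides of zero, are what NID(0, sigma squared) residuals should look like.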
