Solved: Lack of fit - significant R square in two "almost similar x/y models"

Marc · May 12, 2017 2:04 AM

Hello.

I know a similar question was asked before, but please help me to apply this to my model:

1. I tested two regressions - A versus B and A versus C (similar observations, n=1242)

2. Model A versus B has a higher Rsquare and seems to have a better correlation - but a lack of fit p-value < 0.05

3. Model A versus C has a lower Rsquare and seems to have a bit less correlation - but a lack of fit p-value > 0.05

4. The residual plot looks quite similar.

For the actual full result please see also my attached PDF.

My question: Do I have to disregard Model A versus B and conclude that A versus C is better and more appropriate for my analysis?
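For readers unfamiliar with the test: the lack of fit F-test splits the residual sum of squares into pure error (scatter among replicates at the same x) and lack of fit (distance of the group means from the fitted line). It answers a different question than Rsquare, so a model can have the higher Rsquare and still show a significant lack of fit, especially with a large n. Below is a minimal sketch in Python with NumPy, assuming a straight-line fit and replicated x values. The function name `lack_of_fit_f` and the simulated data are hypothetical and are not taken from the attached PDF.

```python
import numpy as np

def lack_of_fit_f(x, y):
    """Lack of fit F statistic for a straight-line fit (needs replicated x values)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Residual sum of squares of the least-squares line
    slope, intercept = np.polyfit(x, y, 1)
    sse = np.sum((y - (intercept + slope * x)) ** 2)

    # Pure error: scatter of y around the mean of its own x level
    levels, inverse = np.unique(x, return_inverse=True)
    inverse = inverse.ravel()
    m = levels.size
    group_means = np.bincount(inverse, weights=y) / np.bincount(inverse)
    ss_pure_error = np.sum((y - group_means[inverse]) ** 2)

    # Lack of fit: what remains of the residual SS after removing pure error
    ss_lack_of_fit = sse - ss_pure_error
    df_lof, df_pe = m - 2, n - m
    f_stat = (ss_lack_of_fit / df_lof) / (ss_pure_error / df_pe)
    return f_stat, df_lof, df_pe

# Simulated illustration (made-up data): 10 x levels, 20 replicates each
rng = np.random.default_rng(0)
x = np.repeat(np.arange(1.0, 11.0), 20)
noise = rng.normal(0.0, 1.0, x.size)
y_linear = 2.0 + 0.5 * x + noise                           # truly linear
y_curved = 2.0 + 0.5 * x + 0.3 * (x - 5.5) ** 2 + noise    # mild curvature

f_linear, df_lof, df_pe = lack_of_fit_f(x, y_linear)
f_curved, _, _ = lack_of_fit_f(x, y_curved)
print("F (linear data):", f_linear, "df:", df_lof, df_pe)
print("F (curved data):", f_curved)
# A p-value would come from the upper tail of the F(df_lof, df_pe) distribution.
```

The curved data give a much larger F than the linear data, even though a straight line can still explain a large share of the variance in both.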

David_Burnham · May 12, 2017 05:17 AM

Based on a brief glance at the models, my feeling is that you are worrying too much about the detailed statistics. The 2 models look almost identical to me (look at the bivariate plots, not just the stats) - clearly there is a very high level of correlation between B and C. Does your scientific understanding favour one model over another?

What stands out to me are the high leverage points at a value of about 70 (for both B and C - strange that they are both on the same scale - are they different measures of the same thing?). Anyhow, I would be concerned about the degree of leverage of those points and their overall influence on the regression.

-Dave
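To put numbers on the leverage concern, one option is to compute the hat values and Cook's distance for each point. This is only a sketch in Python with NumPy for a straight-line fit. The helper name `leverage_and_cooks`, the simulated data (including a single point placed at x = 70), and the 2p/n cutoff are assumptions chosen for illustration, not values from the posted analysis.

```python
import numpy as np

def leverage_and_cooks(x, y):
    """Hat values and Cook's distance for a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    p = 2  # number of fitted parameters: intercept and slope

    # Least-squares fit and residuals
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Leverage: h_i = 1/n + (x_i - mean(x))^2 / Sxx
    sxx = np.sum((x - x.mean()) ** 2)
    h = 1.0 / n + (x - x.mean()) ** 2 / sxx

    # Cook's distance combines residual size and leverage
    s2 = np.sum(resid ** 2) / (n - p)
    cooks = resid ** 2 / (p * s2) * h / (1.0 - h) ** 2
    return h, cooks

# Made-up data: 50 points between 10 and 30, plus one isolated point at x = 70
rng = np.random.default_rng(0)
x = np.append(rng.uniform(10.0, 30.0, 50), 70.0)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, x.size)

h, d = leverage_and_cooks(x, y)
flag = h > 2 * 2 / x.size  # rule-of-thumb cutoff 2p/n (an assumption, not a fixed rule)
print("indices of high-leverage points:", np.flatnonzero(flag))
```

Refitting with and without the flagged points and comparing the slopes shows how much they drive the regression.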

Marc · May 12, 2017 05:37 AM

Many thanks David.
All are physiological variables, and all measurements are done with the same method - hence I am trying to find the best predictor for A. B and C are dependent on each other.
I will explore the outliers separately later, but before this I wanted to configure the best model to define outliers (>2 SD or >90th percentile) - although they may be the same in each model anyway (I didn't check this yet).
Thanks a lot, Marc
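Whether the outliers are the same in each model can be checked directly by flagging residuals beyond 2 SD in both fits and comparing the flagged sets. Here is a rough sketch in Python with NumPy. The function `flag_outliers`, the simulated variables `a`, `b`, `c`, and the three planted outliers are all hypothetical stand-ins for the real data.

```python
import numpy as np

def flag_outliers(x, y, k=2.0):
    """Boolean mask of points whose residual from a straight-line fit exceeds k SD."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    sd = resid.std(ddof=2)  # two fitted parameters
    return np.abs(resid) > k * sd

# Made-up data: b and c are strongly correlated predictors of a
rng = np.random.default_rng(1)
n = 200
b = rng.uniform(10.0, 30.0, n)
c = b + rng.normal(0.0, 1.0, n)
a = 1.0 + 0.8 * b + rng.normal(0.0, 2.0, n)
a[:3] += 15.0  # three planted outliers, for illustration only

out_b = flag_outliers(b, a)  # model A versus B
out_c = flag_outliers(c, a)  # model A versus C

both = np.flatnonzero(out_b & out_c)
only_one = np.flatnonzero(out_b ^ out_c)
print("flagged in both models:", both)
print("flagged in only one model:", only_one)
```

If most flagged points fall in the "both" set, the choice between B and C matters little for defining outliers.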
