Lack of fit - significant R square in two "almost ...

May 12, 2017 1:56 AM
(1057 views)

Hello.

I know a similar question was asked before, but please help me to apply this to my model:

1. I tested two regressions: A versus B and A versus C (similar observations, n=1242).

2. Model A versus B has a higher Rsquare and seems to have a better correlation - but a lack-of-fit p-value < 0.05.

3. Model A versus C has a lower Rsquare and seems to have a slightly weaker correlation - but a lack-of-fit p-value > 0.05.

4. The residual plot looks quite similar.

For the actual full result please see also my attached PDF.

My question: Do I have to disregard Model A versus B and conclude that A versus C is better and more appropriate for my analysis?
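To make the two statistics in the question concrete, here is a minimal sketch of how Rsquare and the classical pure-error lack-of-fit F statistic can be computed for a straight-line fit. It assumes the test relies on replicated x values to estimate pure error. The function names and the small data set are hypothetical and are not taken from the attached PDF. The sketch stops at the F statistic, since the p-value needs an F distribution that the Python standard library does not provide.

```python
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def lack_of_fit(x, y):
    """Return Rsquare and the pure-error lack-of-fit F statistic.

    Requires at least three distinct x values and at least one
    replicated x value, otherwise the degrees of freedom are zero.
    """
    n = len(x)
    intercept, slope = fit_line(x, y)
    ybar = sum(y) / n
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)

    # Pure error: variation of y around its own mean at each distinct x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pe = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        ss_pe += sum((v - mean) ** 2 for v in ys)

    df_lof = len(groups) - 2   # distinct x values minus 2 line parameters
    df_pe = n - len(groups)    # observations minus distinct x values
    f_lof = ((sse - ss_pe) / df_lof) / (ss_pe / df_pe)
    return {"r_square": 1 - sse / sst, "f_lof": f_lof,
            "df_lof": df_lof, "df_pe": df_pe}


# Hypothetical data: a curved relationship with tight replicates.
x = [1, 1, 2, 2, 3, 3]
y = [1.0, 1.2, 3.9, 4.1, 9.0, 9.2]
print(lack_of_fit(x, y))
```

In this made-up example the line explains most of the variation, yet the lack-of-fit F statistic is large because the replicates are very tight relative to the curvature. The two statistics answer different questions, so a high Rsquare and a significant lack of fit can occur together.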

2 REPLIES

May 12, 2017 2:17 AM
(1051 views)

Based on a brief glance at the models, my feeling is that you are worrying too much about the detailed statistics. The two models look almost identical to me (look at the bivariate plots, not just the stats) - clearly there is a very high level of correlation between B and C. Does your scientific understanding favour one model over another?

What stands out to me are the high leverage points at a value of about 70 (for both B and C - strange that they are both on the same scale - are they different measures of the same thing?). Anyhow, I would be concerned about the degree of leverage of those points and their overall influence on the regression.

-Dave
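As a rough illustration of the leverage concern raised above, the sketch below computes the leverage (hat value) of each point in a simple linear regression, using h_i = 1/n + (x_i - mean(x))^2 / Sxx. The cutoff of twice the average leverage is a common rule of thumb, not something stated in this thread, and the function names and x values are hypothetical.

```python
def leverages(x):
    """Hat values for a simple linear regression with an intercept."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1 / n + (xi - xbar) ** 2 / sxx for xi in x]


def high_leverage_points(x, factor=2.0):
    """Indices whose leverage exceeds factor * (p / n), with p = 2 parameters.

    The factor of 2 is an assumed rule-of-thumb threshold.
    """
    cutoff = factor * 2 / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]


# Hypothetical predictor values: most points are clustered, one sits near 70.
x = [10, 12, 14, 16, 18, 20, 70]
print(high_leverage_points(x))
```

Leverage depends only on the x values, so a point flagged here is not necessarily an outlier in y. It does mean the fitted line is pulled strongly towards that point, which is why refitting without such points is a useful check.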

May 12, 2017 2:37 AM
(1048 views)

All are physiological variables, and all measurements are done with the same method - hence I am trying to find the best predictor for A. B and C are dependent on each other.

I will explore the outliers separately later, but before this I wanted to settle on the best model to define outliers (>2 SD or >90th percentile) - although they may be the same in each model anyway (I haven't checked this yet).

Thanks a lot, Marc
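The ">2 SD" rule mentioned above can be sketched as follows: fit the line, compute the residuals, and flag observations whose residual exceeds k times the residual standard deviation. This is only one possible reading of the rule. It uses raw residuals rather than studentized ones, and the function name and data are hypothetical.

```python
import math


def flag_residual_outliers(x, y, k=2.0):
    """Indices of points whose |residual| exceeds k residual standard deviations."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation (root mean square error) with n - 2 df.
    sd = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return [i for i, r in enumerate(residuals) if abs(r) > k * sd]


# Hypothetical data: y = 2x everywhere except one displaced observation.
x = list(range(1, 11))
y = [2.0 * xi for xi in x]
y[4] = 20.0
print(flag_residual_outliers(x, y))
```

Running the same function on both models (A versus B and A versus C) and comparing the flagged indices is a direct way to check whether the outliers are the same in each model.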