Very High R-Square Value

Apr 19, 2017 3:29 PM
(1343 views)

Hi,

I am trying to calculate R-Square using stepwise regression on the attached data set.

So far I have cleaned the data by excluding the outliers and by removing these variables:

Garage yrblt, 1stflsf, totrmsabvgrd and Garagecars, as they have very high correlation.

I also recalculated Lot frontage as Updated Lot Frontage, because Lot frontage had a lot of missing values.

My problem is that the R-Square value in the stepwise regression (Forward, Minimum BIC) is 1.

What am I doing wrong?

2 REPLIES

Re: Very High R-Square Value

I tried to reproduce your results and I do not get R squared = 1. I get something like 0.87, so I'm not sure what you are doing differently. Also, there are a number of things in your description that I would not recommend: too many variables are trying to measure the same sort of thing, and using 80-something variables when you have 1400 data points seems like a poor idea to me. I also don't recommend deleting outliers; you might try log transformations instead.
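To illustrate the log-transformation suggestion outside JMP, here is a minimal Python sketch. It uses synthetic right-skewed data (an assumption, not the poster's data set), and the `skewness` helper is hypothetical. It only shows that taking logs pulls in a long right tail without deleting any rows.

```python
import numpy as np

def skewness(x):
    """Sample skewness: the mean cubed z-score (population form)."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

rng = np.random.default_rng(0)

# Synthetic, right-skewed "price-like" values (assumed lognormal for illustration).
raw = rng.lognormal(mean=12.0, sigma=0.8, size=1000)

# Log transform: large values are compressed instead of being removed as outliers.
logged = np.log(raw)

skew_raw = skewness(raw)
skew_log = skewness(logged)

print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_log:.2f}")
```

The transformed variable keeps every observation, which is the point of preferring a transformation over exclusion.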

Re: Very High R-Square Value

I can only think of two reasons you would get R squared = 1. You might have accidentally included the house ID as a nominal factor (although it looks continuous in your data set), but then you should have received a bunch of warnings in the regression results. Or you could have accidentally included the sale price as a factor (by shift-clicking when you added variables). Either could produce that result, but I can't think of anything else.
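Both causes can be reproduced outside JMP. The following is a minimal Python sketch on synthetic data (the variables `area` and `price` and the `r_squared` helper are assumptions for illustration, not the poster's data set or JMP's Stepwise platform). It fits ordinary least squares three ways: with a legitimate predictor, with a one-level-per-row ID coded as dummy variables, and with the response leaked into the predictors.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (an intercept is added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 50

# Synthetic predictor and noisy response (hypothetical stand-ins for house data).
area = rng.normal(1500, 300, n)
price = 100 * area + rng.normal(0, 40000, n)

# 1) Legitimate model: R^2 is well below 1 because of the noise.
r2_honest = r_squared(area.reshape(-1, 1), price)

# 2) Row ID treated as nominal: one dummy column per row (first level dropped).
#    With the intercept that is n parameters for n rows, so the fit is exact.
id_dummies = np.eye(n)[:, 1:]
r2_id = r_squared(id_dummies, price)

# 3) Response accidentally included among the predictors: also an exact fit.
r2_leak = r_squared(np.column_stack([area, price]), price)

print(f"area only:          R^2 = {r2_honest:.3f}")
print(f"ID as nominal:      R^2 = {r2_id:.6f}")
print(f"price as predictor: R^2 = {r2_leak:.6f}")
```

In both problem cases the model has enough information to reproduce the response exactly, so the residual sum of squares is zero up to floating-point error.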