peerr12
Level I

normalcy test for residuals

I am plotting Y vs X (see attachment). The regression line appears to be quite good: R-squared > 0.99.

However, when I make a residual plot, I get an inverted U.

Some people say that such a residual plot indicates that the regression line is not good enough.

 

My questions are:

 

1. Is there some way to improve the residual plot and the linear regression (maybe by changing the intercept)?

 

2. I performed a test of normality on the residuals. The test confirms that the residuals are normally distributed. Is this somehow evidence that the linear regression line is indeed good?

From what I read in the literature, it appears that if the test of normality of the residuals passes at the 95% confidence level, then the regression line is good.

 

3. Could someone provide short feedback on how to interpret the "residual normal quantile plot"?

 

 

I would be obliged if someone could provide feedback.

 Regards

 

8 REPLIES
txnelson
Super User

Re: normalcy test for residuals

The residual plot you are showing suggests to me that you need to do a polynomial regression.

Jim
peerr12
Level I

Re: normalcy test for residuals

Does this mean that the linear regression is not valid?

In fact, I am expected to show linearity. So does that mean that the process is actually not linear at all?

 

Is there another way to check this ? 

 

txnelson
Super User

Re: normalcy test for residuals

Your linear regression is significant, and its R2 is excellent, but a review of the scatterplot hints that there is a curvature to the data. And when a polynomial regression was run, the R2 was higher and the RMSE was smaller, indicating that the polynomial is a better predictor than the linear regression. So I would personally go for the polynomial.
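To illustrate the comparison outside of JMP, here is a minimal Python sketch. The `conc` and `area` values are made up (a slightly concave curve plus small deterministic noise), not the poster's data, and the helper names `polyfit`, `predict`, and `fit_summary` are hypothetical. It shows how a straight line can reach R2 > 0.99 while its residuals still trace an inverted U, and how adding a squared term shrinks the RMSE.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] via normal equations."""
    n = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def predict(coefs, x):
    return sum(b * x ** k for k, b in enumerate(coefs))

def fit_summary(xs, ys, degree):
    """Return (R2, RMSE, residuals) for a polynomial fit of the given degree."""
    coefs = polyfit(xs, ys, degree)
    resid = [y - predict(coefs, x) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in resid)
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    # RMSE uses residual degrees of freedom: n minus number of coefficients.
    rmse = math.sqrt(sse / (len(xs) - len(coefs)))
    return 1 - sse / sst, rmse, resid

# Hypothetical calibration data: mild curvature plus small noise (assumption).
conc = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2]
area = [100 * c - 1.5 * c ** 2 + e for c, e in zip(conc, noise)]

r2_lin, rmse_lin, resid_lin = fit_summary(conc, area, 1)
r2_quad, rmse_quad, resid_quad = fit_summary(conc, area, 2)

print(f"linear:    R2={r2_lin:.5f}  RMSE={rmse_lin:.3f}")
print(f"quadratic: R2={r2_quad:.5f}  RMSE={rmse_quad:.3f}")
print("linear residuals:", [round(r, 2) for r in resid_lin])
```

With this made-up data the linear residuals are negative at both ends and positive in the middle, which is the inverted U, even though the linear R2 is very high.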

Jim
peerr12
Level I

Re: normalcy test for residuals

Dear Friend Jim, 

Thanks for your kind feedback. The fact is that I am obliged to confirm that the process is linear. Now the question is: if I use a polynomial to fit a line, can I still call this a linear process, or is it then not a linear process?

 

Thanks again for your kind help. 

txnelson
Super User

Re: normalcy test for residuals

Given the data that you have, it suggests that the process is curvilinear, not linear.

Jim
peerr12
Level I

Re: normalcy test for residuals

Is there some way to change the regression line so that the residual plot is optimised?

txnelson
Super User

Re: normalcy test for residuals

In general terms, the better your regression, the smaller your residuals will be, so they become more optimal. When the model you are dealing with is a multiple linear regression, one can use a stepwise process to find the best regression with the smallest number of predictors. But in your case you can only deal with polynomial regressions. You could create X*X, X*X*X, and even X*X*X*X columns, and then do a stepwise regression to see if you have a significant higher-order polynomial, but with only a fixed number of X levels, I don't believe that will be beneficial.
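As a rough stand-in for that stepwise idea, the sketch below adds one polynomial term at a time and computes the partial F statistic for each added term. It is not JMP's Stepwise platform; the data are made up, and `fit_sse` and `partial_f` are hypothetical helper names. A large F for the squared term and a small F for the cubic term would suggest stopping at second order.

```python
def fit_sse(xs, ys, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    n = degree + 1
    mean_x = sum(xs) / len(xs)
    zs = [x - mean_x for x in xs]  # centering X improves numerical conditioning
    # Augmented normal-equation matrix [Z'Z | Z'y].
    a = [[sum(z ** (i + j) for z in zs) for j in range(n)]
         + [sum(y * z ** i for z, y in zip(zs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    coefs = [a[i][n] / a[i][i] for i in range(n)]
    return sum((y - sum(b * z ** k for k, b in enumerate(coefs))) ** 2
               for z, y in zip(zs, ys))

def partial_f(xs, ys, degree):
    """F statistic (1 numerator df) for adding the term of order `degree`."""
    sse_small = fit_sse(xs, ys, degree - 1)
    sse_big = fit_sse(xs, ys, degree)
    df_resid = len(xs) - (degree + 1)
    return (sse_small - sse_big) / (sse_big / df_resid)

# Hypothetical data: quadratic trend plus small noise (assumption).
conc = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2]
area = [100 * c - 1.5 * c ** 2 + e for c, e in zip(conc, noise)]

f2 = partial_f(conc, area, 2)  # does X*X improve on the straight line?
f3 = partial_f(conc, area, 3)  # does X*X*X improve on the quadratic?
print(f"F for X^2 term: {f2:.1f}")
print(f"F for X^3 term: {f3:.2f}")
```

Each F value would be compared against an F distribution with 1 and `df_resid` degrees of freedom; the sketch stops at the statistic to stay within the standard library.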

Jim

Re: normalcy test for residuals

I concur with Jim. The regression analysis clearly shows lack of fit with the first-order model. The second-order model is statistically significant. The Area response to change in Conc is not linear. The pattern in the residuals from the first-order model suggests two linear sections as if there were a break instead of a curve, although the second-order model seems to account for it well.

 

First thought: why is this relationship supposed to be linear? (Process or system requirement? Theoretical prediction?) What is the tolerance for non-linearity?

 

Second thought: the replicates exhibit very small variation, so the LOF test is always significant, even with a fourth-order model in which the 3rd and 4th order terms are not significant. Are they true replicates? That is, did you prepare three samples each time or merely measure the same sample three times?
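To make the point about replicates concrete, here is a small Python sketch of the standard lack-of-fit F test for a straight-line fit. The data are invented (five levels, three replicates each), and `lack_of_fit_f` is a hypothetical helper, not a JMP function. The same mean curve is tested twice: once with very tight replicates, as when one sample is measured three times, and once with a wider spread, as independently prepared samples might show.

```python
from collections import defaultdict

def lack_of_fit_f(xs, ys):
    """Lack-of-fit F statistic for a straight-line fit with replicated X levels."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    # Pure error: variation of replicates around their own level mean.
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    ss_pe = sum(sum((y - sum(g) / len(g)) ** 2 for y in g)
                for g in groups.values())

    ss_lof = sse - ss_pe            # what the line fails to explain
    df_lof = len(groups) - 2        # levels minus line parameters
    df_pe = n - len(groups)         # observations minus levels
    return (ss_lof / df_lof) / (ss_pe / df_pe)

def make_data(spread):
    """Hypothetical data: 5 levels x 3 replicates at -spread, 0, +spread."""
    xs, ys = [], []
    for c in [1, 2, 3, 4, 5]:
        for offset in (-spread, 0.0, spread):
            xs.append(c)
            ys.append(100 * c - 1.5 * c ** 2 + offset)
    return xs, ys

f_tight = lack_of_fit_f(*make_data(0.1))   # near-identical replicates
f_loose = lack_of_fit_f(*make_data(10.0))  # noisier replicates
print(f"LOF F, tight replicates: {f_tight:.1f}")
print(f"LOF F, loose replicates: {f_loose:.3f}")
```

The curvature is identical in both cases; only the pure-error denominator changes, which is why replicates with almost no variation make nearly any departure from the model look significant.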

 

Third thought: how were the Area and Conc values determined or measured?

 

Fourth thought: this study appears to me to be a calibration or standard curve. As such, you might find the linearity study in the Variability Chart useful as well as the Measurement System Analysis. See Help > Books > Quality and Process for more information about these platforms and their methods.