Amir_H
Level III

Significant Lack of Fit

Hi all, I appreciate your time and responses. I have built a model using main effects, interactions, and in some cases second-order terms. I fit it with Standard Least Squares on 60% of the data; about 20% is held out for validation and 20% for testing.

R2 and adjusted R2 for the training set, and the R2 values for the validation and test sets, are all about 98-99%. The VIFs are all between 1 and 2.

The only issue is I have a significant lack of fit.

What can I do here? Can I ignore the LoF?

 

PS: I noticed that when I take a model from GenReg (pruned, Lasso, etc.) and fit its terms in Standard Least Squares, I get almost the same model, again with a significant lack of fit. Does that mean lack of fit is not taken into account in GenReg?

4 REPLIES
P_Bartell
Level VIII

Re: Significant Lack of Fit

Significant LOF can come about from a variety of sources. A few questions:

 

1. Do your residual plots give any clues as to the source of the LOF? Do the plots show any non-random structure or suspicious patterns?

2. What is the purpose of the model: explanation or prediction? If it's explanatory, LOF is probably not as problematic. If it's predictive, how much does the LOF actually hurt the model's usefulness for making predictions across the factor space? In other words, how large is the 'wrongness' of the predictions in practical terms? If it isn't practically impactful, I wouldn't be as worried about living with a model that has significant LOF.

 

Amir_H
Level III

Re: Significant Lack of Fit

Thanks for the quick reply @P_Bartell. Please see the attached picture for residuals. I don't think they look problematic?!

The model is going to be used for optimization now and for prediction in the near future. The wrongness of the predictions is not going to be too impactful.

 

Amir_H_0-1618331766770.png

 

P_Bartell
Level VIII

Re: Significant Lack of Fit

How do the residual vs. predictor variable plots look? These can hint at missing terms or effects in the model. And how does the residual vs. run order plot look (if you ran a designed experiment)? It can indicate non-constant variance or a lurking variable that crept in while the experiment was being executed, either of which could also lead to LOF.
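To illustrate the 'missing term' idea, here is a minimal sketch in plain Python (not JMP output). Everything in it is made up for illustration: the data, the single predictor `x`, and the helper names `fit_line` and `pearson`. It fits a straight line to data that really has curvature, then checks how strongly the residuals track a squared term that was left out of the model.

```python
def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (one predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

# Hypothetical data: 5 levels of x, 3 replicates each, true response is quadratic.
x, y = [], []
for level in [1, 2, 3, 4, 5]:
    for offset in (-0.1, 0.0, 0.1):  # stand-in for replicate noise
        x.append(level)
        y.append(2.0 + 3.0 * level + 0.5 * level ** 2 + offset)

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Candidate missing term: the centered square of the predictor.
x_bar = mean(x)
curvature = [(xi - x_bar) ** 2 for xi in x]

r_linear = pearson(residuals, x)             # essentially 0 for any least squares fit
r_curvature = pearson(residuals, curvature)  # close to 1 here: a squared term is missing

print(f"residuals vs x:       r = {r_linear:.3f}")
print(f"residuals vs (x-x̄)^2: r = {r_curvature:.3f}")
```

A residual vs. predictor plot shows the same thing visually: a U-shaped or arched pattern against a factor suggests that factor's second-order term is missing.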

 

From the looks of your Predicted vs. Residual plot, methinks the primary cause of the LOF is the relatively small variance among most of the replicate points. That makes the Pure Error term really small compared to the LOF error term in the LOF ANOVA, which makes a 'significant LOF' p-value more likely.
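The arithmetic behind that can be sketched in plain Python. This is not what JMP runs internally, just the textbook lack-of-fit F ratio, F = (SS_LOF / df_LOF) / (SS_PE / df_PE), applied to made-up data; `fit_line`, `lack_of_fit_f`, and `demo` are hypothetical helper names. The same slightly misspecified straight-line model is fit twice, once with noisy replicates and once with very tight replicates.

```python
from collections import defaultdict

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (one predictor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def lack_of_fit_f(x, y, y_hat, n_params):
    """Lack-of-fit F ratio; replicates are runs sharing the same x setting."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    # Pure error: scatter of replicates around their own group mean.
    ss_pe = sum(
        sum((yi - sum(g) / len(g)) ** 2 for yi in g) for g in groups.values()
    )
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_lof = ss_res - ss_pe              # what the model fails to explain
    df_pe = len(y) - len(groups)         # n minus number of distinct settings
    df_lof = len(groups) - n_params      # distinct settings minus model terms
    return (ss_lof / df_lof) / (ss_pe / df_pe)

def demo(spread):
    """Fit a line to mildly curved data whose replicates differ by +/- spread."""
    x, y = [], []
    for level in [1, 2, 3, 4, 5]:
        for offset in (-spread, 0.0, spread):
            x.append(level)
            y.append(2.0 + 3.0 * level + 0.05 * level ** 2 + offset)
    b0, b1 = fit_line(x, y)
    y_hat = [b0 + b1 * xi for xi in x]
    return lack_of_fit_f(x, y, y_hat, n_params=2)

f_noisy = demo(spread=1.0)    # large pure error
f_tight = demo(spread=0.01)   # very small pure error, same model misfit

print(f"F with noisy replicates: {f_noisy:.3f}")
print(f"F with tight replicates: {f_tight:.1f}")
```

The misfit of the line is identical in both runs; only the replicate scatter changes. Shrinking the spread by a factor of 100 shrinks the pure error mean square by a factor of 10,000 and inflates the F ratio by the same amount, so a practically negligible misfit can still come out as 'significant'.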

 

As you state, the magnitude of the difference between the actual and predicted values is not practically significant, so if the model can still be used for its intended purpose...oh well, you have LOF. Who cares?

Amir_H
Level III

Re: Significant Lack of Fit

Thanks again for the kind answer @P_Bartell. I agree about the relatively small variance for most of the replicate points.

To answer your questions, I have included another screenshot.

 

Amir_H_0-1618346002391.png

Amir_H_1-1618346021081.png

Amir_H_2-1618346051279.png

Amir_H_3-1618346073911.png

Amir_H_4-1618346091288.png

Amir_H_5-1618346114731.png

Amir_H_9-1618346322260.png

Amir_H_6-1618346132758.png

Amir_H_7-1618346158766.png

 

Amir_H_8-1618346195110.png