Diagnostics with Studentized Residuals
Hi all,
I am trying to identify the outliers in a REML mixed model that includes two nominal fixed factors and one random factor. As shown in the screenshots below, the parity plot, the residual by row plot, and the residual by predicted plot all clearly flag six observations as strong outliers (black points). However, these outliers are not flagged in the studentized residual plot, where their residuals are comparable to those of the other observations (grey points) and fall within the +/- 3 limits. In addition, after removing three of the outliers and refitting the model, everything looks fine, and the other three observations initially identified as outliers are now well predicted by the model. Any input on what the root causes might be? Thank you.
Refit the model after removing three of the outliers
Accepted Solutions
Re: Diagnostics with Studentized Residuals
First, welcome to the community.
My thoughts:
There are a number of statistics used to evaluate a given model and its adequacy. Plots of residuals (including studentized residuals) can be very helpful in identifying outliers, but no single plot is best in all circumstances. When a plot flags potentially unusual points, remember that it is not necessarily the actual data that is unusual; rather, the model did a poor job of predicting that data point. This is an indicator that the model may need to be re-evaluated (and, perhaps more importantly, you may gain a better understanding of the true mechanisms and causal relationships at work).
Also remember that the model and all statistics associated with its evaluation (RMSE, p-values, the delta between R-square and adjusted R-square, etc.) are ALL CONDITIONAL. Change what is in the model, what estimates the MSE, the inference space, etc., and the model adequacy can and will change. That is why, when you removed data, a new model was created and the residual plots changed.
If outliers are identified, I always consider practical significance first. Beyond that, it is possible that the terms in the model do not adequately predict the actual values. This is often the result of noise in the system, possibly inconsistent noise. When plotting the residuals by row, always make sure the data is first sorted in run order. This may offer clues as to when the model has issues.
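
To make the "conditional" point concrete, here is a minimal Python sketch on synthetic data. It is not the original poster's data and not a REML fit: it assumes a plain least-squares model with one four-level factor, and the helper `studentized_residuals` is a hypothetical name written for this illustration. Three shifted observations in one group drag that group's estimate toward them and inflate the RMSE, so all six points in the group show large raw residuals while their studentized residuals (raw residual divided by RMSE times sqrt(1 - leverage)) can stay modest. Refitting without the three shifted rows changes both the group estimate and the RMSE.

```python
import numpy as np

def studentized_residuals(X, y):
    """Raw residuals, internally studentized residuals, and RMSE for a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage = diagonal of the hat matrix X @ pinv(X)
    leverage = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    rmse = np.sqrt(resid @ resid / dof)
    return resid, resid / (rmse * np.sqrt(1.0 - leverage)), rmse

# Synthetic data (an assumption for illustration): 4 groups of 6 runs, small noise
rng = np.random.default_rng(0)
group = np.repeat(np.arange(4), 6)
X = np.eye(4)[group]  # one indicator column per group
y = np.array([10.0, 12.0, 14.0, 16.0])[group] + rng.normal(scale=0.1, size=24)
y[:3] += 5.0  # shift three of the six runs in the first group

raw_full, stud_full, rmse_full = studentized_residuals(X, y)

# Refit after dropping the three shifted rows
keep = np.ones(24, dtype=bool)
keep[:3] = False
raw_refit, stud_refit, rmse_refit = studentized_residuals(X[keep], y[keep])

print("full fit, raw residuals in group 0:  ", np.round(raw_full[:6], 2))
print("full fit, studentized in group 0:    ", np.round(stud_full[:6], 2))
print("refit, raw residuals left in group 0:", np.round(raw_refit[:3], 2))
print("RMSE full vs. refit:", round(rmse_full, 3), round(rmse_refit, 3))
```

Whether the same mechanism explains the plots in the question depends on the actual data and on how the random factor is estimated, so treat this only as one possibility worth checking.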