Hello all. I have a very high value in a univariate dataset and wanted to test whether it is an outlier. I have downloaded and used the Grubbs outlier test script. However, the Grubbs outlier test assumes that the data come from a normal distribution -- and it is unclear to me from the references I have read whether this refers to the raw data or to the residuals. It has been suggested that what looks like an outlier could simply arise because the data are log-normally distributed.
If anyone has used this test before, could you tell me whether I need to test the distribution of my raw data or of my dataset's residuals? I'm assuming that in either case I should test the distribution with the suspect value included. Thanks!
The script adds the result of Grubbs' test to a Distribution platform and opens the normal quantile plot so that you can assess the normality of the sample and, from that, infer whether the population is normally distributed. If the data are otherwise normally distributed but contain a discordant outlier, the sample might fail a normality test, but you should still see linearity in the plot for the remaining points. Non-normal data should not appear linear in this plot, with or without the outlier. The normal quantile plot should make it clear which case you have.
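For anyone who wants to try the same linearity check outside the script, here is a rough Python sketch. It is not what the script does internally: it just measures how straight a normal quantile plot would be by correlating the sorted data with theoretical normal quantiles. The function name, the plotting positions (i - 0.5)/n, and the sample values are all made up for illustration.

```python
from math import log, sqrt
from statistics import NormalDist, mean


def normal_quantile_correlation(values):
    """Correlation between sorted data and theoretical normal quantiles.

    Values near 1 mean the points would fall close to a straight line
    in a normal quantile plot.
    """
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i - 0.5) / n are one common convention,
    # chosen here as an assumption; other conventions exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mq = mean(ordered), mean(quantiles)
    sxy = sum((x - mx) * (q - mq) for x, q in zip(ordered, quantiles))
    sxx = sum((x - mx) ** 2 for x in ordered)
    sqq = sum((q - mq) ** 2 for q in quantiles)
    return sxy / sqrt(sxx * sqq)


# Made-up illustrative data with one very high value (not from the thread).
sample = [1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 2.8, 3.1, 3.9, 12.0]

# Straightness with the suspect value included.
r_raw = normal_quantile_correlation(sample)
# Straightness of the remaining points once the largest value is set aside.
r_without_max = normal_quantile_correlation(sorted(sample)[:-1])
# Straightness after a log transform, to probe the log-normal suggestion.
r_log = normal_quantile_correlation([log(x) for x in sample])

print(f"raw: {r_raw:.3f}  without max: {r_without_max:.3f}  log: {r_log:.3f}")
```

If the remaining points are close to linear on their own, that points toward a single discordant value in otherwise normal data. If the log-transformed data are noticeably straighter than the raw data, the log-normal explanation deserves a look. The correlation is only a summary of the plot, not a formal test, so it is still worth looking at the plot itself.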
If this example is univariate, then where do 'residuals' come from?