
Grubbs' Outlier Test (Version 2)

(This script is a new version that provides By group processing. Finally! Note that the p-value reported in the first version is no longer available.)

This script adds the two-tailed outlier test by Grubbs to the Distribution platform. The normal quantile plot and Goodness of Fit test are opened to help assess the assumption that the sample was drawn from a normal population.

Simply open the data table with the numeric variable to be evaluated, then open and run the script. Select the data column and click Y, Data. Specify the desired level of significance (alpha is 0.05 by default). This example is based on the variable height in the Big Class data table in the Sample Data folder.

9210_Capture 1.jpg

Click OK.

9211_Capture 2.jpg

The markers in the normal quantile plot follow a roughly linear pattern, and none of them falls outside the region bounded by the dotted red curves. The sample is therefore judged to be consistent with a normal population in this case. The test at the bottom of the platform is not significant at the specified level.

Now the same analysis is performed using a By variable. This example uses sex as the grouping variable.

9212_Capture 3.jpg

Click OK.

9213_Capture 4.jpg


Hi Mark

Thanks for posting this.

It would be great if it could be done BY a grouping variable.

BR, Marianne

If the data set happens to have any missing values, this code incorrectly calculates N for the Grubbs test, because N Rows counts the rows with missing values as well. This can be problematic if the number of missing values is rather large.

See the following correction for lines 76 and 110 of the script:

n = N Rows( yVal ) - N Missing( yVal );
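
To see why the count matters, here is a minimal JSL sketch of a two-sided Grubbs calculation on a vector with one missing value. The data values and the names yBar, s, G, tq and gCrit are made up for illustration and are not taken from the script. The sketch also assumes that Mean, Std Dev and Max skip missing values in a matrix argument, and it uses the usual textbook critical value based on the t distribution, which may differ in detail from what the script computes.

```jsl
// Hypothetical data; the missing value (.) is deliberate
yVal = [59, 61, 55, 66, 52, ., 60, 61, 51, 70];
alpha = 0.05;

// Count only the non-missing observations (the correction above)
n = N Rows( yVal ) - N Missing( yVal );

// Sample mean and standard deviation (assumed to ignore the missing value)
yBar = Mean( yVal );
s = Std Dev( yVal );

// Two-sided Grubbs statistic: largest absolute deviation in units of s
G = Max( Abs( yVal - yBar ) ) / s;

// Textbook two-sided critical value; n enters directly and through the t quantile
tq = t Quantile( 1 - alpha / (2 * n), n - 2 );
gCrit = ((n - 1) / Sqrt( n )) * Sqrt( tq ^ 2 / (n - 2 + tq ^ 2) );

Show( n, G, gCrit );
```

With N Rows alone, n would be 10 here instead of 9, which changes both the degrees of freedom and the critical value that G is compared against.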

The script breaks if the By variable is numeric. A simple way to fix this is to change bCol to the character data type after the line

dt = Current Data Table();
If( N Items( bCol ),
	bCol[1] << Set Data Type( Character )
);


/* You could also fix the script to work with numeric By variables without changing them, but this seemed simpler, and you can always change the column back to numeric at the end. */
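
If you want to change the column back afterwards, something along these lines should work. This is a sketch only: it uses age in the Big Class sample table as a stand-in numeric By column, and the name byCol is mine, not the script's.

```jsl
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
byCol = Column( dt, "age" );   // stand-in for bCol[1] in the script

// Temporarily make the By column character so the grouping works
byCol << Set Data Type( Character );

// ... run the Grubbs script with this column as the By variable ...

// Restore the column afterwards
byCol << Set Data Type( Numeric );
byCol << Set Modeling Type( "Ordinal" );  // age is ordinal in Big Class; adjust for your column
```

The modeling type is reset explicitly because converting to character and back does not necessarily bring the original modeling type with it.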




