NeF
Level III

Outliers more than 3 standard deviations in JMP

Hi, 

Is there any way to detect outliers more than 3 standard deviations from the mean in JMP?

I should say that I am familiar with the "Explore Outliers" option as well as the "Levey Jennings" control chart, but is there another way to detect outliers beyond 3 SDs?

thanks, 

Ne

6 REPLIES
Thomas1
Level V

Re: Outliers more than 3 standard deviations in JMP

 

 


 

This can be solved by creating a new formula column with the formula:

 

If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
    1,
    0
)

 

Here :Data stands for your source column.
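
Note that the formula above flags only values more than 3 SDs above the mean. If low outliers matter as well, a two-sided variant might look like the sketch below. It assumes the same placeholder column name :Data and is only one possible way to write it.

```jsl
// Two-sided flag: 1 if the value lies more than 3 SDs from the column mean
// in either direction, 0 otherwise. :Data is a placeholder column name.
If( Abs( :Data - Col Mean( :Data ) ) > 3 * Col Std Dev( :Data ),
    1,
    0
)
```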

 

In case you want the values > 3 SDs themselves, the formula is:

 

If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
    :Data,
    "."
)


 

NeF
Level III

Re: Outliers more than 3 standard deviations in JMP

Dear Thomas1, 

Thank you for your kind reply!

Ne

Thomas1
Level V

Re: Outliers more than 3 standard deviations in JMP

My second formula, which should show the values > 3 SDs, contains an error and therefore doesn't work. You have to replace the quoted "." with a bare . (JMP's missing numeric value).

 

So the correct formula, in order to get the values, is:

 

If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
    :Data,
    .
)

 

NeF
Level III

Re: Outliers more than 3 standard deviations in JMP

hi, 

thanks a lot. I've noticed that while using it yesterday.

best, 

Ne

gzmorgan0
Super User

Re: Outliers more than 3 standard deviations in JMP

Just a simple addendum to @Thomas1's response. If you have severe outliers, they can inflate the Col Std Dev considerably and can also bias the mean.

 

An alternative is to use median + k * pseudo sigma for the upper screening limit and median - k * pseudo sigma for the lower screening limit.

 

Here is the column formula for a column named weight. It uses quantiles 0.75 and 0.25 to compute the pseudo sigma (ps) with 5 as the multiplier. For raw data, quantiles 0.85 and 0.15 are sometimes used instead, with 6 as the multiplier for ps.

 

 

Local( {ps},
	ps = (Col Quantile( :weight, 0.75 ) - Col Quantile( :weight, 0.25 )) / 1.349;
	If(
		:weight > Col Quantile( :weight, 0.5 ) + 5 * ps, 1,
		:weight < Col Quantile( :weight, 0.5 ) - 5 * ps, -1,
		0
	);
)
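
The divisor 1.349 in the formula above is not explained in the post. It presumably comes from the normal distribution, where the interquartile range equals about 1.349 standard deviations, so dividing the IQR by 1.349 gives an estimate of sigma that outliers barely affect. A short derivation, assuming normally distributed data (the symbols Q, z, and sigma-hat are notation introduced here, not from the post):

```latex
% Quartiles of a normal distribution N(\mu, \sigma^2):
%   Q_{0.75} = \mu + z_{0.75}\,\sigma, \quad Q_{0.25} = \mu - z_{0.75}\,\sigma,
%   with z_{0.75} = \Phi^{-1}(0.75) \approx 0.6745
\begin{aligned}
Q_{0.75} - Q_{0.25} &= 2\, z_{0.75}\, \sigma \approx 1.349\, \sigma \\
\hat{\sigma}_{\text{pseudo}} &= \frac{Q_{0.75} - Q_{0.25}}{1.349}
\end{aligned}
% If quantiles 0.85 and 0.15 are used instead, the same argument gives a
% divisor of 2\, z_{0.85} \approx 2 \times 1.0364 \approx 2.073.
```

So if the 0.85 and 0.15 quantiles are substituted into the formula, the divisor would need to change accordingly for ps to remain an estimate of sigma.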

If you are scripting, the Distribution platform computes a Robust Mean and Robust Std Dev.  Here is a simple script to get these values. 

Names Default To Here(1);

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

dist = dt << Distribution(
	Continuous Distribution(
		Column( :height ),
		Quantiles( 0 ),
		Horizontal Layout( 1 ),
		Histogram( 0 ),
		Vertical( 0 ),
		Outlier Box Plot( 0 ),
		Customize Summary Statistics(
			Trimmed Mean( 1 ),
			Robust Mean( 1 ),
			Robust Std Dev( 1 ),
			Set Alpha Level( 0.05 )
		)
	)
);

snames = report(dist)["Summary Statistics"][TableBox(1)][StringColBox(1)] << get;
svalues= report(dist)["Summary Statistics"][TableBox(1)][NumberColBox(1)] << get;
stats = Associative Array(snames, svalues);  //create a keyed list

r_xb = stats["Robust Mean"];
r_sd = stats["Robust Standard Deviation"];

show(r_xb, r_sd);

//now use r_xb + <4|5|6> * r_sd  and  r_xb - <4|5|6> * r_sd for screening limits
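
The last comment can be turned into a row selection. The lines below are a minimal sketch, assuming they are appended to the script above so that dt, r_xb and r_sd are still defined. The names k, upper and lower are introduced here for illustration, and the multiplier 4 is just one of the suggested choices.

```jsl
// Sketch only: assumes dt, r_xb and r_sd are still defined from the script above.
k = 4; // illustrative multiplier; 4, 5 or 6 as suggested above
upper = r_xb + k * r_sd;
lower = r_xb - k * r_sd;

// Select the rows whose height lies outside the screening limits
dt << Select Where( :height > upper | :height < lower );

// Report how many rows were selected
Show( N Rows( dt << Get Selected Rows ) );
```

The selected rows can then be inspected, excluded, or labeled from the Rows menu as usual.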

 

 

 

NeF
Level III

Re: Outliers more than 3 standard deviations in JMP

Hi gzmorgan0,
It's really helpful!
Thanks,
Nehai