Outliers more than 3 standard deviations in JMP
Hi,
Is there any way to detect outliers more than 3 standard deviations from the mean in JMP?
I should say that I am familiar with the "Explore Outliers" option as well as the "Levey Jennings" control chart, but is there another way to detect outliers beyond 3 SDs?
thanks,
Ne
Re: Outliers more than 3 standard deviations in JMP
This can be solved by creating a new formula column with the formula:
If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
1,
0
)
Here :Data is your source column; the new column is 1 for values more than 3 SDs above the mean and 0 otherwise.
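The formula above only flags values above the mean. A minimal sketch of a two-sided variant, assuming the same source column :Data and that values more than 3 SDs below the mean should be flagged as well:

```jsl
// Column formula: 1 if the value lies more than 3 SDs from the mean
// in either direction, 0 otherwise. :Data is the assumed source column.
If( Abs( :Data - Col Mean( :Data ) ) > 3 * Col Std Dev( :Data ),
	1,
	0
)
```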
In case you want the values > 3 SDs, the formula is:
If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
:Data,
"."
)
Re: Outliers more than 3 standard deviations in JMP
Dear Thomas1,
Thank you for your kind reply!
Ne
Re: Outliers more than 3 standard deviations in JMP
My second formula, which should show the values > 3 SDs, contains an error and therefore doesn't work. You have to replace "." with . (an unquoted period, which is the numeric missing value in JSL).
So the correct formula, in order to get the values, is:
If( :Data > Col Mean( :Data ) + Col Std Dev( :Data ) * 3,
:Data,
.
)
Re: Outliers more than 3 standard deviations in JMP
hi,
thanks a lot. I noticed that while using it yesterday.
best,
Ne
Re: Outliers more than 3 standard deviations in JMP
Just a simple addendum to @Thomas1's response. If you have severe outliers, the Col Std Dev can be pretty large and the mean can be biased.
An alternative is to use median + k * pseudo sigma for the upper screening limit and median - k * pseudo sigma for the lower screening limit.
Here is the column formula for a column named weight. It uses quantiles 0.75 and 0.25 with k = 5; dividing the interquartile range by 1.349 gives a pseudo sigma that matches the standard deviation for normally distributed data. For raw data, quantiles 0.85 and 0.15 are sometimes used to compute the pseudo sigma, with 6 as the multiplier for ps.
Local( {ps},
	ps = (Col Quantile( :weight, 0.75 ) - Col Quantile( :weight, 0.25 )) / 1.349;
	If(
		:weight > Col Quantile( :weight, 0.5 ) + 5 * ps, 1,
		:weight < Col Quantile( :weight, 0.5 ) - 5 * ps, -1,
		0
	);
)
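To act on these limits from a script rather than a formula column, here is a minimal sketch. It assumes the Big Class sample table and its :weight column, and uses k = 5 as in the formula above; the variable names med, ps, k and outliers are only illustrative.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Pseudo sigma from the interquartile range, as in the column formula
ps = (Col Quantile( :weight, 0.75 ) - Col Quantile( :weight, 0.25 )) / 1.349;
med = Col Quantile( :weight, 0.5 );
k = 5; // multiplier for the screening limits

// Select the rows outside median +/- k * pseudo sigma
dt << Select Where( :weight > med + k * ps | :weight < med - k * ps );
outliers = dt << Get Selected Rows;
Show( med, ps, N Rows( outliers ) );
```

Computing the quantiles once, outside Select Where, avoids re-evaluating the column statistics for every row.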
If you are scripting, the Distribution platform computes a Robust Mean and Robust Std Dev. Here is a simple script to get these values.
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dist = dt << Distribution(
	Continuous Distribution(
		Column( :height ),
		Quantiles( 0 ),
		Horizontal Layout( 1 ),
		Histogram( 0 ),
		Vertical( 0 ),
		Outlier Box Plot( 0 ),
		Customize Summary Statistics(
			Trimmed Mean( 1 ),
			Robust Mean( 1 ),
			Robust Std Dev( 1 ),
			Set Alpha Level( 0.05 )
		)
	)
);
// read the statistic names and values from the Summary Statistics table
snames = Report( dist )["Summary Statistics"][TableBox( 1 )][StringColBox( 1 )] << get;
svalues = Report( dist )["Summary Statistics"][TableBox( 1 )][NumberColBox( 1 )] << get;
stats = Associative Array( snames, svalues ); // create a keyed list
r_xb = stats["Robust Mean"];
r_sd = stats["Robust Standard Deviation"];
Show( r_xb, r_sd );
// now use r_xb + <4|5|6> * r_sd and r_xb - <4|5|6> * r_sd for screening limits
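The closing comment can be turned into a few lines appended to the script above. This is only a sketch: it assumes dt, r_xb and r_sd are still defined from that script, and k = 4 is just one of the suggested multipliers. The names k, upper and lower are illustrative.

```jsl
// Continuation of the script above: dt, r_xb and r_sd must already exist
k = 4; // one of the suggested multipliers (4, 5 or 6)
upper = r_xb + k * r_sd;
lower = r_xb - k * r_sd;

// Select the rows of :height outside the robust screening limits
dt << Select Where( :height > upper | :height < lower );
Show( lower, upper, N Rows( dt << Get Selected Rows ) );
```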
Re: Outliers more than 3 standard deviations in JMP
It's really helpful!
thanks,
Nehai