rshehadah
Contributor

Outliers in data

Hi,

 

Is there a way in JMP to label outliers in a dataset? For example, I have 100 lots and I want to label the units that are observed on the lower side of the distribution. I don't care if the outliers perform better than the distribution; I am more interested in the lower ones.

 

In other words, say I have a lot with 100 units and 1 or 2 of those units are outliers. How can I label those units?

 

Thank you,

Rami

msharp
Super User

Re: Outliers in data

The easiest way would be to just create a new "outlier" column and mark the rows that are outliers. You can then make that column a "label" column. Once you have that column, you can use it to color the points or assign different marker styles (like x vs. dot). Since it's a label column, you can highlight the points and show the row label.

 

outlier.png

rshehadah
Contributor

Re: Outliers in data

Thank you for the reply, but I don't see how I can do that for over 100 lots. Also, how would you know those two points are outliers?
msharp
Super User

Re: Outliers in data

Labelling outliers and finding outliers are two completely different questions. There are lots of statistical methods to determine outliers (Peirce, Grubbs, 3 sigma, box and whisker plots, etc.), all of which vary and disagree. You can use the Analyze > Screening > Explore Outliers tool for this.
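
To make the "find, then label" idea concrete for many lots at once, here is a rough sketch in Python rather than JMP. It assumes the box-and-whisker rule (anything below Q1 - 1.5 x IQR within its own lot counts as a low outlier), which is only one of the methods listed above and is not necessarily what Explore Outliers computes. The function name `flag_low_outliers` and the sample data are made up for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def flag_low_outliers(rows, k=1.5):
    """Return one boolean per (lot, value) row: True if the value falls
    below that lot's lower box-plot fence, Q1 - k * IQR.
    High-side values are never flagged. (Hypothetical helper.)"""
    by_lot = defaultdict(list)
    for lot, value in rows:
        by_lot[lot].append(value)

    fences = {}
    for lot, values in by_lot.items():
        if len(values) < 2:
            # Too few units to estimate quartiles: flag nothing in this lot.
            fences[lot] = float("-inf")
            continue
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        fences[lot] = q1 - k * (q3 - q1)

    return [value < fences[lot] for lot, value in rows]


# Made-up example: lot A has one low unit (6.5) and one high unit (14.0).
rows = [("A", v) for v in (10.1, 10.3, 9.9, 10.0, 10.2, 6.5, 14.0)]
rows += [("B", v) for v in (20.0, 20.5, 19.8, 20.2, 20.1)]

flags = flag_low_outliers(rows)  # this plays the role of the "outlier" column
for (lot, value), is_low in zip(rows, flags):
    if is_low:
        print(lot, value)
```

With this sample data only the 6.5 unit in lot A is flagged; the 14.0 unit is left alone because only the lower fence is checked. The resulting true/false list corresponds to the new "outlier" column suggested earlier, computed per lot instead of by hand.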

 

That said, I always add a word of caution about outliers. For you, a lot is really only an outlier if it was processed differently from the rest of your lots. Data shouldn't be thrown away just because it makes your "fit bad", it "looks high", or it makes your "P-value significant".