Re: Diving into Explore Outliers

May 14, 2019 6:42 AM
(1322 views)

I just stumbled on the Explore Outliers platform and I am very giddy, to say the least. My question is about Exclude Rows under Robust Fit Outliers. I am trying to see how the script works inside this platform, specifically how it determines which rows to exclude. I have 200+ columns after I split the table with the appropriate screening limits.

I can use Save Script to Script Window, but it only appears to show me how to launch the platform from a script, not the actual row functions.

Thanks

1 ACCEPTED SOLUTION

I've not attempted to script 'Explore Outliers' before, so there may be a better way. The code below worked on the only case I had time to try, so it might get you started:

```
NamesDefaultToHere(1);
// Example data
dt1 = Open( "$SAMPLE_DATA/Probe.jmp" );
// Copy the data to a new table
dt2 = Eval(dt1 << getScript);
dt2 << setName((dt1 << getName||" Screened"));
// List of columns to screen
colsToScreen = {:VDP_M1, :VDP_M2, :VDP_NBASE};
// Screen for outliers using your favourite method
eo = dt2 << Explore Outliers(Y(Eval(colsToScreen)), Quantile Range Outliers( 1 ), Show Only Columns With Outliers(1), Invisible);
// Using the report, find the columns that have outliers
eoRep = Report(eo);
table = eoRep[TableBox(1)];
colList = eoRep[StringColBox(1)];
// Loop over these columns . . .
nCols = NItems(colList << get);
for(c=1, c<=nCols, c++,
	// Select this column (described by a row)
	CMD = Expr( table << setSelectedRows({colTBD}) );
	SubstituteInto(CMD, Expr(colTBD), Eval(c));
	CMD;
	// Update dt2 for this column: Cells that were considered outliers are coloured red
	eo << ColorCells(1);
	eo << ChangeToMissing(1);
);
eoRep << closeWindow;
```

Rather than set cells to missing, you could consider using missing value codes.
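
As a rough sketch of that alternative: rather than overwriting cells with missing values, a sentinel value can be declared in a Missing Value Codes column property, so that analyses treat it as missing while the table still shows which cells were flagged. The sentinel -9999 and the choice of column are assumptions for illustration only, and this has not been tested together with the script above.

```
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Probe.jmp" );
col = Column( dt, "VDP_M1" );
// Assumed sentinel: choose a value that cannot occur in the real measurements
col << Set Property( "Missing Value Codes", {-9999} );
// A cell later set to the sentinel is then treated as missing in analyses,
// for example: col[5] = -9999;
Show( col << Get Property( "Missing Value Codes" ) );
```

One reason to prefer this is that the flagged cells stay visible in the table, and removing the column property undoes the screening.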

4 REPLIES 4

Re: Diving into Explore Outliers

Thanks for the reply. I found that link you were referring to and I am working through all the information to determine what is the best method to use for my data.

I apologize for not being more specific in my first post. What I would like to do is automate the Explore Outliers platform as part of a number of plotting functions I am creating. For example: load my data table, split the column, and run Explore Outliers, which is able to select the rows that are considered outliers and then exclude them. I noticed that once a row has been excluded, an error pops up. I would need to somehow ignore it in order to move on to the next column. I have attached a data table and included a JSL script showing how I logically see the script running. It doesn't work yet, and that is why I am here. Some tests produce 300+ columns from Split By( :SPEC_COL_NAMES ).

Thanks in advance.

----- Begin Loop

----- Run Explore Outliers on cols[1]

----- Exclude Row

----- Plot Distribution

----- Clear Row States

----- Run Explore Outliers on cols[2]

----- Exclude Row

----- Plot Distribution

----- Clear Row States

----- End Loop after cols[i]

```
dt = Current Data Table();
// Split the data table by SPEC_COL_NAMES.
// This is used for a variety of things, specifically Spec Limits and Range Checks.
dtsplitmeas = dt << Split(
	Invisible,
	Split By( :SPEC_COL_NAMES ),
	Split( :Output_1 ),
	Group( :wafer_number, :rownum, :colnum, :subrow, :subcol, :RowCol ),
	Remaining Columns( Drop All )
);
cols = dtsplitmeas << Get Column Names( Numeric );
// One row per wafer
dtsum = dt << Summary(
	Invisible,
	Group( :wafer_number )
);
jjrn1 = New Window( "Distribution - Output_1 ", << Journal );
// I am taking a shot at the syntax and the way I think it should be coded.
// By all means, correct me if I am wrong and make any modifications you see fit.
// I don't actually know how to address the different wafer numbers and the
// SPEC_COL_NAMES at the same time. This is why I figured on 2 For() loops.
For( i = 1, i <= N Items( cols ), i++,
	// dtsum is a data table, so count its rows with N Rows()
	For( j = 1, j <= N Rows( dtsum ), j++,
		test = cols[i];
		wfr = dtsum:wafer_number[j];
		eo = Explore Outliers(
			SendToByGroup( Bygroup Default ),
			Y( As Column( test ) ),
			Robust Fit Outliers,
			Where( :wafer_number == wfr )
		);
		//----------------------------------------------------------
		// < Insert script to exclude rows for cols[i] in dtsplitmeas >
		// This is what my original question was referring to.
		//----------------------------------------------------------
		// Step into Control Chart, Control Chart Builder, Distributions, Fit Y by X plots.
		// Control Chart Builder is just an example.
		gb = Control Chart Builder(
			Size( 847, 990 ),
			Show Control Panel( 0 ),
			Variables( Y( As Column( test ) ) ),
			Chart( Position( 1 ), Limits( Sigma( "Levey Jennings" ) ) ),
			Chart( Position( 2 ), Limits( Sigma( "Moving Range" ) ) )
		);
		// Create a control chart for all SPEC_COL_NAMES and place it into a report / journal
		(gb << top Report)[Text Edit Box( 1 )] << delete; // delete the where statement
		Report( gb )[Outline Box( 1 )] << Set Title( Char( cols[i] ) );
		// Append to the journal window created above
		jjrn1 << Append( Report( gb ) );
		gb << Close Window;
		// Clear row states, because excluding a row for a given column extends
		// to all other columns on that row
		dtsplitmeas << Clear Row States();
		// Rinse and repeat for all SPEC_COL_NAMES
	);
);
```
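
For the placeholder step in the script above, one possible approach is to compute outlier limits directly instead of driving the platform. The sketch below uses a quantile-range rule, flagging values that lie more than Q times the 10% to 90% quantile range beyond those quantiles. This is a swapped-in technique, not the Robust Fit Outliers calculation. The table, the column name, the tail quantile of 0.1 and the multiplier of 3 are all assumptions for illustration.

```
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Probe.jmp" );   // stand-in for dtsplitmeas
colName = "VDP_M1";                      // assumed column to screen
tailQ = 0.1;                             // assumed tail quantile
q = 3;                                   // assumed multiplier

// Limits from the tail quantiles of the column's values
vals = Column( dt, colName ) << Get Values;
loQ = Quantile( tailQ, vals );
hiQ = Quantile( 1 - tailQ, vals );
spread = hiQ - loQ;
lowLimit = loQ - q * spread;
highLimit = hiQ + q * spread;

// Rows whose value falls outside the limits
outRows = dt << Get Rows Where(
	As Column( dt, colName ) < lowLimit | As Column( dt, colName ) > highLimit
);

// Exclude only when something was found, then drop the selection
If( N Rows( outRows ) > 0,
	dt << Select Rows( outRows );
	dt << Exclude( 1 );
	dt << Clear Select;
);

// ... build the plot for this column here ...

// Reset before moving on to the next column
dt << Clear Row States;
```

Because the limits are computed from the column itself, the same steps could be wrapped in the For() loop over cols, with the row states cleared at the end of each pass.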

I can't thank you enough for this... This is unbelievable! The outlier issue has been the bane of my existence (along with wafer map creation) since I was put in charge of plotting all this data. Mr. txnelson enlightened me about the range check method, which is good when I have spec limits to compare against; however, plotting raw data where I can't use the range check was hard for me.
