cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Check out the JMP® Marketplace featured Capability Explorer add-in
Choose Language Hide Translation Bar
JC7
JC7
Level I

How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Hi there,

 

I am looking to replicate what I do manually in JSL script.

I know how to run the "Explore Outliers" function in JSL but I still have to manually select the rows with outliers identified by the "Explore Outliers" then press 'select rows' then press 'exclude'.

I am trying to build a script which I run and automatically excludes the outliers identified without me having to do anything else (ideally does not even open up the Explore Outliers window.

 

Anyone got any ideas?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

@SDF1 the request is for Explore Outliers, not Explore Missing

 

@JC7 

Here is an example I worked up that uses the Water Treatment sample data table and Hides and Excludes all rows found for all columns to have been found as outliers.

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Water Treatment.jmp" );
el = Explore Outliers(
	Y(
		:Name( "Q-E" ),
		:Name( "ZN-E" ),
		:Name( "PH-E" ),
		:Name( "DBO-E" ),
		:Name( "DQO-E" ),
		:Name( "SS-E" ),
		:Name( "SSV-E" ),
		:Name( "SED-E" ),
		:Name( "COND-E" ),
		:Name( "PH-P" ),
		:Name( "DBO-P" ),
		:Name( "SS-P" ),
		:Name( "SSV-P" ),
		:Name( "SED-P" ),
		:Name( "COND-P" ),
		:Name( "PH-D" ),
		:Name( "DBO-D" ),
		:Name( "DQO-D" ),
		:Name( "SS-D" ),
		:Name( "SSV-D" ),
		:Name( "SED-D" ),
		:Name( "COND-D" ),
		:Name( "PH-S" ),
		:Name( "DBO-S" ),
		:Name( "DQO-S" ),
		:Name( "SS-S" ),
		:Name( "SSV-S" ),
		:Name( "SED-S" ),
		:Name( "COND-S" ),
		:Name( "RD-DBO-P" ),
		:Name( "RD-SS-P" ),
		:Name( "RD-SED-P" ),
		:Name( "RD-DBO-S" ),
		:Name( "RD-DQO-S" ),
		:Name( "RD-DBO-G" ),
		:Name( "RD-DQO-G" ),
		:Name( "RD-SS-G" ),
		:Name( "RD-SED-G" )
	),
	Quantile Range Outliers, 
	
);
dtQuant = el << Save Quantile Outlier Limits;
dtQuant << show window( 0 );
Report( el ) << close window;

// Get rid of non involved columns
dtQuant << select where( :Number of Outliers == 0 );
dtQuant << delete rows;

// Loop through the remaining rows and set the affected rows in the dt table to selected
dt << clear row states;
For( i = 1, i <= N Rows( dtQuant ), i++,
	dt << select where(
		As Column( dt, dtQuant:Column[i] ) <= dtQuant:Low Threshold[i] | As Column( dt, dtQuant:Column[i] )
		 >= dtQuant:High Threshold[i],
		current selection( "extend" )
	)
);

// Hide and exclude the rows
dt << hide and exclude;

// Close the no longer needed data table
Close( dtQuant, nosave );
Jim

View solution in original post

10 REPLIES 10
txnelson
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Which outlier method are you selecting when you run Explore Outliers? It appears that different solutions may be required for the different methods.
Jim
JC7
JC7
Level I

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

I am using the quantile range outliers method - sorry forgot to say in the original post!
SDF1
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Hi @JC7 ,

 

  Should be pretty easy following something like this:

 

Names Default To Here( 1 );

dt = Current Data Table();

colnames = dt << Get column Names();

obj = dt << Explore Missing Values( Y( Eval( colnames ) ) );

obj << select rows( Eval( colnames ) );

DS

txnelson
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

@SDF1 the request is for Explore Outliers, not Explore Missing

 

@JC7 

Here is an example I worked up that uses the Water Treatment sample data table and Hides and Excludes all rows found for all columns to have been found as outliers.

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Water Treatment.jmp" );
el = Explore Outliers(
	Y(
		:Name( "Q-E" ),
		:Name( "ZN-E" ),
		:Name( "PH-E" ),
		:Name( "DBO-E" ),
		:Name( "DQO-E" ),
		:Name( "SS-E" ),
		:Name( "SSV-E" ),
		:Name( "SED-E" ),
		:Name( "COND-E" ),
		:Name( "PH-P" ),
		:Name( "DBO-P" ),
		:Name( "SS-P" ),
		:Name( "SSV-P" ),
		:Name( "SED-P" ),
		:Name( "COND-P" ),
		:Name( "PH-D" ),
		:Name( "DBO-D" ),
		:Name( "DQO-D" ),
		:Name( "SS-D" ),
		:Name( "SSV-D" ),
		:Name( "SED-D" ),
		:Name( "COND-D" ),
		:Name( "PH-S" ),
		:Name( "DBO-S" ),
		:Name( "DQO-S" ),
		:Name( "SS-S" ),
		:Name( "SSV-S" ),
		:Name( "SED-S" ),
		:Name( "COND-S" ),
		:Name( "RD-DBO-P" ),
		:Name( "RD-SS-P" ),
		:Name( "RD-SED-P" ),
		:Name( "RD-DBO-S" ),
		:Name( "RD-DQO-S" ),
		:Name( "RD-DBO-G" ),
		:Name( "RD-DQO-G" ),
		:Name( "RD-SS-G" ),
		:Name( "RD-SED-G" )
	),
	Quantile Range Outliers, 
	
);
dtQuant = el << Save Quantile Outlier Limits;
dtQuant << show window( 0 );
Report( el ) << close window;

// Get rid of non involved columns
dtQuant << select where( :Number of Outliers == 0 );
dtQuant << delete rows;

// Loop through the remaining rows and set the affected rows in the dt table to selected
dt << clear row states;
For( i = 1, i <= N Rows( dtQuant ), i++,
	dt << select where(
		As Column( dt, dtQuant:Column[i] ) <= dtQuant:Low Threshold[i] | As Column( dt, dtQuant:Column[i] )
		 >= dtQuant:High Threshold[i],
		current selection( "extend" )
	)
);

// Hide and exclude the rows
dt << hide and exclude;

// Close the no longer needed data table
Close( dtQuant, nosave );
Jim
SDF1
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Yes, you're right, sorry about that oversight on my part.
ian_jmp
Level X

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

In addition to @txnelson's reply, you might find something in this thread.

JC7
JC7
Level I

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Thanks @txnelson for the solution.

I have tried to modify it so I can loop the entire code through a list of variables from a column so that I use the Quantile Range Outliers with an additional "where" function.

The loop seems to crash not being able to save quantile outlier limits saying object is not scriptable. Any ideas?
txnelson
Super User

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

Without seeing your code, I can only guess what the issue is.  It sounds like, you have added a column into your data table, that contains the columns you want to base you outliers analysis on.  If I am guessing correctly, below is a modification to my previous script, that has a new column added to the data table, which contains the columns to be analyzed.  The script reads the columns to be analyzed from the column, and then used those column names to limit the columns to be evaluated

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Water Treatment.jmp" );

// Create a new column with the names of the columns to be used
// in the analysis
dt << New Column( "theColumns",
	character,
	set values(
		{"Q-E", "ZN-E", "PH-E", "DBO-E", "DQO-E", "SS-E", "SSV-E", "SED-E", "COND-E", "PH-P", "DBO-P", "SS-P",
		"SSV-P", "SED-P", "COND-P", "PH-D", "DBO-D", "DQO-D", "SS-D", "SSV-D", "SED-D", "COND-D", "PH-S"}
	)
);

// Get the list of columns from the column that contains
// the columns to be analyzed
colNames = dt:theColumns << get values;
// Get rid of entries that have blank/null values
For( i = N Items( colNames ), i >= 1, i--,
	If( colNames[i] == "",
		Remove From( colNames, i, 1 )
	)
);

// Run the analysis using the list of column extracted from the
// "theColumns" column
el = Explore Outliers(
	Y( Eval( colNames ) ),
	Quantile Range Outliers, 
	
);
dtQuant = el << Save Quantile Outlier Limits;
dtQuant << show window( 0 );
Report( el ) << close window;

// Get rid of non involved columns
dtQuant << select where( :Number of Outliers == 0 );
dtQuant << delete rows;

// Loop through the remaining rows and set the affected rows in the dt table to selected
dt << clear row states;
For( i = 1, i <= N Rows( dtQuant ), i++,
	dt << select where(
		As Column( dt, dtQuant:Column[i] ) <= dtQuant:Low Threshold[i] | As Column( dt, dtQuant:Column[i] )
		 >= dtQuant:High Threshold[i],
		current selection( "extend" )
	)
);

// Hide and exclude the rows
dt << hide and exclude;

// Close the no longer needed data table
Close( dtQuant, nosave );
Jim
JC7
JC7
Level I

Re: How do I automatically exclude rows including outliers as identified by the "Explore Outliers" function?

I have broken the code down while I am trying to debug. It seems the issue is that on the second loop it is unable to save the quantile range outlier limits.

 

In the log it is saying "

This Where-clause for data table 'Quantile Outlier Limits' resulted in no rows: dt:Grade == grades[j]"

 

However, I know each grade on that list has limits and exists. I have checked it by starting at different values of J and it always works on the first loop then crashes on the 2nd loop. I assume it is something to do with the way I am storing the data in the el variable?

 

Names Default to here(1);
dt = current data table();

// Determine grades to loop through
dt1 = dt << summary(Group(:Grade));
dt1 << select where ( :Grade == "");
dt1 << delete rows;
dt1 << select where ( :Grade == "Check input");
dt1 << delete rows; 
grades = dt1:Grade << get values;
Close (dt1, nosave);

For (j = 3, j<= N items ( grades ), J++,

show(grades[j]);

	el = Explore Outliers(
		Y( dt:Hardness, dt:UTS, dt:M100, dt:Elongation, dt:LCR, dt:Prof, dt:SG ),
		Tail Quantile( 0.25 ),
		Show only columns with outliers( 1 ),
		Quantile Range Outliers,
		Where( dt:Grade == grades[j]);
	);
	show (el);
	dtQuant = el << Save Quantile Outlier Limits;
	Report(el)<< close window;
	
);