cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
Neo
Neo
Level VI

Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Calculating column mean before and after excluding a row is giving me same values via JSL. It is different as expected when obtained from the distribution platform. Where am I going wrong in JSL?

Names Default To Here( 1 );
Clear Log();
dt = Open( "$sample_data\Semiconductor Capability.jmp" ); // get data table
//get spec limits into column properties 
obj = dt << Manage Spec Limits( Y( dt << Get Column Group( "Processes" ) ), Show Limits All,  Save to Column Properties(1) ); 
obj << close window;  // close Manage Spec Limits window
colName  =  "M1_M1"; // choose column
colMean = Col Mean(dt:colName); show (colMean); // get column mean
colMin  = Col Minimum(dt:colName); show (colMin); // get column minimum value
//plot distribution
Distribution(Stack(1), Continuous Distribution(Column(:M1_M1 ), Horizontal Layout(1),Vertical(0), Process Capability(0)));
dt << select where( dt:colName == colMin)<< Hide and Exclude; // hide and exclude the minimim value of the column
colMeanNew  = Col Mean(dt:colName); show (colMeanNew); // get column mean again
//plot distribution again
Distribution(Stack(1), Continuous Distribution(Column(:M1_M1 ), Horizontal Layout(1),Vertical(0), Process Capability(0)));

Neo_0-1700783477443.pngNeo_1-1700783494869.png

Neo_2-1700783791381.png

 

 

When it's too good to be true, it's neither
2 ACCEPTED SOLUTIONS

Accepted Solutions
txnelson
Super User

Re: Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Col Mean() does not honor the Excluded rows like the JMP Platforms do.  If you use:

colMeanNew = Col Mean( If( Excluded( Row State( Row() ) ), ., dt:colName ) );

it will give you the result you want.

Jim

View solution in original post

jthi
Super User

Re: Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Col Functions and Row States is a good read regarding this. If you don't mind that the excluded rows will  be treated as their own group, you can also use

Col Mean(dt:colName, Excluded())

When using Col functions in this manner in your script (not in formula) and you have multiple groups (Excluded will have its own group), you have to be a bit careful. You can either modify the current row to be non-excluded onw (Row() = something) or create a formula with the Col function.

 

Other option would be to use Summarize()

summarize(dt, my_mean = Mean(dt:colName));
-Jarmo

View solution in original post

3 REPLIES 3
txnelson
Super User

Re: Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Col Mean() does not honor the Excluded rows like the JMP Platforms do.  If you use:

colMeanNew = Col Mean( If( Excluded( Row State( Row() ) ), ., dt:colName ) );

it will give you the result you want.

Jim
jthi
Super User

Re: Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Col Functions and Row States is a good read regarding this. If you don't mind that the excluded rows will  be treated as their own group, you can also use

Col Mean(dt:colName, Excluded())

When using Col functions in this manner in your script (not in formula) and you have multiple groups (Excluded will have its own group), you have to be a bit careful. You can either modify the current row to be non-excluded onw (Row() = something) or create a formula with the Col function.

 

Other option would be to use Summarize()

summarize(dt, my_mean = Mean(dt:colName));
-Jarmo
hogi
Level XI

Re: Why is my column mean not matching with one calculated via the distribution platform, after excluding row(s)?

Note that column formulas with excluded() are re-evaluated each time the row selection is changed:

Caution: Places where Jmp does something unexpected 

 

For small data tables, this is fine: for large data tables, this causes a noticeable delay.

This problem will be fixed in Jmp 18.