cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
  • Register to attend Discovery Summit 2025 Online: Early Users Edition, Sept. 24-25.
  • New JMP features coming to desktops everywhere this September. Sign up to learn more at jmp.com/launch.
Choose Language Hide Translation Bar
hogi
Level XII

Fast Way to calculate "nested" Col ... aggregations - like MAD

In 2017 @Kevin_Anderson submitted a wish to add a functionality to calculate Robust Means and Standard Deviation functions in JSL and Formula Editor . 

 

In 2022 an extended version of the wish was submitted by @jthi :
Add normalization and robust statistical functions (and matrix functions) 
with the comment:


@jthi wrote:

Most of these could already be implemented fairly easily by using existing statistical functions, but I would much rather have them as normal functionality in JMP (native implementation could also be faster?). In my opinion, these are very useful and powerful functions (like are all other Statistical Col functions).

 

 

Another function to determine a robust estimator for the variation of values:
MAD: Median Absolute Deviation.
https://en.wikipedia.org/wiki/Median_absolute_deviation 

hogi_0-1751225253705.png

which can be determined via (*):
- calculate the Median() of the values

- calculate the difference between the values and the median

- remove the signs

- calculate the median

 

If you want to calculate MAD for a column, you can use the Tables/summary platform:

hogi_1-1751225391676.png

 

Often, users want to calculate MAD for individual groups.
But unfortunately, there is no Col MAD (values, grouping) function in JSL - not yet.


On the other hand, it's very easy to follow the steps in (*) and calculate MAD "manually" via JSL:

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );

t0=hptime();

New Column( "MAD",
	Formula( Col Median( Abs( :height - Col Median( :height, :sex ) ), :sex ) )
);

dt << run formulas();

show((hptime()-t0)/1000000)

But wait - the Col Median inside the Col Median looks frightening - what if JMP calculates the values again and again with every row()?


Let's check if this is the case - by making the data table larger (6.4Mio rows):

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );

For Each( {i}, {1, 2},
	dts = Repeat( {dt}, 400 );
	dt = dt << Concatenate( dts, );
);

With the larger data table, it takes several minutes to do the calculation (JMP18.2.1).

An easy trick:
Split the calculation into 2 calculations: 
- first calculate a column with the grouped median values

- then calculate MAD

another trick:
- wait for the next version of JMP ?

 

are there more tricks?

1 REPLY 1
hogi
Level XII

Re: Fast Way to calculate "nested" Col ... aggregations - like MAD


@hogi wrote:


another trick:
- wait for the next version of JMP ?


I just tried it with the EA version of JMP19 - A significant improvement !
-> just a few seconds instead of several minutes
wonderful : )

Recommended Articles