Discussions

hogi · Jun 29, 2025 1:09 PM

In 2017 @Kevin_Anderson submitted a wish to add a functionality to calculate Robust Means and Standard Deviation functions in JSL and Formula Editor .

In 2022 an extended version of the wish was submitted by @jthi :
Add normalization and robust statistical functions (and matrix functions)
with the comment:

@jthi wrote:
Most of these could already be implemented fairly easily by using existing statistical functions, but I would much rather have them as normal functionality in JMP (native implementation could also be faster?). In my opinion, these are very useful and powerful functions (like are all other Statistical Col functions).

Another function to determine a robust estimator for the variation of values:
MAD: Median Absolute Deviation.
https://en.wikipedia.org/wiki/Median_absolute_deviation

which can be determined via (*):
- calculate the Median() of the values

- calculate the difference between the values and the median

- remove the signs

- calculate the median

If you want to calculate MAD for a column, you can use the Tables/summary platform:

Often, users want to calculate MAD for individual groups.
But unfortunately, there is no Col MAD (values, grouping) function in JSL - not yet.

On the other hand, it's very easy to follow the steps in (*) and calculate MAD "manually" via JSL:

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );

t0=hptime();

New Column( "MAD",
	Formula( Col Median( Abs( :height - Col Median( :height, :sex ) ), :sex ) )
);

dt << run formulas();

show((hptime()-t0)/1000000)

But wait - the Col Median inside the Col Median looks frightening - what if JMP calculates the values again and again with every row()?

Let's check if this is the case - by making the data table larger (6.4Mio rows):

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );

For Each( {i}, {1, 2},
	dts = Repeat( {dt}, 400 );
	dt = dt << Concatenate( dts, );
);

With the larger data table, it takes several minutes to do the calculation (JMP18.2.1).

An easy trick:
Split the calculation into 2 calculations:
- first calculate a column with the grouped median values

- then calculate MAD

another trick:
- wait for the next version of JMP ?

are there more tricks?

hogi · Jul 18, 2025 2:24 PM

Thanks @Jasean for providing the solution::)

add Expr to protect the arguments of the function from being evaluated:

Add Custom Functions(
	New Custom Function(
		"COL",
		"MAD",
		Function( {value, byGroup},
			Col Median( Abs( value - Col Median( value, byGroup ) ), byGroup ) 
		),
		<<Formula Category( "Statistical" )
	)
);

New Column( "Col MAD",
	Formula( COL:MAD( Expr( :height ), Expr( :sex ) ) )
)

cool!

View solution in original post

hogi · Jul 8, 2025 02:25 AM

@hogi wrote:

another trick:
- wait for the next version of JMP ?

I just tried it with the EA version of JMP19 - A significant improvement !
-> just a few seconds instead of several minutes
wonderful : )

hogi · Jul 18, 2025 2:24 PM

Next step:
How to share the knowledge with colleagues - or via the Marketplace?

My idea:
use JSL to create a Custom Function

Add Custom Functions(
	New Custom Function(
		"COL",
		"MAD",
		Function( {value, byGroup},
			Col Median( Abs( value - Col Median( value, byGroup ) ), byGroup ) 
		),
		<<Formula Category( "Statistical" )
	)
);

and share the script as an AddIn ...

Then every user can use the new col aggregation via:

Unfortunately: This approach doesn't work : (
Does anybody find the error?

For debugging, you can use:

Names Default To Here( 1 );

Add Custom Functions(
	New Custom Function(
		"COL",
		"Median",
		Function( {value, byGroup},
			Col Median( value, byGroup )
		),
		<<Formula Category( "Statistical" )
	)
);

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

New Column( "my COL MEDIAN", Formula( COL:Median( :height, :sex ) ) );

on my system (JMP 18.2.0 & JMP 19 EA 9) , the result looks like this:

does it take the value of the first row?
- and use it for "F" , not having any value for "M"?

hogi · Jul 18, 2025 2:24 PM

Thanks @Jasean for providing the solution::)

add Expr to protect the arguments of the function from being evaluated:

Add Custom Functions(
	New Custom Function(
		"COL",
		"MAD",
		Function( {value, byGroup},
			Col Median( Abs( value - Col Median( value, byGroup ) ), byGroup ) 
		),
		<<Formula Category( "Statistical" )
	)
);

New Column( "Col MAD",
	Formula( COL:MAD( Expr( :height ), Expr( :sex ) ) )
)

cool!

Discussions

Fast Way to calculate "nested" Col ... aggregations - like MAD

Re: Fast Way to calculate "nested" Col ... aggregations - like MAD

Re: Fast Way to calculate "nested" Col ... aggregations - like MAD

Re: Fast Way to calculate "nested" Col ... aggregations - like MAD

Re: Fast Way to calculate "nested" Col ... aggregations - like MAD

Recommended Articles