
Add normalization and robust statistical functions (and matrix functions)

Add new Col statistical functions:

Also add as many of these as possible to Matrix operations if they are missing.

 

Most of these can already be implemented fairly easily with existing statistical functions, but I would much rather have them as built-in functionality in JMP (a native implementation might also be faster). In my opinion, these are very useful and powerful functions, as are all the other statistical Col functions.

Example functions (might include mistakes):

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

// add outlier for M
dt << Add Rows({name = "OUTLIER", age = 1, sex = "M", height = 500, weight = 500});

// add limits
dt << New Column("LSL", Numeric, Continuous, << Set Each Value(50));
dt << New Column("USL", Numeric, Continuous, << Set Each Value(60));

dt << New Column("ColMean_height", Numeric, Continuous, Formula(
	Col Mean(:height, :sex, Excluded())
));
dt << New Column("ColMedian_height", Numeric, Continuous, Formula(
	Col Median(:height, :sex, Excluded())
));

dt << New Column("ColStdDev_height", Numeric, Continuous, Formula(
	Col Std Dev(:height, :sex, Excluded())
));

dt << New Column("ColIQR_height", Numeric, Continuous, Formula(
	Col Quantile(:height, 0.75, :sex, Excluded()) - Col Quantile(:height, 0.25, :sex, Excluded())
));

dt << New Column("ColStandardize_height", Numeric, Continuous, Formula(
	Col Standardize(:height, :sex, Excluded())
));

// Example https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html
// using IQR here, but might be good idea to be able to change the quantiles (default to IQR)
dt << New Column("ColStandardizeRobust_height", Numeric, Continuous, Formula(
	(:height - Col Median(:height, :sex, Excluded())) / :ColIQR_height
));

// https://en.wikipedia.org/wiki/Feature_scaling#Rescaling_(min-max_normalization)
dt << New Column("ColNormalize_01", Numeric, Continuous, Formula(	
	0 + (:height - Col Min(:height, :sex, Excluded())) / (Col Max(:height, :sex, Excluded()) - Col Min(:height, :sex, Excluded()))
));

// rescale to [-1, 1]: a + (x - min) * (b - a) / (max - min), with a = -1, b = 1
dt << New Column("ColNormalize_11", Numeric, Continuous, Formula(
	-1 + (:height - Col Min(:height, :sex, Excluded())) * (1 - (-1)) / (Col Max(:height, :sex, Excluded()) - Col Min(:height, :sex, Excluded()))
));

// rescale to [LSL, USL] using the same min-max formula
dt << New Column("ColNormalize_limits", Numeric, Continuous, Formula(
	Col Min(:LSL, :sex) + (:height - Col Min(:height, :sex, Excluded())) * (Col Max(:USL, :sex) - Col Min(:LSL, :sex)) / (Col Max(:height, :sex, Excluded()) - Col Min(:height, :sex, Excluded()))
));

// maybe even Robust Sigma, divider would default to 1.35
// http://www.aecouncil.com/Documents/AEC_Q001_Rev_D.pdf 
// and robust limits, sigma multiplier defaulting to 6
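
The robust sigma and robust limits mentioned in the last comments could be sketched with existing functions roughly like this. This is only a sketch: the column names are made up for illustration, the divider 1.35 and the multiplier 6 are the defaults suggested above, and centering the limits on the group median is my assumption.

```jsl
Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

// robust sigma per group: IQR divided by 1.35 (suggested default divider)
dt << New Column("ColRobustSigma_height", Numeric, Continuous, Formula(
	(Col Quantile(:height, 0.75, :sex, Excluded()) - Col Quantile(:height, 0.25, :sex, Excluded())) / 1.35
));

// robust limits per group: median -/+ 6 * robust sigma (suggested default multiplier)
// centering on the median is an assumption of this sketch
dt << New Column("ColRobustLower_height", Numeric, Continuous, Formula(
	Col Median(:height, :sex, Excluded()) - 6 * :ColRobustSigma_height
));
dt << New Column("ColRobustUpper_height", Numeric, Continuous, Formula(
	Col Median(:height, :sex, Excluded()) + 6 * :ColRobustSigma_height
));
```

A native function would ideally take the divider and the multiplier as optional arguments instead of hard-coding them as done here.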
5 Comments
SamGardner
Staff

Thank you for the list of specific examples.  We may take this under consideration for a future release.  

Status changed to: Acknowledged
 
Status changed to: Investigating
 
hogi
Level XI

Another aggregation that is already available in the Table Summary menu, but not as a Col ... formula function, is the Median Absolute Deviation (MAD):

https://en.wikipedia.org/wiki/Median_absolute_deviation 
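
Until a native function exists, a grouped MAD can probably be approximated with two formula columns. This is a minimal sketch on the Big Class sample table; the column names AbsDev_height and ColMAD_height are hypothetical, and no consistency scaling constant is applied.

```jsl
Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

// step 1: absolute deviation of each row from its group median
dt << New Column("AbsDev_height", Numeric, Continuous, Formula(
	Abs(:height - Col Median(:height, :sex))
));

// step 2: MAD = group median of those absolute deviations
dt << New Column("ColMAD_height", Numeric, Continuous, Formula(
	Col Median(:AbsDev_height, :sex)
));
```

The intermediate column keeps each formula simple; a built-in Col function would make it unnecessary.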

 
