- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Formula for trimmed mean
I have a data set that has a sample number in the first column, and then 180 columns of various measurements of the sample. I want to get the 5% trimmed mean for each of 100 samples in the table.
Because of the large number of samples, going through Distributions > Summary Statistics is very impractical. Is there a way I can pull this directly using a formula?
Alternatively, I could stack my columns and use the Summary function if possible, but I can't see how to get a trimmed mean here either.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
I think using Distribution is fairly easy option. Create Distribution with measurement data as Y, Columns and sample number as By column (using Probe data), if you have lots of data this will be slow and might in worse case crash JMP, so scripting this might be a better option (you could script this same process, use matrix calculations or some other method).
After you have your Distributions, hold down ctrl + alt and click red triangle next to one of the distributions and disable everything else besides Summary Statistics (should make this a bit faster) and press OK:
After that. hold down ctrl + alt and press red triangle next to one of Summary Statistics.
Enable Customize Summary Statistics and press OK
Remove everything else besides Trimmed Mean:
and press OK.
Now we have this beautiful distribution report:
Right click on one of Trimmer means and choose Make combined data table:
You should end up with datatable with 5% Trimmed Means for all parameters and groups:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
Hi @HarriBradbeer ,
I'm sure this should be easy to do with either a column formula or a JSL script.
However, to help a little more, what do you mean by a 5% trimmed mean? Does this mean you cut out the bottom 5% and top 5% of the data?
Thanks,
DS
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
Yes, the mean is taken from the middle 90% of data
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
I think using Distribution is fairly easy option. Create Distribution with measurement data as Y, Columns and sample number as By column (using Probe data), if you have lots of data this will be slow and might in worse case crash JMP, so scripting this might be a better option (you could script this same process, use matrix calculations or some other method).
After you have your Distributions, hold down ctrl + alt and click red triangle next to one of the distributions and disable everything else besides Summary Statistics (should make this a bit faster) and press OK:
After that. hold down ctrl + alt and press red triangle next to one of Summary Statistics.
Enable Customize Summary Statistics and press OK
Remove everything else besides Trimmed Mean:
and press OK.
Now we have this beautiful distribution report:
Right click on one of Trimmer means and choose Make combined data table:
You should end up with datatable with 5% Trimmed Means for all parameters and groups:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
Thanks @jthi, this has got me most of the way there.
Using this, I've managed to get a script that will go straight to the 'trimmed mean only' without computing everything else which makes the processing time more manageable.
Distribution(
Continuous Distribution(
Column( :MEASUREMENT VALUE ),
Quantiles( 0 ),
Histogram( 0 ),
Vertical( 0 ),
Outlier Box Plot( 0 ),
Customize Summary Statistics(
Mean( 0 ),
Std Dev( 0 ),
Std Err Mean( 0 ),
Upper Mean Confidence Interval( 0 ),
Lower Mean Confidence Interval( 0 ),
N( 0 ),
Trimmed Mean( 1 )
)
),
by(
:PRODUCT_ID,:PROCESS_ID, :MEASUREMENT_TYPE
)
)
Now all I'm looking for is a script I can add to the end of here to do the 'Make Combined Data Table' step. There is some further data manipulation I want to do once I've got the trimmed means so having it all in a single script would be ideal.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
This should give you what you want
dis = Distribution(
invisible,
Continuous Distribution(
Column( :MEASUREMENT VALUE ),
Quantiles( 0 ),
Histogram( 0 ),
Vertical( 0 ),
Outlier Box Plot( 0 ),
Customize Summary Statistics(
Mean( 0 ),
Std Dev( 0 ),
Std Err Mean( 0 ),
Upper Mean Confidence Interval( 0 ),
Lower Mean Confidence Interval( 0 ),
N( 0 ),
Trimmed Mean( 1 )
)
),
by( :PRODUCT_ID, :PROCESS_ID, :MEASUREMENT_TYPE )
);
Report( dis[1] )["Summary Statistics"][Table Box( 1 )] << make combined data table;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
Exactly what I needed, thank you.
For some reason, JMP wasn't showing me the source script when I made the table manually. This worked perfectly.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Formula for trimmed mean
JMP should be able to help you with that. Create the table manually and check the table script from result table: