cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
tomSorger
Level II

Reduce sorted list by averaging every n rows

This must be a common way to 'sample' long column of data. For example, after sorting a column of 10,000 rows from min to max, generate a column with 100 averages (or sums) of every 100 rows.

In Excel, this can be done with a formula using the OFFSET function. 

How can this be done in JMP? Thank you,

Tom

2 REPLIES 2
txnelson
Super User

Re: Reduce sorted list by averaging every n rows

One simple way of doing this is to create a new column which has a unique value for rows 1-100, 101-200 etc.  and then use the col mean() function to calculate separate means for each unique value created.  So interactively here is how to do it

  1. Create a new column and specify the following formula to provide a unique value for each grouping of 100 row. For this example I suggest that you call the new column "onehundreds" 
    Floor((Row() - 1) / 100)
  2. Create a second new column of whatever name you want using the following formula
    Col Mean(:<the column you want averaged>, :onehundreds)
  3. I also suggest that you then follow this with opening up the Column Info for each of the new columns and specifying to remove the formula.  This will convert the values in each of the columns to static values.  You can then 
    1. Delete the onehundreds column 
    2. Sort the data to whatever new order you want, and the calculated mean values will not change
Jim

Re: Reduce sorted list by averaging every n rows

Jim's steps are the way to go, and can be combined in a formula like this one, which avoids sort-dependency and also allows for table augmentation later:

 

Col Mean( :Vals, Floor( (Col Rank( :Vals ) - 1) / 100 ) )

Below, I've used this formula on integers 1::1000, randomly sorted. As shown, any number between 100*k+1 and 100*(k+1), inclusive, will have a group mean of  k + 50.5. In practice, you may be sorting (ranking) by another column, say a timestamp, which is fine.

 

Cheers,

Brady

 

brady_brady_1-1614268732653.png