cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
EmilyT
Level II

How do I use a formula to calculate mean for selected rows

Hi everyone,

 

To preface, I am basically a new user to JMP so I don't know how to do any scripting.  Just basic formulas, etc.  Here is what I need to do and can't figure out how. 

 

I'd like to first calculate an average of the "Amount of Cyandin" for each of the same "Sample Number" (goal to have a column that calculates what I've placed in the "Average of three injections" column. 

 

Similarly I'd like a column that gives the average of those three averages for each "Development Stage" which I've placed the goal in the column labeled "Average of developmental stages"

 

It's not a moving average... I thought maybe a conditional statement in a formula would work? Or perhaps a formula that uses some kind of third column like the "ID Code" column as an identifier to tell it which rows to average?

 

Thanks!

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: How do I use a formula to calculate mean for selected rows

The 2 formulas you need are

If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)

and

If( :Developmental Stage != :Developmental Stage[Row() + 1] | Row() == N Rows( Current Data Table() ),
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Developmental Stage )
)

txnelson_0-1700188615699.png

 

Jim

View solution in original post

5 REPLIES 5
txnelson
Super User

Re: How do I use a formula to calculate mean for selected rows

The 2 formulas you need are

If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)

and

If( :Developmental Stage != :Developmental Stage[Row() + 1] | Row() == N Rows( Current Data Table() ),
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Developmental Stage )
)

txnelson_0-1700188615699.png

 

Jim
EmilyT
Level II

Re: How do I use a formula to calculate mean for selected rows

Hello!

 

Thank you for the reply!  I tried it and it works.  I just don't quite understand how it works....

 

So far I think I get that

 

If the sample number doesn't equal the sample number of the next row

 

OR

 

the current row number is equal to the total rows in the current data table? (this part I'm confused on.  Plus when I enter this, I get a warning about using the "Current Data Table" part.

 

THEN

 

Calculate the mean  of the "Amount of Cyanidin (ng/kg) when the Sample Numbers are the same?

 

Could the workings of the formula be explained a bit to this beginner?

 

Followup question - for my full data table I will have obviously more data with rows that have a different cultivar name but then same sample numbers and injection numbers as shown in my example.  Will these formulas still work?  Will I need to account for the Cultivar name to separate them out?

 

Thank you

txnelson
Super User

Re: How do I use a formula to calculate mean for selected rows

The formula to do the calculations is actually very simple.  It is your specification to have the results only displayed on specific rows that makes the calculations a bit messy.

When a formula is applied to a given column, JMP simply takes the formula and applies it to row 1, then row 2 and so on to the last row in the table.  Therefore, the below formula will give you the answers you want


Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )

Basically the formula calculates the mean for the column "Amount of Cyanidin (ng/kg)" and write it out to the row for the column.  Because there is also a By column specified, "Sample Number" it calculates the mean for each level found for the "Sample Number" and then writes it out to the row for the new column.

txnelson_0-1700251759628.png

But your requirement was to only display the mean value on the last row for each of the "Sample Numbers".  To identify which rows those are, there needs to be something that programmatically can be evaluated.  In this case, I chose to check to see if the value of "Sample Number" changes in the row after the current row being processed.  So that changes the formula to

If( :Sample Number != :Sample Number[Row() + 1] ,
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)

The If() clause is interpreted as:  if the current value of "Sample Number" is not equal to the value of "Sample Number" in the row 1 row beyond the current row( Row() + 1 ), then do the calculation.  If this comparison is not true, then it does nothing.

So this works great except for the last row in the data table.  When JMP attempts to look at a row 1 row beyond the last row in the data table, it has no row to compare with, so it can not make a decision.

txnelson_2-1700253886862.png

 

Thus, the formula needs to take into account this issue too.

If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
	Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)

Here the additional clause is checking to see if the current row ,Row() , is equal to the number of rows in the data table, N Rows( Current data Table().  The second clause is separated by "|" which indicates that if either the first clause "OR" the second clause is true then process the calculation.

The second formula column follows the same logic with just the By column being changed to "Developmental Stage".

 

Concerning what needs to change to be able to use this on your full data table.  I do not know what you are referring to when you say, "cultivar".   The requirements for the formulas to work properly are for the data to be sorted in order of the "Developmental Stage" and "Sample Number" and that "Amount of Cyanidin (ng/kg)" is the name of the column that the means are calculated on.  If any of those columns have a change in name, then the new names need to be swapped out in the formulae. 

 

Jim
EmilyT
Level II

Re: How do I use a formula to calculate mean for selected rows

Thank you very much.  I think I've figured it out now thanks to your explanations!

jthi
Super User

Re: How do I use a formula to calculate mean for selected rows

Other option is to use If with Col Min and Col Max functions, but be careful if you were to use this as this might break in any of the future JMP versions due to usage of Row() with Col Max/Col Min (Make using formula expressions in Set Each Value and using expressions as first argument in Col stat... )

 

ColAvg1

 

If(Row() == Col Max(Row(), :Sample Number),
	Col Mean(:"Amount of Cyanidin (ng/kg)"n, :Sample Number)
)

ColAvg2

If(Row() == Col Max(Row(), :Developmental Stage),
	Col Mean(:CoAvg1, :Developmental Stage)
)

jthi_0-1700376384183.png

 

 

-Jarmo