- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
How do I use a formula to calculate mean for selected rows
Hi everyone,
To preface, I am basically a new user to JMP so I don't know how to do any scripting. Just basic formulas, etc. Here is what I need to do and can't figure out how.
I'd like to first calculate an average of the "Amount of Cyandin" for each of the same "Sample Number" (goal to have a column that calculates what I've placed in the "Average of three injections" column.
Similarly I'd like a column that gives the average of those three averages for each "Development Stage" which I've placed the goal in the column labeled "Average of developmental stages"
It's not a moving average... I thought maybe a conditional statement in a formula would work? Or perhaps a formula that uses some kind of third column like the "ID Code" column as an identifier to tell it which rows to average?
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
The 2 formulas you need are
If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)
and
If( :Developmental Stage != :Developmental Stage[Row() + 1] | Row() == N Rows( Current Data Table() ),
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Developmental Stage )
)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
The 2 formulas you need are
If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)
and
If( :Developmental Stage != :Developmental Stage[Row() + 1] | Row() == N Rows( Current Data Table() ),
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Developmental Stage )
)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
Hello!
Thank you for the reply! I tried it and it works. I just don't quite understand how it works....
So far I think I get that
If the sample number doesn't equal the sample number of the next row
OR
the current row number is equal to the total rows in the current data table? (this part I'm confused on. Plus when I enter this, I get a warning about using the "Current Data Table" part.
THEN
Calculate the mean of the "Amount of Cyanidin (ng/kg) when the Sample Numbers are the same?
Could the workings of the formula be explained a bit to this beginner?
Followup question - for my full data table I will have obviously more data with rows that have a different cultivar name but then same sample numbers and injection numbers as shown in my example. Will these formulas still work? Will I need to account for the Cultivar name to separate them out?
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
The formula to do the calculations is actually very simple. It is your specification to have the results only displayed on specific rows that makes the calculations a bit messy.
When a formula is applied to a given column, JMP simply takes the formula and applies it to row 1, then row 2 and so on to the last row in the table. Therefore, the below formula will give you the answers you want
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
Basically the formula calculates the mean for the column "Amount of Cyanidin (ng/kg)" and write it out to the row for the column. Because there is also a By column specified, "Sample Number" it calculates the mean for each level found for the "Sample Number" and then writes it out to the row for the new column.
But your requirement was to only display the mean value on the last row for each of the "Sample Numbers". To identify which rows those are, there needs to be something that programmatically can be evaluated. In this case, I chose to check to see if the value of "Sample Number" changes in the row after the current row being processed. So that changes the formula to
If( :Sample Number != :Sample Number[Row() + 1] ,
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)
The If() clause is interpreted as: if the current value of "Sample Number" is not equal to the value of "Sample Number" in the row 1 row beyond the current row( Row() + 1 ), then do the calculation. If this comparison is not true, then it does nothing.
So this works great except for the last row in the data table. When JMP attempts to look at a row 1 row beyond the last row in the data table, it has no row to compare with, so it can not make a decision.
Thus, the formula needs to take into account this issue too.
If( :Sample Number != :Sample Number[Row() + 1] | Row() == N Rows( Current Data Table() ),
Col Mean( :"Amount of Cyanidin (ng/kg)"n, :Sample Number )
)
Here the additional clause is checking to see if the current row ,Row() , is equal to the number of rows in the data table, N Rows( Current data Table(). The second clause is separated by "|" which indicates that if either the first clause "OR" the second clause is true then process the calculation.
The second formula column follows the same logic with just the By column being changed to "Developmental Stage".
Concerning what needs to change to be able to use this on your full data table. I do not know what you are referring to when you say, "cultivar". The requirements for the formulas to work properly are for the data to be sorted in order of the "Developmental Stage" and "Sample Number" and that "Amount of Cyanidin (ng/kg)" is the name of the column that the means are calculated on. If any of those columns have a change in name, then the new names need to be swapped out in the formulae.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
Thank you very much. I think I've figured it out now thanks to your explanations!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I use a formula to calculate mean for selected rows
Other option is to use If with Col Min and Col Max functions, but be careful if you were to use this as this might break in any of the future JMP versions due to usage of Row() with Col Max/Col Min (Make using formula expressions in Set Each Value and using expressions as first argument in Col stat... )
ColAvg1
If(Row() == Col Max(Row(), :Sample Number),
Col Mean(:"Amount of Cyanidin (ng/kg)"n, :Sample Number)
)
ColAvg2
If(Row() == Col Max(Row(), :Developmental Stage),
Col Mean(:CoAvg1, :Developmental Stage)
)