As you will understand from the question I am a newbie to JSL scripting/formulating/automating...I have a revenue data set which contains a column called as "year" of 2013 2014 2015... I need a Cumulative Sum for each year but when I create a column of Cumulative Sum (by the help of Graph Builder) it adds all the days and year one to another and ends up with grand 3 year total Cum Sum, where I would like the Cum Sum start over for each year. I could solve it by splitting or subsetting but since every day I need to send a report of revenue comparison it is really annoying. And this data changes every hour so that would be great to update the existing jmp with the added data and then create the Cum Sum Column for each year... Could someone help please. Thank you!
If you make sure that your data table is sorted by the grouping variable (year, in your case), then you should be able to use a formula column similar to this one (which uses the Big Class sample data table as an example):
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Sort( By( :sex ), Order( Ascending ), Replace Table );
dt << New Column( "CumSumBySex",
Set Formula( If( :sex != Lag( :sex ), :height, Lag( :CumSumBySex ) + :height ) )
Hope that helps,
Thank you very much for your help but somehow it does not work. I am okay with opening the data table and sorting out by Year and also the column gets created but without any data for 2014
Would you help with the formula which goes in the column?
Are you seeing missing values for the rows corresponding to year=2014? Do the rows for other years have values?
Would you be able to send a sample of your data? That might help identify the problem more easily.
Here is a proportion of the data. I checked the missing values, Column Year is zero.
As I substetted the main data table now the columns are all with values but this time 2014 is illogical an 2015 with true values
The difference is that my first example used a character variable for the grouping variable. Your example uses a numeric variable, which is fine. But it requires a little more handling in the formula. The reason for that is that for numeric values, the Lag() on the first row returns missing and comparing missing to the current row also returns missing. Then, in the 2nd row, a missing value from the first row is being added to the value in the 2nd row, which results in a missing value.
Here's an updated formula that I think should work for the numeric grouping variable case:
If( Row() == 1 | Lag( :Year ) != :Year,
Lag( :CumSumByYear ) + :NetRoomRevenueWithOutVat
Let me know if that works for you.
Thank you Michael, now both years work but whenever the formula faces a 0.00 value (there are days when there is no revenue) the CumBySumYear starts calculating from zero which ends irrational results.
That's strange. When I try putting a 0.00 value in :NetRoomRevenueWIthOutVat, the :CumSumByYear value is the same as the previous row's value:
Could you send the formula you are using for the CumSumByYear column?