I am trying to find an efficient way to calculate Rolling 3-day CUMULATIVE SUM.
I have some for loops that work, but the data sets I am working with are ~500k rows with many more GROUPS, SUBGROUPS and VALUE columns so it takes ~30 minutes to calculate. The example data set is very stripped down and basic.
I moved the values into a matrix and that speeds it up a little, but not enough.
There is a GROUP and SUBGROUP columns that are to be CUMULATIVE SUM seperately from eachother. Also for each row the CUMULATIVESUM can only be for DATE - 3 days. Anything before :DATE[]-3 days, and after :DATE[] is not used for that row's CUMULATIVE SUM.
I have a feeling a column formula using LAG will probably be fastest, but I cannot get it to work properly.
Speed is very important.
I have tried subsetting the tables, and that just made it slower.
See table and JSL. The top two JSL methods work perfectly, the third JSL method doesn't work properly.
Thank You!
[edit] JMP Pro 12.2.0
dt = Current Data Table();
For( i = 1, i <= N Rows( dt ), i = i + 1,
:CUMULATIVE_SUM_VALUE[i] = :VALUE[i];
For( c = i - 1, c > 0, c = c - 1,
If( :SUBGROUP[c] == :SUBGROUP[i] & :GROUP[c] == :GROUP[i] & (:DATE[i] - :DATE[c]) <= In Days( 3 ),
:CUMULATIVE_SUM_VALUE[i] = :CUMULATIVE_SUM_VALUE[i] + :VALUE[c],
c = -1
)
);
);
//////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////
//1=GROUP, 2=SUBGROUP, 3=DATE, 4=VALUE, 5=CUMULATIVE_SUM_VALUE
TT = dt << Get All Columns As Matrix;
For( i = 1, i <= N Rows( TT ), i = i + 1,
TT[i, 5] = TT[i, 4];
For( c = i - 1, c > 0, c = c - 1,
If( TT[c, 2] == TT[i, 2] & TT[c, 1] == TT[i, 1] & (TT[i, 3] - TT[c, 3]) <= In Days( 3 ),
TT[i, 5] = TT[i, 5] + TT[c, 4],
c = -1
)
);
);
For Each Row( :CUMULATIVE_SUM_VALUE = TT[Row(), 5] );
//////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////
dt << New Column( "CUMULATIVE_SUM_VALUE_2",
Formula(
//:VALUE;
If(
Row() == 1, :VALUE,
:GROUP == Lag( :GROUP, 1 ) & :SUBGROUP == Lag( :SUBGROUP, 1 ) & :DATE - Lag( :DATE, 1 ) <= In Days( 3 ), :VALUE + Lag( :CUMULATIVE_SUM_VALUE, 1 )
,
:GROUP != Lag( :GROUP, 1 ) & :SUBGROUP != Lag( :SUBGROUP, 1 ) & :DATE - Lag( :DATE, 1 ) <= In Days( 3 ), :VALUE
,
:GROUP == Lag( :GROUP, 1 ) & :SUBGROUP == Lag( :SUBGROUP, 1 ) & :DATE - Lag( :DATE, 1 ) > In Days( 3 ), :VALUE
,
:GROUP != Lag( :GROUP, 1 ) | :SUBGROUP != Lag( :SUBGROUP, 1 ), :VALUE
)
),
Eval Formula
);