How to calculate moving average of column with 10 million entries ?

I am trying to run the formula below on a column with 10 million rows to calculate a moving average, but the script runs forever and JMP stops responding:

thisTable << New Column( "av", Numeric, Continuous, Formula( Mean( :x[Index( Row() - 99999, Row() )] ) ) );

Is there a limitation, or is something wrong with the formula? How can I calculate the moving average in this case?

Re: How to calculate moving average of column with 10 million entries ?

Try this and see if it runs faster:

Col Moving Average( :x, 1, 99999, 0 )

See the Scripting Index for documentation and an example.

Or you might try this. It avoids calculating the mean of 100,000 rows for every row, and cuts the work per row down to a single subtraction, a single addition, and one division.

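If( Row() == 1,
    theLag = 100000;
    // Sum of the first theLag rows, computed once
    theSum = Sum( :x[Index( 1, theLag )] );
);
If(
    // Window not yet full: fall back to the built-in function
    Row() < theLag, theAvg = Col Moving Average( :x, 1, theLag, 0 ),
    Row() == theLag, theAvg = theSum / theLag,
    // Note: theSum is not updated here; this is corrected later in the thread
    theAvg = ((theSum - :x[Row() - theLag]) + :x) / theLag
);
theAvg;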
Jim
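
For anyone who wants to check the arithmetic outside JMP, here is a minimal sketch in Python (not JSL) of the idea described above: keep a running sum, add the newest value, and subtract the value that just left the window, instead of re-averaging the whole window on every row. The function names and sample data are made up for illustration, and the sketch assumes the first rows are averaged over however many rows are available so far.

```python
def naive_moving_average(values, window):
    """Re-average the trailing `window` values at every row (about n * window additions)."""
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)   # shorter window for the first rows
        chunk = values[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def running_sum_moving_average(values, window):
    """Same averages using one addition and at most one subtraction per row."""
    result = []
    total = 0.0
    for i, value in enumerate(values):
        total += value                   # add the newest value
        if i >= window:
            total -= values[i - window]  # drop the value that left the window
        result.append(total / min(i + 1, window))
    return result


data = [4, 8, 15, 16, 23, 42]            # hypothetical sample values
print(naive_moving_average(data, 3))
print(running_sum_moving_average(data, 3))
```

With 10 million rows and a 100,000-row window, the first approach needs on the order of 10^12 additions, while the second needs on the order of 10^7, which is why the running-sum form has a chance of finishing.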

Re: How to calculate moving average of column with 10 million entries ?

This script does not work on JMP 12 (Col Moving Average is not supported in JMP 12).

Re: How to calculate moving average of column with 10 million entries ?

The Col Moving Average() function can easily be taken out of the formula. Following the approach in the second formula in my previous post, calculating the moving average by summing the data and then dividing will give you what you want. Please make sure you understand how the code that is provided to you works. Here is a new modification of the formula:

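If( Row() == 1,
    theLag = 100000;
    theSum = 0;
);
If(
    // Window not yet full: accumulate the sum and average over the rows so far
    Row() <= theLag,
    theSum = theSum + :x;
    theAvg = theSum / Row(),
    // Note: theSum is not updated in this branch; see the correction later in the thread
    theAvg = ((theSum - :x[Row() - theLag]) + :x) / theLag
);
theAvg;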
Jim

Re: How to calculate moving average of column with 10 million entries ?

I tried the suggested solution, but it does not give the correct answer. Could you please help?

Re: How to calculate moving average of column with 10 million entries ? (Accepted Solution)

You are correct, the formula was in error.  I was not setting the value of "theSum" correctly.  The following should get correct results:

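If( Row() == 1,
    theLag = 3;    // window length, in rows
    theSum = 0;
);
If( Row() <= theLag,
    // Window not yet full: average over the rows seen so far
    theSum = theSum + :yield;
    theAvg = theSum / Row();
,
    // Full window: drop the value that left the window and add the current one
    theAvg = ((theSum - :yield[Row() - theLag]) + :yield) / theLag;
    theSum = (theSum - :yield[Row() - theLag]) + :yield;
);
theAvg;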
Jim
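
To see why the missing update to theSum mattered, here is a small Python sketch (not JSL) that mirrors the logic of the earlier formula and of the corrected one, row by row. The function names and the sample values are hypothetical, and this is only a cross-check of the arithmetic under those assumptions, not a replacement for the column formula.

```python
def moving_average_stale_sum(values, lag):
    """Mirrors the earlier formula: the sum is never stored back once the window is full."""
    result, the_sum = [], 0.0
    for row, value in enumerate(values, start=1):   # row is 1-based, like Row()
        if row <= lag:
            the_sum += value
            result.append(the_sum / row)
        else:
            # the_sum is read but not updated, so it stays frozen at the first `lag` rows
            result.append((the_sum - values[row - lag - 1] + value) / lag)
    return result


def moving_average_updated_sum(values, lag):
    """Mirrors the corrected formula: the running sum is updated on every row."""
    result, the_sum = [], 0.0
    for row, value in enumerate(values, start=1):
        if row <= lag:
            the_sum += value
            result.append(the_sum / row)
        else:
            the_sum = the_sum - values[row - lag - 1] + value
            result.append(the_sum / lag)
    return result


yields = [1, 2, 3, 4, 5, 6]                         # hypothetical sample values
print(moving_average_stale_sum(yields, 3))          # [1.0, 1.5, 2.0, 3.0, 3.0, 3.0]
print(moving_average_updated_sum(yields, 3))        # [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
```

The two versions agree through the first row after the window fills, which is probably why the problem was easy to miss. After that, the stale sum keeps referring to the first three rows and the results drift apart.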

Re: How to calculate moving average of column with 10 million entries ?

Yes, now it doesn't throw any error, but I still have to check whether it works for 10 million entries.

Re: How to calculate moving average of column with 10 million entries ?

It should work for 10,000,000 rows.

Here is a faster version of the formula. It removes the need for a second calculation of the sum:

If( Row() == 1,
    theLag = 3;
    theSum = 0;
);
If( Row() <= theLag,
    theSum = theSum + :yield;
    theAvg = theSum / Row();
,
    // Update the running sum once, then reuse it for the average
    theSum = (theSum - :yield[Row() - theLag]) + :yield;
    theAvg = theSum / theLag;
);
theAvg;
Jim

Re: How to calculate moving average of column with 10 million entries ?

For the point-and-click folks out there like me: you can get a moving average on a column by right-clicking the header of the column of interest and selecting "New Formula Column" > Row > Moving Average. You will have to make another click or two to decide whether or not to use weighting, but it was pretty fast on my simulated 10M-row data table once I clicked OK.

HTH

Bill