cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
JMP is taking Discovery online, April 16 and 18. Register today and join us for interactive sessions featuring popular presentation topics, networking, and discussions with the experts.
Choose Language Hide Translation Bar
terapin
Level VI

Moving Average with Missing Data

I'm trying to calculate a moving average of some data that has some holes in it.  I can't seem to figure out how to caculate a moving average that only includes non missing data and that only calculates the result for rows with non missing data.  Any suggestions would be appreciated.

 

New Table( "Untitled 5",
	Add Rows( 30 ),
	Compress File When Saved( 1 ),
	New Table Variable( "Moving Avg Window", 5 ),
	New Column( "DateTime",
		Numeric,
		"Continuous",
		Format( "m/d/y h:m", 19 ),
		Input Format( "m/d/y h:m" ),
		Set Values(
			[3225888000, 3225888900, 3225889800, 3225890700, 3225891600, 3225892500, 3225893400, 3225894300,
			3225895200, 3225896100, 3225897000, 3225897900, 3225898800, 3225899700, 3225900600, 3225901500,
			3225902400, 3225903300, 3225904200, 3225905100, 3225906000, 3225906900, 3225907800, 3225908700,
			3225909600, 3225910500, 3225911400, 3225912300, 3225913200, 3225914100]
		)
	),
	New Column( "data",
		Numeric,
		"Continuous",
		Format( "Best", Use thousands separator( 0 ), 15 ),
		Set Values( [5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5, 5, 5] )
	),
	New Column( "expected run avg",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [., ., ., ., 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5, 5, 5] )
	),
	New Column( "run avg",
		Numeric,
		"Continuous",
		Format( "Fixed Dec", 12, 1 ),
		Formula(
			If( Row() < :Moving Avg Window,
				.,
				Sum( :data[Index( Row() - (:Moving Avg Window - 1), Row() )] ) / :Moving Avg Window
			)
		)
	)
);
1 ACCEPTED SOLUTION

Accepted Solutions
ms
Super User (Alumni) ms
Super User (Alumni)

Re: Moving Average with Missing Data

The Mean() function neglects any missing values. The below formula stick to the time window but do not include missing values when calculating the mean.

If(Row() < :Moving Avg Window,
    Mean(:data[Index(1, Row())]),
    Mean(:data[Index(Row() - (:Moving Avg Window - 1), Row())])
);

 

 

View solution in original post

6 REPLIES 6
txnelson
Super User

Re: Moving Average with Missing Data

Here is a script that I believe will give you what you want

Names Default To Here( 1 );
dt = New Table( "Untitled 5",
	Add Rows( 30 ),
	Compress File When Saved( 1 ),
	New Table Variable( "Moving Avg Window", 5 ),
	New Column( "DateTime",
		Numeric,
		"Continuous",
		Format( "m/d/y h:m", 19 ),
		Input Format( "m/d/y h:m" ),
		Set Values(
			[3225888000, 3225888900, 3225889800, 3225890700, 3225891600, 3225892500, 3225893400,
			3225894300, 3225895200, 3225896100, 3225897000, 3225897900, 3225898800, 3225899700,
			3225900600, 3225901500, 3225902400, 3225903300, 3225904200, 3225905100, 3225906000,
			3225906900, 3225907800, 3225908700, 3225909600, 3225910500, 3225911400, 3225912300,
			3225913200, 3225914100]
		)
	),
	New Column( "data",
		Numeric,
		"Continuous",
		Format( "Best", Use thousands separator( 0 ), 15 ),
		Set Values(
			[5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5,
			5, 5]
		)
	),
	New Column( "expected run avg",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values(
			[., ., ., ., 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5,
			5, 5]
		)
	),
	New Column( "run avg",
		Numeric,
		"Continuous",
		Format( "Fixed Dec", 12, 1 ),
		Formula(
			If( Row() == 1,
				NonMissingRows = dt << get rows where( Is Missing( :data ) == 0 )
			);
			If( Loc( As List( NonMissingRows ), Row() ) < :Moving Avg Window,
				.,
				TargetPosition = Loc( As List( NonMissingRows ), Row() );
				MatrixofMAW = NonMissingRows[Index(
					TargetPosition - (Moving Avg Window - 1),
					TargetPosition
				)];
				Show( Row(), MatrixofMAW );
				Sum( :data[MatrixofMAW] ) / :Moving Avg Window;
			);
		)
	)
);
Jim
mpl34
Level III

Re: Moving Average with Missing Data

If you can't create a subset excluding the null values I think the formula below works--though I'm sure there is a better way. This gives a null value for rows without data but you can have it repeat the average by getting rid of the if statement.

 

If(Is Missing(:data), ., rsum = 0; rcount = 0; temprow = 0; (While(rcount < :Moving Avg Window, While(Is Missing(:data[Row() - temprow]) & Row() - temprow >= 1, temprow += 1); rsum += :data[Row() - temprow]; temprow += 1; rcount += 1); rsum / :Moving Avg Window))

 

 

ms
Super User (Alumni) ms
Super User (Alumni)

Re: Moving Average with Missing Data

The Mean() function neglects any missing values. The below formula stick to the time window but do not include missing values when calculating the mean.

If(Row() < :Moving Avg Window,
    Mean(:data[Index(1, Row())]),
    Mean(:data[Index(Row() - (:Moving Avg Window - 1), Row())])
);

 

 

vince_faller
Super User (Alumni)

Re: Moving Average with Missing Data

In the below link we talked about this and put a couple of faster algorithms for it. 

 

https://community.jmp.com/t5/Discussions/Efficient-moving-average-algorithm/td-p/8553

 

Vince Faller - Predictum
terapin
Level VI

Re: Moving Average with Missing Data

Thanks everyone for weighing in on this question.  MS' suggestion is certainly the cleanest and easiest for me to implement since it let's me define and specify backward and forward moving window size.  I certainly learned a lot from all the examples provided.  Thanks!

ian_jmp
Staff

Re: Moving Average with Missing Data

I didn't look at this case in detail. But I just wanted to also mention:

 

Col Moving Average(name, options, <By var, ...>)

 

which seems to have a lot of flexibility.