- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Moving Average with Missing Data
I'm trying to calculate a moving average of some data that has some holes in it. I can't seem to figure out how to caculate a moving average that only includes non missing data and that only calculates the result for rows with non missing data. Any suggestions would be appreciated.
New Table( "Untitled 5",
Add Rows( 30 ),
Compress File When Saved( 1 ),
New Table Variable( "Moving Avg Window", 5 ),
New Column( "DateTime",
Numeric,
"Continuous",
Format( "m/d/y h:m", 19 ),
Input Format( "m/d/y h:m" ),
Set Values(
[3225888000, 3225888900, 3225889800, 3225890700, 3225891600, 3225892500, 3225893400, 3225894300,
3225895200, 3225896100, 3225897000, 3225897900, 3225898800, 3225899700, 3225900600, 3225901500,
3225902400, 3225903300, 3225904200, 3225905100, 3225906000, 3225906900, 3225907800, 3225908700,
3225909600, 3225910500, 3225911400, 3225912300, 3225913200, 3225914100]
)
),
New Column( "data",
Numeric,
"Continuous",
Format( "Best", Use thousands separator( 0 ), 15 ),
Set Values( [5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5, 5, 5] )
),
New Column( "expected run avg",
Numeric,
"Continuous",
Format( "Best", 12 ),
Set Values( [., ., ., ., 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5, 5, 5] )
),
New Column( "run avg",
Numeric,
"Continuous",
Format( "Fixed Dec", 12, 1 ),
Formula(
If( Row() < :Moving Avg Window,
.,
Sum( :data[Index( Row() - (:Moving Avg Window - 1), Row() )] ) / :Moving Avg Window
)
)
)
);
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
The Mean() function neglects any missing values. The below formula stick to the time window but do not include missing values when calculating the mean.
If(Row() < :Moving Avg Window,
Mean(:data[Index(1, Row())]),
Mean(:data[Index(Row() - (:Moving Avg Window - 1), Row())])
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
Here is a script that I believe will give you what you want
Names Default To Here( 1 );
dt = New Table( "Untitled 5",
Add Rows( 30 ),
Compress File When Saved( 1 ),
New Table Variable( "Moving Avg Window", 5 ),
New Column( "DateTime",
Numeric,
"Continuous",
Format( "m/d/y h:m", 19 ),
Input Format( "m/d/y h:m" ),
Set Values(
[3225888000, 3225888900, 3225889800, 3225890700, 3225891600, 3225892500, 3225893400,
3225894300, 3225895200, 3225896100, 3225897000, 3225897900, 3225898800, 3225899700,
3225900600, 3225901500, 3225902400, 3225903300, 3225904200, 3225905100, 3225906000,
3225906900, 3225907800, 3225908700, 3225909600, 3225910500, 3225911400, 3225912300,
3225913200, 3225914100]
)
),
New Column( "data",
Numeric,
"Continuous",
Format( "Best", Use thousands separator( 0 ), 15 ),
Set Values(
[5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5,
5, 5]
)
),
New Column( "expected run avg",
Numeric,
"Continuous",
Format( "Best", 12 ),
Set Values(
[., ., ., ., 5, ., 5, 5, 5, 5, 5, 5, 5, 5, ., 5, 5, 5, 5, 5, 5, 5, ., ., 5, 5, 5, 5,
5, 5]
)
),
New Column( "run avg",
Numeric,
"Continuous",
Format( "Fixed Dec", 12, 1 ),
Formula(
If( Row() == 1,
NonMissingRows = dt << get rows where( Is Missing( :data ) == 0 )
);
If( Loc( As List( NonMissingRows ), Row() ) < :Moving Avg Window,
.,
TargetPosition = Loc( As List( NonMissingRows ), Row() );
MatrixofMAW = NonMissingRows[Index(
TargetPosition - (Moving Avg Window - 1),
TargetPosition
)];
Show( Row(), MatrixofMAW );
Sum( :data[MatrixofMAW] ) / :Moving Avg Window;
);
)
)
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
If you can't create a subset excluding the null values I think the formula below works--though I'm sure there is a better way. This gives a null value for rows without data but you can have it repeat the average by getting rid of the if statement.
If(Is Missing(:data), ., rsum = 0; rcount = 0; temprow = 0; (While(rcount < :Moving Avg Window, While(Is Missing(:data[Row() - temprow]) & Row() - temprow >= 1, temprow += 1); rsum += :data[Row() - temprow]; temprow += 1; rcount += 1); rsum / :Moving Avg Window))
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
The Mean() function neglects any missing values. The below formula stick to the time window but do not include missing values when calculating the mean.
If(Row() < :Moving Avg Window,
Mean(:data[Index(1, Row())]),
Mean(:data[Index(Row() - (:Moving Avg Window - 1), Row())])
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
In the below link we talked about this and put a couple of faster algorithms for it.
https://community.jmp.com/t5/Discussions/Efficient-moving-average-algorithm/td-p/8553
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
Thanks everyone for weighing in on this question. MS' suggestion is certainly the cleanest and easiest for me to implement since it let's me define and specify backward and forward moving window size. I certainly learned a lot from all the examples provided. Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Moving Average with Missing Data
I didn't look at this case in detail. But I just wanted to also mention:
Col Moving Average(name, options, <By var, ...>)
which seems to have a lot of flexibility.