Hi
I have a time series for which the variable has missing values. I would like to replace the missing values by a linear model derived from the closest available values in time.
What would be the best way?
And if the script could do that to all the columns of my table (except the time), that would be even better
Thanks
Example
What I have
t | Output |
---|---|
0 | 10 |
1 | . |
2 | . |
4 | 30 |
10 | . |
20 | 10 |
What I want:
t | Output |
---|---|
0 | 10 |
1 | 15 |
2 | 20 |
4 | 30 |
10 | 22.5 |
20 | 10 |
Thanks Ron but I don’t see anything in this that does interpolation?
here is a script that will do the task
Names Default To Here( 1 );
dt = Current Data Table();
For( rows = 1, rows <= N Rows( dt ), rows++,
If( Rows == 1, counter = 0 );
If( Is Missing( dt:output[rows] ) == 1,
counter = counter + 1,
If( counter > 0,
incr = (dt:output[Rows] - dt:output[Rows - counter - 1]) / (counter + 1);
For( i = Rows - counter, i <= Rows - 1, i++,
dt:output = dt:output[i - 1] + incr
);
counter = 0;
)
);
);
I had a similar problem some years ago and wrote a script to do imputation by "local regression".
Here's an updated version:
dt = Current Data Table();
n = N Rows(dt);
x = Column(1);// Column("t")
For(i = 2, i <= N Col(dt), i++,
y = Column(i);
For Each Row(
If(1 < Row() < n & Is Missing(y[]),
// find previous non-missing row
Eval(Eval Expr(r1 = Max(dt << get rows where(!Is Missing(y[]) & Row() < Expr(Row())))));
// find next non-missing row
Eval(Eval Expr(r2 = Min(dt << get rows where(!Is Missing(y[]) & Row() > Expr(Row())))));
y[] = y[r1] + (y[r2] - y[r1]) / (x[r2] - x[r1]) * (x[] - x[r1]);
)
)
);
Spot on MS,
Cheers!