Choose Language Hide Translation Bar
Highlighted

## Replacing missing data by regressed values

Hi

I have a time series for which the variable has missing values. I would like to replace the missing values by a linear model derived from the closest available values in time.

What would be the best way?

And if the script could do that to all the columns of my table (except the time), that would be even better

Thanks

Example

What I have

tOutput
010
1.
2.
430
10.
2010

What I want:

tOutput
010
115
220
430
1022.5
20

10

5 REPLIES 5

## Re: Replacing missing data by regressed values

try this for starters

## Re: Replacing missing data by regressed values

Thanks Ron but I don’t see anything in this that does interpolation?

## Re: Replacing missing data by regressed values

here is a script that will do the task

Names Default To Here( 1 );

dt = Current Data Table();

For( rows = 1, rows <= N Rows( dt ), rows++,

If( Rows == 1, counter = 0 );

If( Is Missing( dt:output[rows] ) == 1,

counter = counter + 1,

If( counter > 0,

incr = (dt:output[Rows] - dt:output[Rows - counter - 1]) / (counter + 1);

For( i = Rows - counter, i <= Rows - 1, i++,

dt:output = dt:output[i - 1] + incr

);

counter = 0;

)

);

);

Jim

## Re: Replacing missing data by regressed values

I had a similar problem some years ago and wrote a script to do imputation by "local regression".

Here's an updated version:

dt = Current Data Table();

n = N Rows(dt);

x = Column(1);// Column("t")

For(i = 2, i <= N Col(dt), i++,

y = Column(i);

For Each Row(

If(1 < Row() < n & Is Missing(y[]),

// find previous non-missing row

Eval(Eval Expr(r1 = Max(dt << get rows where(!Is Missing(y[]) & Row() < Expr(Row())))));

// find next non-missing row

Eval(Eval Expr(r2 = Min(dt << get rows where(!Is Missing(y[]) & Row() > Expr(Row())))));

y[] = y[r1] + (y[r2] - y[r1]) / (x[r2] - x[r1]) * (x[] - x[r1]);

)

)

);

Spot on MS,

Cheers!