Subscribe Bookmark RSS Feed

Replacing missing data by regressed values

guilhem

Community Trekker

Joined:

Jul 16, 2015

Hi

I have a time series for which the variable has missing values. I would like to replace the missing values by a linear model derived from the closest available values in time.

What would be the best way?

And if the script could do that to all the columns of my table (except the time), that would be even better

Thanks

Example

What I have

tOutput
010
1.
2.
430
10.
2010

What I want:

tOutput
010
115
220
430
1022.5
20

10

5 REPLIES
ron_horne

Super User

Joined:

Jun 23, 2011

try this for starters

Imputation Addin

guilhem

Community Trekker

Joined:

Jul 16, 2015

Thanks Ron but I don’t see anything in this that does interpolation?

txnelson

Super User

Joined:

Jun 22, 2012

here is a script that will do the task

Names Default To Here( 1 );

dt = Current Data Table();

For( rows = 1, rows <= N Rows( dt ), rows++,

       If( Rows == 1, counter = 0 );

       If( Is Missing( dt:output[rows] ) == 1,

              counter = counter + 1,

              If( counter > 0,

                     incr = (dt:output[Rows] - dt:output[Rows - counter - 1]) / (counter + 1);

                     For( i = Rows - counter, i <= Rows - 1, i++,

                           dt:output = dt:output[i - 1] + incr

                     );

                     counter = 0;

              )

       );

);

Jim
ms

Super User

Joined:

Jun 23, 2011

I had a similar problem some years ago and wrote a script to do imputation by "local regression".

Here's an updated version:


dt = Current Data Table();

n = N Rows(dt);

x = Column(1);// Column("t")

For(i = 2, i <= N Col(dt), i++,

    y = Column(i);

    For Each Row(

        If(1 < Row() < n & Is Missing(y[]),

         // find previous non-missing row

            Eval(Eval Expr(r1 = Max(dt << get rows where(!Is Missing(y[]) & Row() < Expr(Row())))));

          // find next non-missing row

            Eval(Eval Expr(r2 = Min(dt << get rows where(!Is Missing(y[]) & Row() > Expr(Row())))));

            y[] = y[r1] + (y[r2] - y[r1]) / (x[r2] - x[r1]) * (x[] - x[r1]);

        )

    )

   

);


guilhem

Community Trekker

Joined:

Jul 16, 2015

Spot on MS,

Cheers!