BookmarkSubscribe
Choose Language Hide Translation Bar

Calculating linear regression slope in formula

Hello,

I was wondering if there was a way to calculate the linear regression slope of column data within a data table formula.

A simple example of the table is below. The data is ordered and I would like to calculate the slope of a fit line of volume over time using the current row's data and previous points to perform the regression.

The idea is that I would like to see the how the calculated slope changes with increasing the number of data points. From doing this manually, the slope values end up approaching a given value and stay within +/-5% after 20 or so points.

 Instance Time (s) Volume (mL) Slope (mL/s) 1 0 0 n/a 2 5 0.5 m1 3 10 0.8 m2 ... ... ... ... N T V mN-1

If that is possible, my next challenge would be to set this up so that I can calculate the slope formula so that it uses N data points on either side of the current row for a symmetrical moving calculation instead.

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Calculating linear regression slope in formula

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
y = :weight[Index( 1, Row() )]`;
(Inv( x` * x ) * x` * y);
,
Empty()
)

Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!
12 REPLIES 12

Re: Calculating linear regression slope in formula

Here is a script that gives you the output you want, it just does it a different way.  It uses the Fit Model Platform and loops through it several time, increasing the N.

Names Default To Here( 1 );
dt = Open( "\$SAMPLE_DATA\semiconductor capability.jmp" );

// Create an output table to hold the results
dtOut = New Table( "summary",
New Column( "Instance" ),
New Column( "N" ),
New Column( "RSquare" ),
New Column( "RSM" ),
New Column( "Intercept" ),
New Column( "b1" ),
New Column( "b2" )
);

Loop Counter = 0;

// Loop across the different sizes
For( i = 5, i <= 111, i = i + 5,
Loop Counter++;

// Create a random sample data table from the original data table
dt << New Column( "random", formula( Random Uniform() ) );
dt << sort( by( dt:random ), , Order( Ascending ), replace table( 1 ) );
dt << delete columns( "Random" );
dt << select where( Row() <= i );
dtSample = dt << subset( invisible, selected rows( 1 ), selected columns( 0 ) );

// Run the regression
fm = dtSample << Fit Model(
Y( :NPN1 ),
Effects( :PNP1, :PNP2 ),
Personality( "Standard Least Squares" ),
Emphasis( "Minimal Report" ),
Run(
:NPN1 << {Summary of Fit( 1 ), Analysis of Variance( 1 ), Parameter Estimates( 1 ),
Lack of Fit( 0 ), Plot Actual by Predicted( 0 ), Plot Regression( 0 ),
Plot Residual by Predicted( 0 ), Plot Effect Leverage( 0 )}
)
);

// Capture the output data
dtOut << Add Rows( 1 );
dtOut:Instance[N Rows( dtOut )] = Loop Counter;
dtOut:N[N Rows( dtOut )] = i;
dtOut:RSquare[N Rows( dtOut )] = (Report( fm )["Summary of Fit"] << get);
dtOut:RSM[N Rows( dtOut )] = (Report( fm )["Summary of Fit"] << get);
dtOut:Intercept[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"] << get);
dtOut:b1[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"] << get);
dtOut:b2[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"] << get);

// Clean up after our selfs
fm << close window;
Close( dtSample, nosave );

);
Jim

Re: Calculating linear regression slope in formula

Dear @txnelson, can you show how to tweak the above code to run fit models on rows selections based on each 'wafer' id/value, instead of the random sampling? Thanks!

Re: Calculating linear regression slope in formula

Try plugging this formula into a new formula column (this will calculate the slope up to the current row):

(Row() * Summation( i = 1, Row(), :Time[i] * :Volume[i] )
-Summation( i = 1, Row(), :Time[i] ) * Summation( i = 1, Row(), :Volume[i] )) / (
Row() * Summation( i = 1, Row(), :Time[i] ^ 2 )
-Summation( i = 1, Row(), :Time[i] ) ^ 2)

To do the second part of your question would only require changing the index on the summation, i.e. from i=Row()-N to Row()+N.

Hope this helps.

-Jerry

Re: Calculating linear regression slope in formula

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
y = :weight[Index( 1, Row() )]`;
(Inv( x` * x ) * x` * y);
,
Empty()
)

Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!

Re: Calculating linear regression slope in formula

This is exactly what I was looking for. I had no idea these operations were available!

Thank you!

Re: Calculating linear regression slope in formula

wow~~  Awesome!!!!!!

Thank you so much!!!!! : )

Re: Calculating linear regression slope in formula

Hi

I am trying to apply your formula to calculate the moving slope with a window of 8, by which I mean (for Row() > 7), I want to calculate the slope of the linear regression for the first set of x and y data points [1 to 8], and then the next set of 8 data points incrementing by 1 [2 to 9], and so on. I am trying to calculate the maximum slope to get the initial rate of an enzymatic reaction.

Please can you tell me how you would modify your script to achieve this? I have tested your formula and I obtain the slope for all data points from Row 1 to Row n, rather than a moving window of defined size.

Thank you in advance

Re: Calculating linear regression slope in formula

This modified formula produces a moving slope of span 8:

If( Row() > 7,
x = J( 8, 1, 1 ) || :height[Index( Row() - 7, Row() )]`;
y = :weight[Index( Row() - 7, Row() )]`;
(Inv( x` * x ) * x` * y);
)
Learn it once, use it forever!

Re: Calculating linear regression slope in formula

Excellent, that works, thank you