Subscribe Bookmark RSS Feed

Calculating linear regression slope in formula

bio_grad

Community Trekker

Joined:

Jun 10, 2016

Hello,

 

I was wondering if there was a way to calculate the linear regression slope of column data within a data table formula.

A simple example of the table is below. The data is ordered and I would like to calculate the slope of a fit line of volume over time using the current row's data and previous points to perform the regression.

The idea is that I would like to see the how the calculated slope changes with increasing the number of data points. From doing this manually, the slope values end up approaching a given value and stay within +/-5% after 20 or so points.

InstanceTime (s)Volume (mL)Slope (mL/s)
100n/a
250.5m1
3100.8m2
............
NTVmN-1



If that is possible, my next challenge would be to set this up so that I can calculate the slope formula so that it uses N data points on either side of the current row for a symmetrical moving calculation instead.

 

 

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
markbailey

Staff

Joined:

Jun 23, 2011

Solution

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
	x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
	y = :weight[Index( 1, Row() )]`;
	(Inv( x` * x ) * x` * y)[2];
,
	Empty()
)

 Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!
5 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

Here is a script that gives you the output you want, it just does it a different way.  It uses the Fit Model Platform and loops through it several time, increasing the N.

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA\semiconductor capability.jmp" );

// Create an output table to hold the results
dtOut = New Table( "summary",
	New Column( "Instance" ),
	New Column( "N" ),
	New Column( "RSquare" ),
	New Column( "RSM" ),
	New Column( "Intercept" ),
	New Column( "b1" ),
	New Column( "b2" )
);

Loop Counter = 0;

// Loop across the different sizes
For( i = 5, i <= 111, i = i + 5,
	Loop Counter++;
	
	// Create a random sample data table from the original data table
	dt << New Column( "random", formula( Random Uniform() ) );
	dt << sort( by( dt:random ), , Order( Ascending ), replace table( 1 ) );
	dt << delete columns( "Random" );
	dt << select where( Row() <= i );
	dtSample = dt << subset( invisible, selected rows( 1 ), selected columns( 0 ) );
	
	// Run the regression
	fm = dtSample << Fit Model(
		Y( :NPN1 ),
		Effects( :PNP1, :PNP2 ),
		Personality( "Standard Least Squares" ),
		Emphasis( "Minimal Report" ),
		Run(
			:NPN1 << {Summary of Fit( 1 ), Analysis of Variance( 1 ), Parameter Estimates( 1 ),
			Lack of Fit( 0 ), Plot Actual by Predicted( 0 ), Plot Regression( 0 ),
			Plot Residual by Predicted( 0 ), Plot Effect Leverage( 0 )}
		)
	);
	
	// Capture the output data
	dtOut << Add Rows( 1 );
	dtOut:Instance[N Rows( dtOut )] = Loop Counter;
	dtOut:N[N Rows( dtOut )] = i;
	dtOut:RSquare[N Rows( dtOut )] = (Report( fm )["Summary of Fit"][1][2] << get)[1];
	dtOut:RSM[N Rows( dtOut )] = (Report( fm )["Summary of Fit"][1][2] << get)[3];
	dtOut:Intercept[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[1];
	dtOut:b1[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[2];
	dtOut:b2[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[3];
	
	// Clean up after our selfs
	fm << close window;
	Close( dtSample, nosave );
	
);
Jim
jerry_cooper

Staff

Joined:

Jul 10, 2014

Try plugging this formula into a new formula column (this will calculate the slope up to the current row):

(Row() * Summation( i = 1, Row(), :Time[i] * :Volume[i] )
-Summation( i = 1, Row(), :Time[i] ) * Summation( i = 1, Row(), :Volume[i] )) / (
Row() * Summation( i = 1, Row(), :Time[i] ^ 2 )
-Summation( i = 1, Row(), :Time[i] ) ^ 2)

To do the second part of your question would only require changing the index on the summation, i.e. from i=Row()-N to Row()+N.


Hope this helps.

-Jerry


markbailey

Staff

Joined:

Jun 23, 2011

Solution

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
	x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
	y = :weight[Index( 1, Row() )]`;
	(Inv( x` * x ) * x` * y)[2];
,
	Empty()
)

 Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!
bio_grad

Community Trekker

Joined:

Jun 10, 2016

This is exactly what I was looking for. I had no idea these operations were available!

 

Thank you!

jameskim

Community Trekker

Joined:

May 17, 2012

wow~~  Awesome!!!!!!

Thank you so much!!!!! : )