It’s World Statistics Day! To honor the theme of the day, the JMP User Community is having conversations about the importance of trust in statistics and data. And we want to hear from you! Tell us the steps you take to ensure that your data is trustworthy.
Choose Language Hide Translation Bar
Highlighted
bio_grad
Level III

Calculating linear regression slope in formula

Hello,

 

I was wondering if there was a way to calculate the linear regression slope of column data within a data table formula.

A simple example of the table is below. The data is ordered and I would like to calculate the slope of a fit line of volume over time using the current row's data and previous points to perform the regression.

The idea is that I would like to see the how the calculated slope changes with increasing the number of data points. From doing this manually, the slope values end up approaching a given value and stay within +/-5% after 20 or so points.

InstanceTime (s)Volume (mL)Slope (mL/s)
100n/a
250.5m1
3100.8m2
............
NTVmN-1



If that is possible, my next challenge would be to set this up so that I can calculate the slope formula so that it uses N data points on either side of the current row for a symmetrical moving calculation instead.

 

 

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

Re: Calculating linear regression slope in formula

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
	x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
	y = :weight[Index( 1, Row() )]`;
	(Inv( x` * x ) * x` * y)[2];
,
	Empty()
)

 Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!

View solution in original post

22 REPLIES 22
Highlighted
txnelson
Super User

Re: Calculating linear regression slope in formula

Here is a script that gives you the output you want, it just does it a different way.  It uses the Fit Model Platform and loops through it several time, increasing the N.

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA\semiconductor capability.jmp" );

// Create an output table to hold the results
dtOut = New Table( "summary",
	New Column( "Instance" ),
	New Column( "N" ),
	New Column( "RSquare" ),
	New Column( "RSM" ),
	New Column( "Intercept" ),
	New Column( "b1" ),
	New Column( "b2" )
);

Loop Counter = 0;

// Loop across the different sizes
For( i = 5, i <= 111, i = i + 5,
	Loop Counter++;
	
	// Create a random sample data table from the original data table
	dt << New Column( "random", formula( Random Uniform() ) );
	dt << sort( by( dt:random ), , Order( Ascending ), replace table( 1 ) );
	dt << delete columns( "Random" );
	dt << select where( Row() <= i );
	dtSample = dt << subset( invisible, selected rows( 1 ), selected columns( 0 ) );
	
	// Run the regression
	fm = dtSample << Fit Model(
		Y( :NPN1 ),
		Effects( :PNP1, :PNP2 ),
		Personality( "Standard Least Squares" ),
		Emphasis( "Minimal Report" ),
		Run(
			:NPN1 << {Summary of Fit( 1 ), Analysis of Variance( 1 ), Parameter Estimates( 1 ),
			Lack of Fit( 0 ), Plot Actual by Predicted( 0 ), Plot Regression( 0 ),
			Plot Residual by Predicted( 0 ), Plot Effect Leverage( 0 )}
		)
	);
	
	// Capture the output data
	dtOut << Add Rows( 1 );
	dtOut:Instance[N Rows( dtOut )] = Loop Counter;
	dtOut:N[N Rows( dtOut )] = i;
	dtOut:RSquare[N Rows( dtOut )] = (Report( fm )["Summary of Fit"][1][2] << get)[1];
	dtOut:RSM[N Rows( dtOut )] = (Report( fm )["Summary of Fit"][1][2] << get)[3];
	dtOut:Intercept[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[1];
	dtOut:b1[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[2];
	dtOut:b2[N Rows( dtOut )] = (Report( fm )["Parameter Estimates"][1][3] << get)[3];
	
	// Clean up after our selfs
	fm << close window;
	Close( dtSample, nosave );
	
);
Jim
Highlighted
TRR21
Level III

Re: Calculating linear regression slope in formula

Dear @txnelson, can you show how to tweak the above code to run fit models on rows selections based on each 'wafer' id/value, instead of the random sampling? Thanks!
Highlighted
jerry_cooper
Staff (Retired)

Re: Calculating linear regression slope in formula

Try plugging this formula into a new formula column (this will calculate the slope up to the current row):

(Row() * Summation( i = 1, Row(), :Time[i] * :Volume[i] )
-Summation( i = 1, Row(), :Time[i] ) * Summation( i = 1, Row(), :Volume[i] )) / (
Row() * Summation( i = 1, Row(), :Time[i] ^ 2 )
-Summation( i = 1, Row(), :Time[i] ) ^ 2)

To do the second part of your question would only require changing the index on the summation, i.e. from i=Row()-N to Row()+N.


Hope this helps.

-Jerry


Highlighted

Re: Calculating linear regression slope in formula

You could also use a column formula. I opened Big Class.jmp from the Sample Data folder. (Select Help > Sample Data Library > Big Class.jmp > Open) I crated a new numeric column called moving slope. This example computes the moving slope with a span of 3 for Y = weight and X = height with the following column formula:

If( Row() > 1,
	x = J( Row(), 1, 1 ) || :height[Index( 1, Row() )]`;
	y = :weight[Index( 1, Row() )]`;
	(Inv( x` * x ) * x` * y)[2];
,
	Empty()
)

 Then plot moving slope versus row in Graph Builder.

Learn it once, use it forever!

View solution in original post

Highlighted
bio_grad
Level III

Re: Calculating linear regression slope in formula

This is exactly what I was looking for. I had no idea these operations were available!

 

Thank you!

Highlighted
Steve_Kim
Level III

Re: Calculating linear regression slope in formula

wow~~  Awesome!!!!!!

Thank you so much!!!!! : )

Highlighted
Frank_Bowler
Level I

Re: Calculating linear regression slope in formula

Hi

 

I am trying to apply your formula to calculate the moving slope with a window of 8, by which I mean (for Row() > 7), I want to calculate the slope of the linear regression for the first set of x and y data points [1 to 8], and then the next set of 8 data points incrementing by 1 [2 to 9], and so on. I am trying to calculate the maximum slope to get the initial rate of an enzymatic reaction.

 

Please can you tell me how you would modify your script to achieve this? I have tested your formula and I obtain the slope for all data points from Row 1 to Row n, rather than a moving window of defined size.

 

Thank you in advance

 

 

 

Highlighted

Re: Calculating linear regression slope in formula

This modified formula produces a moving slope of span 8:

If( Row() > 7,
	x = J( 8, 1, 1 ) || :height[Index( Row() - 7, Row() )]`;
	y = :weight[Index( Row() - 7, Row() )]`;
	(Inv( x` * x ) * x` * y)[2];
)
Learn it once, use it forever!
Highlighted
Frank_Bowler
Level I

Re: Calculating linear regression slope in formula

Excellent, that works, thank you

Article Labels

    There are no labels assigned to this post.