Regression P-Values in a Column based on the Values from other Columns
Hi... I am trying to find an alternative to Nelson Rule number 3 (6 points increasing or decreasing) that uses a regression over a number of successive points, an alpha value, and the resulting p-value to help me flag when a signal is large enough to overcome noise. This would be used for tool-wear types of control charts.
Suppose I had a data set that looks like the one below (kept simple), where X is my independent variable and Y is the dependent variable. I would like to calculate the p-value of the regression over successive points (N=3 in the example below) to help me understand when there is a signal large enough to overcome the noise.
The p-values in the table below are only illustrative; they were not calculated.
I recognize that N will likely need to be larger (although, of course, it depends on the size of the signal).
Can I create a script that will do this so that I can play with N and alpha to help me understand when the regression is significantly different from the null? I don't really care about the annotation on the control chart as long as I can have the p-value in the table.
Thanks
X | Y | P-Value | Slope | SubGroup Size |
1 | 3 | | | |
2 | 6 | | | |
3 | 9 | 0.05 | | 3 |
4 | 12 | 0.05 | | 3 |
5 | 12 | 0.1 | | 3 |
6 | 12 | 0.54 | | 3 |
7 | 11 | 0.5 | | 3 |
8 | 13 | 0.1 | | 3 |
9 | 15 | 0.07 | | 3 |
10 | 17 | 0.05 | | 3 |
Re: Regression P-Values in a Column based on the Values from other Columns
It would be a simple matter to loop through a data table, and build one's x and y matrices and then perform a regression, using the Linear Regression() function. The decision of what values to place in the 2 matrices would be the item to vary.
Here is the example from the Scripting Index for the Linear Regression() function:
Names Default To Here( 1 );
/*Simple Linear Regression: y = intercept + beta * x + error*/
y = [3, 5, 7, 5];
X = [1, 2, 3, 4];
{Estimates, Std_Error, Diagnostics} =
Linear Regression( y, X, <<printToLog );
/*
t_ratio = Diagnostics["t_ratio"];
p_value = Diagnostics["p_value"];
RSquare = Diagnostics["RSquare"];
RSquare Adj = Diagnostics["RSquare Adj"];
*/
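The loop described above might be sketched as follows. This is a minimal sketch, not a tested solution: it assumes the active data table has numeric columns named X and Y, and the names windowSize, "Moving Slope", and "Moving P-Value" are made up for illustration. It computes the slope and its two-sided p-value directly with matrix algebra and t Distribution() rather than reading them from Linear Regression(), so it does not rely on the layout of the Diagnostics result.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();   // assumption: the table with columns X and Y is active
windowSize = 3;              // hypothetical name for N; must be at least 3 so that df >= 1

// New columns to hold the results (names are illustrative)
slopeCol = dt << New Column( "Moving Slope", Numeric, Continuous );
pCol = dt << New Column( "Moving P-Value", Numeric, Continuous );

// Pull the data into column vectors once
xAll = Column( dt, "X" ) << Get Values;
yAll = Column( dt, "Y" ) << Get Values;

For( i = windowSize, i <= N Rows( dt ), i++,
	rows = Index( i - windowSize + 1, i );         // the last windowSize rows ending at row i
	x = J( windowSize, 1, 1 ) || xAll[rows, 1];    // design matrix: intercept column, then X
	y = yAll[rows, 1];
	xtxInv = Inv( x` * x );                        // fails if all X in the window are equal
	b = xtxInv * x` * y;                           // b[1] = intercept, b[2] = slope
	resid = y - x * b;
	sse = (resid` * resid)[1];                     // residual sum of squares
	df = windowSize - 2;
	se = Sqrt( sse / df * xtxInv[2, 2] );          // standard error of the slope
	t = b[2] / se;                                 // a perfect fit gives se = 0
	p = 2 * (1 - t Distribution( Abs( t ), df ));  // two-sided p-value for slope = 0
	slopeCol[i] = b[2];
	pCol[i] = p;
);
```

Changing windowSize and rerunning gives a new pair of columns to compare against a chosen alpha. When a window is perfectly linear (for example Y = 3, 6, 9) the standard error is zero, so the p-value for that row may come back missing or essentially zero.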
Re: Regression P-Values in a Column based on the Values from other Columns
Adding to @txnelson, there was an earlier discussion about setting up a column formula to compute a moving slope. I suggest that you find it and read it for useful background. The basis for that solution is the function that Jim suggested here. You could expand the formula to result in a row state change that is then used by the control chart to identify where the signal of a change might have occurred.
There are many built-in features that can be used to implement this idea, and they do not all require a script.
Re: Regression P-Values in a Column based on the Values from other Columns
I am interested in the built-in functions to create these new columns, as I would prefer not to use a script. I did look up your solution to Moving Slope (copied below), but the solution is way over my head. I am too old (and too stupid) and can no longer do this kind of matrix math, where you are multiplying by transposed matrices and then taking inverses... honestly, I can't follow this. That is why I would find a script more useful, as scripts tend to be more linear.
But thanks.
If( Row() > 2,
x = J( Row(), 1, 1 ) || :MX[Index( 1, Row() )]`;
y = :MY[Index( 1, Row() )]`;
(Inv( x` * x ) * x` * y)[2];
)
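For anyone else reading the formula above, it may help to see it as ordinary least squares written compactly. Note that, as quoted, it uses rows 1 through the current row (an expanding window), not the last N rows. The second element of the result is the slope, which reduces to the familiar textbook form below. The p-value step is not part of the quoted formula; it is the standard t-test on a slope, added here as a sketch.

```latex
\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y,
\qquad
b_1 = \hat{\beta}_2
    = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}

\mathrm{SE}(b_1) = \sqrt{\frac{\mathrm{SSE}/(n-2)}{\sum_{i=1}^{n}(x_i-\bar{x})^2}},
\qquad
t = \frac{b_1}{\mathrm{SE}(b_1)},
\qquad
p = 2\,P\!\left(T_{n-2} > |t|\right)

% Worked example with n = 3, rows X = 5, 6, 7 and Y = 12, 12, 11 from the table:
% \bar{x} = 6,\ \bar{y} = 35/3,\ \sum(x_i-\bar{x})^2 = 2,\ \sum(x_i-\bar{x})(y_i-\bar{y}) = -1
b_1 = \frac{-1}{2} = -0.5,
\qquad
\mathrm{SSE} = \tfrac{1}{6},
\qquad
\mathrm{SE}(b_1) = \sqrt{\tfrac{1/6}{2}} \approx 0.289,
\qquad
t \approx -1.732,
\qquad
p = \tfrac{1}{3} \approx 0.33 \quad (\text{1 degree of freedom})
```

With only one degree of freedom the p-value stays large even for a visible trend, which is consistent with the earlier remark that N will likely need to be larger than 3.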