turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- Rolling t-test and F-test down a column

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Jul 14, 2016 6:56 AM
(1516 views)

I have one minute level data from a process vessel. The level data contains some noise. Using the level data, I want to determine is the vessel filling, emptying, or holding steady.

I have used simple one-row Dif(x,1) > y and even two point (Lag(x,-1) - Lag(x,1)) > y to find the changes from one state to another. Even with the latter there are false positives. From one minute to another the vessel can be filled. Some of the fills/empties last ~10 min.

I spoke with a process control colleague about this and he suggested a method that is used to find (un)steady-states:

- do a t-test to compare the means of row(1-10) with rows(11-20) to see if the means are same/different to determine fill/empty/hold
- do an F-test of the same data to see the variances are different to determine un/steady state
- repeat this down every row in the column (i have 785k rows)

He did this using Excel using the T.TEST() and F.TEST() functions.

I looked in the scripting index and could not find an obvious way to accomplish this.

Any suggestions?

2 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Jul 14, 2016 11:58 AM
(1380 views)

hi MarkSChahl

first i have to tell you that i think this violates quite a few assumptions of the t test. otherwise, try the following script.

once it is done you will have a connecting letter report which you can sort by the levels of column 2 in order to get the summary in the order of time.

best,

ron

New Table**(** "DATA",

Add Rows**(** **500** **)**,

New Column**(** "Column 1",

Numeric,

Continuous,

Format**(** "Best", **12** **)**,

Formula**(** Sin**(** Row**()** / **100** **)** + Random Normal**()** **)**

**)**,

New Column**(** "Column 2",

Numeric,

Ordinal,

Format**(** "Best", **12** **)**,

Formula**(** Floor**(** Row**()** / **10** **)** **)**

**)**

**)**;

output = Oneway**(**

Y**(** :Column 1 **)**,

X**(** :Column 2 **)**,

Each Pair**(** **1**, Connecting Letters Report**(** **1** **)** **)**;

Name**(** "Means/Anova" **)(****0)**,

t Test**(** **0** **)**,

Compare Densities**(** **0** **)**,

Box Plots**(** **1** **)**,

Mean Diamonds**(** **1** **)**,

Points Jittered**(** **1** **)**,

**)**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Jul 16, 2016 8:16 AM
(1380 views)

Hi Mark,

Another approach that might accomplish the objective would be to do a Bivariate plot and then select Flexible->Fit Spline->0.01,flexible from the bivariate report's red triangle menu. Use the slider to smooth out the curve to your liking (you may need to zoom in on various parts of the graph depending upon how dense your data are), and then select Save Predicteds from the fit menu. Then go to the formula column that was created and take the derivative with respect to your time variable. The result will be the slope of the curve at any given observation. From there, it's just a matter of deciding an appropriate interval for what you want to consider as 0 slope, i.e. "holding steady". Anything less would be "emptying" and anything more would be "filling".

Just a note about response times: in version 12 each action, opening the formula editor and each subsequent click, took ~20-30 seconds for a data table with ~45K rows. Response times in the version 13 beta were instantaneous.

Hope this helps.

-Jerry