Integration of GrubbsOutlierTest2 into multivariable data table

Created:
Jan 16, 2019 6:20 AM
| Last Modified: Jan 18, 2019 8:40 AM
(1423 views)

As previously stated, I am dealing with a complex set of data tables joined using SQL scripts. With help from this community, I have managed to create an "application"-style script that creates plots and distributions based on user-selected conditions.

The last set of issues involves those pesky outliers. As usual, my problem lies in looping through the test conditions, identifying the outliers, and then removing them from the data set. I understand that removing them can be risky, but given our complex tests, our parts either fall in a tight distribution or fail catastrophically. I am trying to implement this: if g > g0, find the outliers and delete them.
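
For readers unfamiliar with the notation: assuming g and g0 denote the usual two-sided Grubbs' test statistic and its critical value, the comparison can be written as follows for a group of n values with mean ȳ and sample standard deviation s. The significance level α is an assumed symbol, not something fixed by this post.

```latex
g = \frac{\max_i \lvert y_i - \bar{y} \rvert}{s},
\qquad
g_0 = \frac{n-1}{\sqrt{n}} \sqrt{\frac{t^2}{n - 2 + t^2}},
\qquad
t = t_{1 - \alpha/(2n),\; n-2}
```

Here t is the 1 − α/(2n) quantile of Student's t distribution with n − 2 degrees of freedom, and an outlier is flagged when g > g0.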

Attached: 3 subsets from the main table and the main table.

I found this thread that does it for a single column, but again, how does one loop through everything?

I have quite a few scripts that loop through some of these conditions, but I cannot figure out the correct combination.

Again thanks for the help.

5 REPLIES


Let's focus on your table #1 below, and on just 'test_1' within that. I think you are saying that you want to consider each of the 3 * 2 * 16 = 96 'looping conditions' to define a group, and to look for outliers within each of these groups. Is that correct, please? But the attached table shows that #1 has 105 such groups, not 96?


Re: Integration of GrubbsOutlierTest2 into multivariable data table


I apologize for not communicating this clearly. test_1 includes 3 different set_ids: {7000, 7100, 7200}.

set_id 7000 has 1 channel (switching) × 2 supplies = 2 test outputs

set_id 7100 has 16 channels × 2 supplies = 32 test outputs

set_id 7200 has 16 channels × 2 supplies = 32 test outputs

Total = 66

I got ahead of myself and forgot that set_id 7000 has just 2 conditions. All 3 tests together = 105.

I have also been giving it some thought and need to rewrite the original post. I think it is going to be easier to subset each of the tests (_1, _2, _3) and then do what I need to do with the distributions. The main table is good for my graph plots. The subsets will be the tests and test conditions that we screen at...


Re: Integration of GrubbsOutlierTest2 into multivariable data table

The script for Grubbs' outlier test can use a grouping variable in the By analysis role.

Learn it once, use it forever!


Yes sir; however, I would like to use that in an automated form that evaluates g > g0. Then, using that calculation for each test By: channel, the script should locate the outliers and exclude them from the data set, instead of my doing it manually.
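
A minimal JSL sketch of that automated step, offered as an untested illustration rather than a finished solution. It assumes the column names `Output`, `channel`, and `trmode` from the sample tables and an illustrative significance level of 0.05. The names `iCh`, `iMode`, `idx`, and `alpha` are made up for this example. It excludes only the single most extreme value (or ties) per group in one pass, since Grubbs' test checks one outlier at a time.

```
dt = Current Data Table();
Current Data Table( dt ); // Row State() below acts on the current table
alpha = 0.05; // assumed significance level

// Distinct grouping values
channels = Associative Array( dt:channel << Get Values ) << Get Keys;
modes = Associative Array( dt:trmode << Get Values ) << Get Keys;
yy = Column( dt, "Output" ) << Get As Matrix;

For( iCh = 1, iCh <= N Items( channels ), iCh++,
	For( iMode = 1, iMode <= N Items( modes ), iMode++,
		// Non-excluded, non-missing rows of this channel/trmode group
		rows = dt << Get Rows Where(
			:channel == channels[iCh] & :trmode == modes[iMode] &
			!Excluded( Row State() ) & !Is Missing( :Output )
		);
		n = N Rows( rows );
		If( n > 2,
			y = yy[rows];
			dev = Abs( y - Mean( y ) );
			g = Maximum( dev ) / Std Dev( y );
			t2 = t Quantile( 1 - alpha / (2 * n), n - 2 ) ^ 2;
			g0 = ((n - 1) / Sqrt( n )) * Sqrt( t2 / (n - 2 + t2) );
			If( g > g0,
				// Exclude the row(s) with the largest deviation
				idx = Loc( dev == Maximum( dev ) );
				For( k = 1, k <= N Rows( idx ), k++,
					Excluded( Row State( rows[idx[k]] ) ) = 1
				);
			);
		);
	);
);
```

Excluding rather than deleting keeps the raw data recoverable. If deletion is really wanted, the flagged row numbers could be collected and passed to `dt << Delete Rows` after the loops finish, so that row numbers do not shift mid-loop.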


Re: Integration of GrubbsOutlierTest2 into multivariable data table


Created:
Jan 21, 2019 7:57 AM
| Last Modified: Jan 21, 2019 8:00 AM
(1321 views)
| Posted in reply to message from Yngeinstn 01-18-2019

Update: I managed to get GrubbsOutlier to sort of work. I can take one test condition and get what I need. I added a For loop to capture two test conditions; however, it hangs.

You can run this script against the sample data _ 1 wafers (test_3).jmp and see the output that I am looking for. I would appreciate any help with adding the 2nd trmode, and even with doing this on a dataset with multiple wafers: sample data _ 6 wafers (test_3).jmp

After this is resolved, I am going to try an If( g > g0 ) with Loc( <the outliers> ) and then delete them from the dataset.

Thank You

```
dt = Current Data Table();

// Summary table with one row per channel/trmode combination
dtsum = dt << Summary(
	Group( :channel, :trmode ),
	Interquartile Range( :Output ),
	Freq( "None" ),
	Weight( "None" )
	// , invisible
);
Current Data Table( dt );

// <Insert For Loop Here>
dist = dt << Distribution(
	Y( Column( "Output" ) ),
	By( Column( "channel" ) ),
	Normal Quantile Plot( 1 ),
	Fit Distribution( Normal( Goodness of Fit( 1 ) ) ),
	// Adding the For() loop shown at the bottom of the script makes it hang
	// Where( dt:trmode == mode )
	Where( dt:trmode == "Tx" )
);
distr = dist << Report;
bCol = Column( "channel" );
Summarize( group = By( bCol ) );
yy = Column( "Output" ) << Get As Matrix;
exRows = dt << Get Excluded Rows();
yy[exRows] = .; // ignore excluded rows
For( i = 1, i <= N Items( group ), i++,
	// The By-group value is the text after "=" in the report title
	groupName = Trim( Word( 2, distr[i][Outline Box( 1 )] << Get Title, "=" ) );
	// Restrict to the same trmode as the Where() clause so G uses the plotted rows
	getRows = dt << Get Rows Where( bCol[] == groupName & dt:trmode == "Tx" );
	yVal = yy[getRows];
	yVal[Loc( Is Missing( yVal ) )] = []; // drop missing values
	n = N Row( yVal );
	a = 0.05;
	// Two-sided Grubbs statistic g and critical value g0
	t0Sqr = t Quantile( 1 - a / (2 * n), n - 2 ) ^ 2;
	g = Maximum( Abs( yVal - Mean( yVal ) ) ) / Std Dev( yVal );
	g0 = ((n - 1) / Sqrt( n )) * Sqrt( t0Sqr / (n - 2 + t0Sqr) );
	distr[i][Outline Box( 2 )] << Append(
		Outline Box( "Grubbs' Outlier Test",
			Table Box(
				String Col Box( "Statistic", {"G", "G(" || Char( a ) || ")"} ),
				Number Col Box( "Estimate", Matrix( {g, g0} ) )
			),
			Text Box( If( g > g0, "Outlier detected", "No outlier detected" ) )
		)
	);
);

// For Loop I was trying to use
// For( i = 1, i <= N Rows( dtsum ), i++,
//     mode = dtsum:trmode[i];
//     < Insert Script From Above >
// );
```
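
One possible cause of the hang, offered as a guess from reading the script rather than from running it: the commented-out outer loop and the inner Grubbs loop both use `i` as their counter. Each pass of the inner loop leaves `i` at `N Items( group ) + 1`, so whenever `dtsum` has more rows than that, the outer condition `i <= N Rows( dtsum )` never becomes false. The sketch below uses separate counters and iterates over the distinct `trmode` values instead of every row of `dtsum`. The names `m`, `modes`, and `mode` are illustrative, and it assumes more than one By group so that `dist << Report` returns a list.

```
dt = Current Data Table();

// Distinct trmode values, e.g. {"Rx", "Tx"} (assumed character column)
modes = Associative Array( dt:trmode << Get Values ) << Get Keys;

For( m = 1, m <= N Items( modes ), m++, // outer counter: m
	mode = modes[m];
	// Substitute the current value of mode into the Where() clause
	dist = Eval(
		Eval Expr(
			dt << Distribution(
				Y( :Output ),
				By( :channel ),
				Where( :trmode == Expr( mode ) )
			)
		)
	);
	distr = dist << Report;
	For( i = 1, i <= N Items( distr ), i++, // inner counter: i
		// Per-channel Grubbs calculation from the script above goes here
		Show( mode, i );
	);
);
```

Looping over distinct `trmode` values also avoids launching the same Distribution once per channel, which is what iterating over every row of `dtsum` would do.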
