Filtering Data

Feb 14, 2020 5:47 AM

I am trying to remove some data from graphs due to a fault in the measurement equipment at certain time increments and also differentiate between two sections of the data. I have attached an image and file to help with solving this problem.

The blue oval of data needs to be removed and the red data needs to be differentiated from the rest of the data.

Can someone recommend ways to do this?

Thank you for the help.

Accepted Solutions

Hi,

I now understand better. You can still use my low-tech solution with the following modifications:

1) Exclude and hide all the measurements labelled as Peak (you already have a way to identify and remove these Peak values)

2) In the Fit Y by X platform, set Y to Tmeas1, X to ATimeStamp, and By to CircleInc1

3) While holding down the CTRL key, select Fit Line (or an alternative model: Quadratic, Cubic, ...). Holding CTRL applies the command to every By group at once

4) While holding down the CTRL key in the Fit Line sub-menu (red triangle), select "Save Residual"

5) In the new column containing the residuals for each CircleInc1 value, exclude and hide the values that deviate from the line/curve fit by more than whatever threshold you decide

I realize it is not perfect because your data does not seem to fit one type of curve model across all values of CircleInc1. If this does not meet your requirements, you may want to experiment with the Flexible Fit > Local Smoother options using the same Save Residual approach.
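
For readers who want to see the logic of these steps outside the JMP menus, here is a minimal Python sketch. It is not JMP code and not necessarily how JMP computes things internally: it fits a straight line to Tmeas1 versus ATimeStamp separately for each CircleInc1 value and flags rows whose residual exceeds a threshold. The column names come from the post; the sample values, the threshold of 5, and the function names `line_residuals` and `flag_outliers` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def line_residuals(x, y):
    """Residuals (observed minus fitted) from a least-squares straight line.

    Assumes the group has at least two distinct x values.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def flag_outliers(rows, threshold):
    """Return one True/False per row: True means 'exclude and hide'."""
    # Collect row indices per CircleInc1 value (the "By" variable).
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[row["CircleInc1"]].append(i)

    excluded = [False] * len(rows)
    for idx in groups.values():
        residuals = line_residuals(
            [rows[i]["ATimeStamp"] for i in idx],
            [rows[i]["Tmeas1"] for i in idx],
        )
        for i, r in zip(idx, residuals):
            excluded[i] = abs(r) > threshold
    return excluded


# Hypothetical data: group "A" has one faulty reading, group "B" is clean.
rows = [{"CircleInc1": "A", "ATimeStamp": t, "Tmeas1": v}
        for t, v in enumerate([10, 11, 30, 13, 14])]
rows += [{"CircleInc1": "B", "ATimeStamp": t, "Tmeas1": 50 - 2 * t}
         for t in range(5)]

excluded = flag_outliers(rows, threshold=5)
print([i for i, flag in enumerate(excluded) if flag])
```

Because each group gets its own fit, a reading that is normal for one CircleInc1 value is not penalized for differing from another group's trend. Note that a large outlier also pulls the fitted line toward itself, so the threshold may need some trial and error.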

Thierry R. Sornasse

Replies

Re: Filtering Data

Here is a possible low-tech solution:

1) In the Fit Y by X platform, select "Fit Line"

2) Under the new "Fit Line" menu, select "Save Predicted" (assuming that your data is linearly related to the time measure)

3) Create a new column with the formula ABS(Predicted Y - Y)

4) Select the outliers according to your preference

The "Control Chart" method would probably offer additional features, but I'm not an expert in that platform.
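
As a rough illustration of what steps 1 to 3 compute, the following Python sketch fits a least-squares line, takes the predicted values, and forms the absolute deviation ABS(Predicted Y - Y) for each row. It is a stand-alone sketch rather than JMP code; the data values and the function names `fit_line` and `abs_deviation` are invented for the example.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def abs_deviation(x, y):
    """ABS(Predicted Y - Y) for every row."""
    intercept, slope = fit_line(x, y)
    return [abs((intercept + slope * xi) - yi) for xi, yi in zip(x, y)]


# Hypothetical measurements: roughly y = 2x + 1 with one bad reading.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 5, 20, 9, 11]

dev = abs_deviation(x, y)
# Row with the largest deviation, i.e. the first candidate outlier.
worst = max(range(len(dev)), key=dev.__getitem__)
print(worst, round(dev[worst], 2))
```

Sorting the rows by this deviation column, largest first, is one simple way to carry out step 4.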

Thierry R. Sornasse

Re: Filtering Data

I believe that by using the Lasso tool you can select the data points from the graph, and since the selected points are reflected in the data table as selected rows, you can then do simple data table manipulations to get done what you want.

1. Select the Lasso tool

2. Lasso all of the data points you show in your chart as being in the Blue Oval.

3. Go to the data table's row state column (the column that shows the row numbers). Right-click one of the rows selected by the Lasso tool and select "Hide and Exclude" or, if you prefer, "Delete Rows".
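
Conceptually, a lasso selection asks which points fall inside a closed outline, and the selected rows are then excluded. The Python sketch below shows that idea with a standard ray-casting point-in-polygon test. It is only an illustration and makes no claim about how JMP implements the Lasso tool; the point coordinates and the square outline are hypothetical.

```python
def inside_polygon(px, py, polygon):
    """Ray-casting test: True if (px, py) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


# Hypothetical (time, measurement) points and a square "lasso" outline.
points = [(1, 1), (2, 8), (5, 5), (6, 6), (9, 2)]
lasso = [(4, 4), (7, 4), (7, 7), (4, 7)]

# True means the row was lassoed and would be hidden and excluded.
excluded = [inside_polygon(px, py, lasso) for px, py in points]
kept = [p for p, flag in zip(points, excluded) if not flag]
print(kept)
```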

For the second issue of Binning:

1. Create a new Column in the data table, and make it a character data type

2. In row one of the new column, type in the value you want to use to label the rows shown in the red oval on your chart.

3. Copy that value into the Paste buffer, and then erase the value from row 1. We did this operation just to get the value into the paste buffer.

4. Now go to the graph, and select the Lasso tool, and Lasso all of the data points from the red oval.

5. Go to the data table, which now has all of the rows selected for the points in the red oval. Hover over one of the selected cells (rows) in the new column.

6. Right click and select "Paste"

7. All of the selected cells for the new column will now have the value you placed into the paste buffer.

Jim

Re: Filtering Data

Don't forget about the **Rows->Row Selection->Name Selection in Column** option. This will let you select points in any graph and then quickly create a column to assign the selected (and unselected) rows to a category.

"How to group points in a plot and assign categories in data table (3 or more categories)" has a nice video showing how to use Name Selection in Column.
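
The effect of naming a selection can be pictured as writing a label into the selected rows of a category column and leaving the other rows as they were. The Python sketch below mimics that effect for three categories; it is not JMP code, and the function name `name_selection`, the row indices, and the labels are all hypothetical.

```python
def name_selection(column, selected, label):
    """Return a copy of column with label written into the selected rows."""
    chosen = set(selected)
    return [label if i in chosen else old for i, old in enumerate(column)]


# Start with every row in a default category, then label two selections.
section = ["rest"] * 6
section = name_selection(section, [2, 3], "red")   # rows lassoed in the red oval
section = name_selection(section, [5], "peak")     # a second, separate selection
print(section)
```

Applying the function once per selection is what allows three or more categories to end up in the same column.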

-Jeff

Re: Filtering Data

I am sorry that wasn't clearer before. I have hundreds of these graphs and am trying to find a less manual way of finding the outliers and peak values. If you open up the attached document, you can see a small set of the data that I am working with and the variations in the runs.

Thank you.
