Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- Excluding data from formula calculations

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

May 22, 2013 11:34 PM
(4291 views)

I am doing a bunch of calculations using the "Col" type statistical functions (e.g. Col Mean). My problem is that my data set contains some bad data, that I want to exclude from the calculations. By simply selecting the relevant rows and choosing exclude, I apparently do not affect the formula results. So my question is this:

How do I exclude these bad data from the calculations without removing them from the data set entirely?

BR

Jesper

Jesper

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

May 23, 2013 12:01 AM
(6931 views)
| Posted in reply to message from JesperJohansen 05/23/2013 02:34 AM

Try this construction in a column formula. If included, return mean by excluded state (0/1), else do nothing.

If**(** !Excluded**()**, Col Mean**(** :Column 1, Excluded**()** **))**

6 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

May 23, 2013 12:01 AM
(6932 views)
| Posted in reply to message from JesperJohansen 05/23/2013 02:34 AM

Try this construction in a column formula. If included, return mean by excluded state (0/1), else do nothing.

If**(** !Excluded**()**, Col Mean**(** :Column 1, Excluded**()** **))**

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Thank you. That in fact solved my problem. But is there no way to do it without incorporating the row-state into the formula? It seems strange to me that I need to tell JMP to exclude already excluded data.

BR

Jesper

Jesper

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

May 23, 2013 1:52 AM
(4068 views)
| Posted in reply to message from JesperJohansen 05/23/2013 04:34 AM

No, not with the pre-evaluated statistics functions (see note on p. 304 in JMP 10 scripting guide).

Summarize() accounts for excluded rows but is less efficient than e.g. col mean, and in my experience when Summarize() is used in column formulas the result is not automatically uppdated if rowstates are changed.

But in principle this formula should give the same result as the trick above (except that that the result will be also in excluded rows), but formulas may have to be forced to reevaluate upon changes in the datatable.

summarize**(**m=mean**(:**Column 1**))**; m

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Just so I understand completely what I am doing:

The first "Excluded()" entry ensures that results are only displayed in non-excluded rows, while the second ensures that the mean (in the above case) is calculated for non-excluded and excluded rows separately. If I remove the "If" function but leave in the "Excluded()" clause in the "Col Mean()" function, I will get the same result in the non-excluded rows, but I will get values in the excluded rows as well (here only the mean for excluded data).

Is that correct?

BR

Jesper

Jesper

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

May 23, 2013 2:14 AM
(4068 views)
| Posted in reply to message from JesperJohansen 05/23/2013 05:06 AM

That's correct. The optional second (and third etc..) argument in the col stat functions is reserved for a by-column, i.e. calculates the statistics for each group based on the values in a separate column. But it apparently also works with a binary argument like Excluded(), and not only with table columns.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Thanks ms. That's the best formula which I found concerning excluded().