Select records having frequency count greater than N

May 19, 2010 8:43 AM
(1264 views)

Hello fellow JMPers,

I have a dataset with a categorical variable X. I want to select the rows whose value of X has a frequency count at or above a threshold. For example, I only want to see rows where the count is 5 or higher. There could be 100-200 different values of X, so a simple distribution would be too busy for graphical selection.

I've tried various things:

1. Tabulate, with X in the drop zone for rows. After converting the results to a table, I have X in column 1 and the count (N) in column 2. I join this back to the original table using X as the join variable. Now the counts for X are in my original table and I can select rows using Rows > Row Selection > Select Where (N >= threshold). This works, but it has too many steps to sell to users.

2. Pareto plot of X. It sorts by count, which is nice. Is there any way to rotate it? The brush tool doesn't seem to work here, so it's good to look at, but I can't conveniently select the bars having a count of (threshold) or higher.

Thanks in advance for your help!

Regards,

Peter

2 REPLIES

One possible solution:

Tables --> Summary: Group by X

This produces a table with one row per value of X and the frequency of that value in the N Rows column. The summary table is linked to the original table.

In the summary table, use "Select Where" N Rows < 5. This highlights the low-frequency rows in the summary table AND the matching rows in the original table. Now you can use Rows --> Exclude/Unexclude or Rows --> Hide/Unhide on the summary table, and that will be reflected in the original table, too. (If you would rather select the rows you want to keep, use N Rows >= 5 instead.)
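For anyone who wants to see the logic outside of JMP, here is a minimal Python sketch of what the summary-and-select steps amount to. It is not JSL and not a description of how JMP implements linked tables. The function name, the sample column, and the threshold are made up for illustration.

```python
from collections import Counter


def split_rows_by_frequency(values, threshold):
    """Split row indices by how often each row's value occurs.

    Returns (kept, excluded): rows whose value occurs at least
    `threshold` times, and rows whose value occurs fewer times.
    """
    # Plays the role of the summary table: value of X -> row count.
    counts = Counter(values)
    kept = [i for i, v in enumerate(values) if counts[v] >= threshold]
    excluded = [i for i, v in enumerate(values) if counts[v] < threshold]
    return kept, excluded


# Hypothetical column X; "a" occurs 5 times, "b" twice, "c" once.
x = ["a", "b", "a", "c", "a", "b", "a", "a"]
kept, excluded = split_rows_by_frequency(x, threshold=5)
print(kept)      # [0, 2, 4, 6, 7]
print(excluded)  # [1, 3, 5]
```

The `excluded` list corresponds to the rows highlighted by the "fewer than 5" selection, and `kept` to what remains after excluding or hiding them.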

Hope that helps.

That's perfect! Didn't think to try the Summary option. Thanks!