Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Comparing multiple rows in a large table & keeping...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 4, 2010 9:37 AM
(1173 views)

I am somewhat new to JMP...I have a large 1,000,000+ row table (13 columns) that I need to consolidate in the following way:

I have multiple rows for observations on a single case- (not each case is observed the same number of times- some have 3 observations, some 10). I want to construct a summary table, keeping the observation with the largest values for each unique case.

Thanks!

-t

6 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 4, 2010 12:41 PM
(1069 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 7, 2010 2:49 AM
(1069 views)

In the Summary dialog, choose to group by your case id and then select the variables that you want in the summary table and choose "Max" in the Statistics drop-down list.

That would give you a table with one row for each case and the highest values for each variable. A column with the number of observations (= nr of rows, incl. those with any missing values) for each case will also be generated automatically.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 7, 2010 6:50 AM
(1069 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 8, 2010 2:14 AM
(1069 views)

My hint above assumes that the multiple observations are vertically arranged, i.e. a case with three obs. has three rows, and a case with ten obs. has ten rows etc. However if the number of observations is distributed across columns, with empty cells for cases with less then the max nr of observations, the table must first be stacked in order to use Summary the way I described.

Or maybe I have misunderstood the problem completely...

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 8, 2010 5:01 AM
(1069 views)

ID Y1 Y2

a 1 2

a 2 1

then the resulting summary table will contain one row:

a 2 2 2

which is not in the original table. The first "2" is the row count. The remaining values comprise the "row" which is not in the original table.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 8, 2010 3:19 PM
(1069 views)

There may still be a problem with ties. Then additional criteria must be considered e.g. based on the values in the other columns.