JMP User Community
Comparing multiple rows in a large table & keeping...

Jan 4, 2010 9:37 AM

I am somewhat new to JMP. I have a large table (1,000,000+ rows, 13 columns) that I need to consolidate in the following way:

I have multiple rows of observations for each case (not every case is observed the same number of times: some have 3 observations, some 10). I want to construct a summary table, keeping the observation with the largest values for each unique case.

Thanks!

-t

Jan 4, 2010 12:41 PM
Jan 7, 2010 2:49 AM
In the Summary dialog, choose to group by your case id and then select the variables that you want in the summary table and choose "Max" in the Statistics drop-down list.

That would give you a table with one row for each case and the highest value of each variable. A column with the number of observations for each case (= number of rows, including those with missing values) will also be generated automatically.
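To make the behaviour of such a grouped "Max" summary concrete, here is a rough sketch in plain Python (not JMP or JSL). The data, the column layout `(case_id, y1, y2)`, and the function name `summarize_max` are all made up for illustration, and missing values are not handled.

```python
from collections import defaultdict

# Hypothetical long-format data: one row per observation, (case_id, y1, y2).
rows = [
    ("a", 1, 2),
    ("a", 2, 1),
    ("b", 5, 3),
]

def summarize_max(rows):
    """Return {case_id: (n_rows, max_y1, max_y2)}; each maximum is taken per column."""
    groups = defaultdict(list)
    for case_id, y1, y2 in rows:
        groups[case_id].append((y1, y2))
    return {
        case_id: (len(obs), max(o[0] for o in obs), max(o[1] for o in obs))
        for case_id, obs in groups.items()
    }

summary = summarize_max(rows)
```

Each maximum is computed independently per column, so the values in one summary row do not necessarily come from the same observation.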

Jan 7, 2010 6:50 AM
Jan 8, 2010 2:14 AM
My hint above assumes that the multiple observations are arranged vertically, i.e. a case with three obs. has three rows, a case with ten obs. has ten rows, etc. However, if the observations are spread across columns, with empty cells for cases that have fewer than the maximum number of observations, the table must first be stacked in order to use Summary the way I described.
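As an illustration of what stacking means here, the following plain-Python sketch (again not JMP itself) turns a wide layout into one row per observation. The sample data and the name `stack` are hypothetical, and dropping the empty cells is a choice made for this sketch.

```python
# Hypothetical wide-format data: one row per case, None marks an empty cell.
wide = [
    ("a", [1, 2, None]),
    ("b", [5, 3, 4]),
]

def stack(wide):
    """Turn each non-empty cell into its own (case_id, value) row."""
    return [
        (case_id, value)
        for case_id, values in wide
        for value in values
        if value is not None
    ]

long_rows = stack(wide)
```

After stacking, case "a" has two rows and case "b" has three, which is the vertical arrangement the Summary approach expects.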

Or maybe I have misunderstood the problem completely...

Jan 8, 2010 5:01 AM
If the original table is:

ID Y1 Y2
a 1 2
a 2 1

then the resulting summary table will contain one row:

a 2 2 2

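which is not a row in the original table. The first "2" is the row count. The remaining values, "2 2", are the maxima of Y1 and Y2 taken separately, so they are drawn from different rows and do not correspond to any single original observation.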
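If the goal is instead to keep an actual observation, one option is to pick, per case, the whole row that has the largest value in one chosen column. The sketch below is plain Python rather than JMP, and it assumes that "largest" is judged on a single column; the name `keep_row_with_max` and the `key_index` parameter are made up for illustration.

```python
# The two-row example from above: (ID, Y1, Y2).
rows = [("a", 1, 2), ("a", 2, 1)]

def keep_row_with_max(rows, key_index):
    """Per case, keep the entire row whose value at key_index is largest."""
    best = {}
    for row in rows:
        case_id = row[0]
        # Strict '>' means the first row wins when values are equal.
        if case_id not in best or row[key_index] > best[case_id][key_index]:
            best[case_id] = row
    return list(best.values())

kept_by_y1 = keep_row_with_max(rows, key_index=1)
kept_by_y2 = keep_row_with_max(rows, key_index=2)
```

Selecting on Y1 keeps the row `a 2 1`, and selecting on Y2 keeps `a 1 2`; either way the result is a row that exists in the original table.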

Jan 8, 2010 3:19 PM
There may still be a problem with ties. In that case, additional criteria must be considered, e.g. based on the values in the other columns.
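One possible tie-breaking rule, sketched in plain Python (not JMP): compare on the first column and fall back to the second when the first is equal. The rule itself, the sample data, and the name `keep_row_with_tiebreak` are assumptions for illustration only; the right criteria depend on the data.

```python
# Hypothetical data with a tie on y1 for case "a": (case_id, y1, y2).
rows = [("a", 2, 1), ("a", 2, 3), ("a", 1, 9)]

def keep_row_with_tiebreak(rows):
    """Per case, keep the row with the largest y1; break ties by the largest y2."""
    best = {}
    for row in rows:
        case_id, y1, y2 = row
        # Tuples compare element by element, so y2 only matters when y1 is equal.
        if case_id not in best or (y1, y2) > best[case_id][1:]:
            best[case_id] = row
    return list(best.values())

kept = keep_row_with_tiebreak(rows)
```

Here the two rows with y1 = 2 tie, and the one with the larger y2 is kept. Rows that are identical in every compared column would still tie, in which case the first one encountered is kept.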