Share your ideas for the JMP Scripting Unsession at Discovery Summit by September 17th. We hope to see you there!
Choose Language Hide Translation Bar
Highlighted
Maureen
Level II

help with consolidating data please

Hi all

I am what I consider an intermediate to advanced Excel user, but only a moderate JMP user.  I'm trying more and more to learn the formulas in JMP, but I still do a lot of data massage in Excel.  I have an issue I'd like to use JMP to solve if I could get some help with the formulas or table options needed.

I got a huge data dump of many years of laboratory results from our hospital archive - it's over a million rows, so right there is why I can't use Excel.  I am interested in five different tests - you can call them A, B, C, D, E.  I have been given a subject number and some demographics, (age, sex race) and the date and time of the results.  The problem is that tests A and B are in one panel, and C, D, and E are in a second panel, or even ordered individually.  So there might be two rows of data one with A and B and the second row with C, D, and E.  Because they are the same date, but a few minutes apart it has them separate.  I have figured out the short date formula which strips out the time.  Now what I'd like to do is to consolidate all five tests into a single row if they are from the same date on the same subject.  One thing that can potentially confuse the issue is that occasionally a single test, say, C is run more than once in a day.  So, ideally, the five tests that are closest together in time would end up on one row together and any repeated values later in that same day would still get a row (so not tossing any data that happens to duplicate on a day)

Can anyone help me understand how this would work in JMP?

thanks, Maureen

2 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted
Thierry_S
Level VI

Re: help with consolidating data please

Hi Maureen,

 

Assuming that your data is arranged in similar manner than in the attached mock table, you could reconcile your test data as follows:

1) Stack the A - E columns, keeping all other columns in the resulting table

2) Change the stacked data column to "Ordinal"

3) Select (using the filter tool) and delete all the empty rows

4) Split the "data" columns by "Label" and Group by PATIENT_ID and DATE

 

Of note, you will need to make sure that the assays performed multiple times on the same day have a distinct timestamp / Date value.

 

If this does not meet your need, ti would be helpful if you could share an anonymized or mock table showing how your data is currently arranged.

 

Best,

 

TS

Thierry R. Sornasse

View solution in original post

Highlighted
jerry_cooper
Staff (Retired)

Re: help with consolidating data please

Hi Maureen,

If you right-click the header of your original timestamp column and select New Formula Column->Transform->Round... you'll be presented with a dialog box that will allow you to round your date to whatever interval you choose after selecting up, down, or nearest for the rounding rule. This will give you a new formula column with the rounded dates that you can then use to align your data. 

 

Hope this helps.

-Jerry

View solution in original post

7 REPLIES 7
Highlighted
txnelson
Super User

Re: help with consolidating data please

Is there anyway to determine if the test is a Test A or Test B, etc?
Jim
Highlighted
Maureen
Level II

Re: help with consolidating data please

Each test result is currently in it's own column (this was a data dump that came this way).  An example of what I'd like to consolidate would look like this:

 

subject.    date.    A.    B.    C.    D.    E.

1.              1/2/10   .      .      100. 50. 200

1.              1/2/10  20   50     .      .      .

 

to become a single row with numbers in A, B, C, D, E

or, conversely a single column called result with another column for test name

 

Highlighted
Thierry_S
Level VI

Re: help with consolidating data please

Hi Maureen,

 

Assuming that your data is arranged in similar manner than in the attached mock table, you could reconcile your test data as follows:

1) Stack the A - E columns, keeping all other columns in the resulting table

2) Change the stacked data column to "Ordinal"

3) Select (using the filter tool) and delete all the empty rows

4) Split the "data" columns by "Label" and Group by PATIENT_ID and DATE

 

Of note, you will need to make sure that the assays performed multiple times on the same day have a distinct timestamp / Date value.

 

If this does not meet your need, ti would be helpful if you could share an anonymized or mock table showing how your data is currently arranged.

 

Best,

 

TS

Thierry R. Sornasse

View solution in original post

Highlighted
Maureen
Level II

Re: help with consolidating data please

this came so close!  But I used the short date function to truncate the times away from the timestamp.  The original timestamp was keeping the tests from panel 1 (A, B) separate from panel 2 (C, D, E) because of a few minutes difference.  Still, if I have to loose some data from it, duplicate single tests would be ok to drop, because, ultimately I will be choosing data that has all 5 tests present and dropping single test results.

 

If there is a way to round the original timestamp into say, two hour increments, then that would work too - I could keep any duplicates run half a day later, but still group tests within a couple of hours.  This would be for a version where I do not dump rows without all five tests.

Highlighted
jerry_cooper
Staff (Retired)

Re: help with consolidating data please

Hi Maureen,

If you right-click the header of your original timestamp column and select New Formula Column->Transform->Round... you'll be presented with a dialog box that will allow you to round your date to whatever interval you choose after selecting up, down, or nearest for the rounding rule. This will give you a new formula column with the rounded dates that you can then use to align your data. 

 

Hope this helps.

-Jerry

View solution in original post

Highlighted
Maureen
Level II

Re: help with consolidating data please

thank you!  I'm still learning formulas in JMP, but that trick means it creates the formula for me.  Nice to know.

Highlighted
dale_lehman
Level VI

Re: help with consolidating data please

A mocked up example would help since I think this will not quite do what you want - but it should get close.  Use the Tabulate platform with Subject and then date on the rows (after rounding the date as suggested) and put the tests in the columns, with the data in the table (perhaps means, but you might also want min and max).  I find Tabulate the easiest way to rearrange the data in ways you seem to be asking about.

Article Labels