Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

- JMP User Community
- :
- Discussions
- :
- help with consolidating data please

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 16, 2020 11:14 AM
(651 views)

Hi all

I am what I consider an intermediate to advanced Excel user, but only a moderate JMP user. I'm trying more and more to learn the formulas in JMP, but I still do a lot of data massage in Excel. I have an issue I'd like to use JMP to solve if I could get some help with the formulas or table options needed.

I got a huge data dump of many years of laboratory results from our hospital archive - it's over a million rows, so right there is why I can't use Excel. I am interested in five different tests - you can call them A, B, C, D, E. I have been given a subject number and some demographics, (age, sex race) and the date and time of the results. The problem is that tests A and B are in one panel, and C, D, and E are in a second panel, or even ordered individually. So there might be two rows of data one with A and B and the second row with C, D, and E. Because they are the same date, but a few minutes apart it has them separate. I have figured out the short date formula which strips out the time. Now what I'd like to do is to consolidate all five tests into a single row if they are from the same date on the same subject. One thing that can potentially confuse the issue is that occasionally a single test, say, C is run more than once in a day. So, ideally, the five tests that are closest together in time would end up on one row together and any repeated values later in that same day would still get a row (so not tossing any data that happens to duplicate on a day)

Can anyone help me understand how this would work in JMP?

thanks, Maureen

2 ACCEPTED SOLUTIONS

Accepted Solutions

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hi Maureen,

Assuming that your data is arranged in similar manner than in the attached mock table, you could reconcile your test data as follows:

1) Stack the A - E columns, keeping all other columns in the resulting table

2) Change the stacked data column to "Ordinal"

3) Select (using the filter tool) and delete all the empty rows

4) Split the "data" columns by "Label" and Group by PATIENT_ID and DATE

Of note, you will need to make sure that the assays performed multiple times on the same day have a distinct timestamp / Date value.

If this does not meet your need, ti would be helpful if you could share an anonymized or mock table showing how your data is currently arranged.

Best,

TS

Thierry R. Sornasse

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hi Maureen,

If you right-click the header of your original timestamp column and select New Formula Column->Transform->Round... you'll be presented with a dialog box that will allow you to round your date to whatever interval you choose after selecting up, down, or nearest for the rounding rule. This will give you a new formula column with the rounded dates that you can then use to align your data.

Hope this helps.

-Jerry

7 REPLIES 7

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: help with consolidating data please

Is there anyway to determine if the test is a Test A or Test B, etc?

Jim

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: help with consolidating data please

Each test result is currently in it's own column (this was a data dump that came this way). An example of what I'd like to consolidate would look like this:

subject. date. A. B. C. D. E.

1. 1/2/10 . . 100. 50. 200

1. 1/2/10 20 50 . . .

to become a single row with numbers in A, B, C, D, E

or, conversely a single column called result with another column for test name

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hi Maureen,

Assuming that your data is arranged in similar manner than in the attached mock table, you could reconcile your test data as follows:

1) Stack the A - E columns, keeping all other columns in the resulting table

2) Change the stacked data column to "Ordinal"

3) Select (using the filter tool) and delete all the empty rows

4) Split the "data" columns by "Label" and Group by PATIENT_ID and DATE

Of note, you will need to make sure that the assays performed multiple times on the same day have a distinct timestamp / Date value.

If this does not meet your need, ti would be helpful if you could share an anonymized or mock table showing how your data is currently arranged.

Best,

TS

Thierry R. Sornasse

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: help with consolidating data please

this came so close! But I used the short date function to truncate the times away from the timestamp. The original timestamp was keeping the tests from panel 1 (A, B) separate from panel 2 (C, D, E) because of a few minutes difference. Still, if I have to loose some data from it, duplicate single tests would be ok to drop, because, ultimately I will be choosing data that has all 5 tests present and dropping single test results.

If there is a way to round the original timestamp into say, two hour increments, then that would work too - I could keep any duplicates run half a day later, but still group tests within a couple of hours. This would be for a version where I do not dump rows without all five tests.

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hi Maureen,

If you right-click the header of your original timestamp column and select New Formula Column->Transform->Round... you'll be presented with a dialog box that will allow you to round your date to whatever interval you choose after selecting up, down, or nearest for the rounding rule. This will give you a new formula column with the rounded dates that you can then use to align your data.

Hope this helps.

-Jerry

Highlighted
##

thank you! I'm still learning formulas in JMP, but that trick means it creates the formula for me. Nice to know.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: help with consolidating data please

Highlighted
##

A mocked up example would help since I think this will not quite do what you want - but it should get close. Use the Tabulate platform with Subject and then date on the rows (after rounding the date as suggested) and put the tests in the columns, with the data in the table (perhaps means, but you might also want min and max). I find Tabulate the easiest way to rearrange the data in ways you seem to be asking about.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: help with consolidating data please