How to create a matched cohort?


Dec 12, 2014 11:59 PM

I am conducting a matched cohort study and would like to use JMP to create my control cohort. I have approximately 2000 cases and 10000 potential controls. I would like to match the cases to controls in a 1 to 1 ratio. How can I use JMP to select from the 10000 potential controls by matching those who are the most similar to the cases on three factors - date of admission to the hospital, comorbidities, and injury score?

Thank you for any guidance you can provide.

1 ACCEPTED SOLUTION


Dec 13, 2014 4:49 PM

Solution

Here's one approach.

1) Create a data set that contains all 12,000 patients, both cases and potential controls. Include a column that indicates group membership, along with the continuous variables that you wish to match on. (I'm not sure whether comorbidities can be used this way -- perhaps you have some way of converting them to a continuous score.)

2) Run a hierarchical cluster analysis on the matching variables. Use a unique patient identifier as the "Label." Check the "Standardize Data" checkbox so that your matching variables contribute equally to the analysis; otherwise, variables with greater variance will have greater influence.

3) From the red triangle hotspot, choose "Save Distance Matrix". The distance matrix can be used to find a match for each case. You'll need to join the membership variable back onto the distance matrix, then delete the rows for the cases. Now, for each column that represents a case, the row with the smallest value in that column is the closest match among the potential controls.
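
For readers who want to see the logic of these three steps outside JMP, here is a minimal Python sketch. It is not JMP's implementation: it assumes Euclidean distance on standardized variables, and the patient IDs, the column layout, and the numeric comorbidity score are all made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy data:
# (patient_id, group, admission_day, comorbidity_score, injury_score)
patients = [
    ("P1", "case",     10, 2.0, 16.0),
    ("P2", "case",    200, 5.0, 30.0),
    ("C1", "control",  12, 2.0, 17.0),
    ("C2", "control", 190, 6.0, 29.0),
    ("C3", "control", 100, 0.0,  9.0),
]

def standardize(rows):
    """Rescale each matching variable to mean 0 and SD 1 across all patients."""
    columns = list(zip(*(r[2:] for r in rows)))
    stats = [(mean(c), stdev(c)) for c in columns]
    return {r[0]: [(v - m) / s for v, (m, s) in zip(r[2:], stats)] for r in rows}

def distance(a, b):
    """Euclidean distance between two standardized vectors (an assumption)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

z = standardize(patients)
cases = [r[0] for r in patients if r[1] == "case"]
controls = [r[0] for r in patients if r[1] == "control"]

# For each case, pick the control with the smallest distance.
nearest = {c: min(controls, key=lambda k: distance(z[c], z[k])) for c in cases}
print(nearest)
```

Standardizing first plays the same role as the "Standardize Data" checkbox: without it, the admission day, which has by far the largest variance here, would dominate the distance.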

One pitfall is that a single control might be chosen for multiple cases -- you'll have to check for this manually or via scripting.
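
One possible way to script that check is sketched below in Python. The distances are invented, and the greedy rule (always take the closest remaining case-control pair) is just one simple option, not something JMP provides and not guaranteed to give the best overall set of matches.

```python
from collections import Counter

# Hypothetical distances, dist[case][control], e.g. read from a saved
# distance matrix after the case rows have been removed.
dist = {
    "P1": {"C1": 0.2, "C2": 1.5, "C3": 0.9},
    "P2": {"C1": 0.3, "C2": 0.8, "C3": 1.1},
    "P3": {"C1": 1.4, "C2": 0.4, "C3": 0.6},
}

# Naive matching: each case takes its closest control, reuse allowed.
naive = {case: min(row, key=row.get) for case, row in dist.items()}
reused = [ctrl for ctrl, n in Counter(naive.values()).items() if n > 1]
print("controls chosen more than once:", reused)

def greedy_match(dist):
    """Pair cases and controls 1:1, closest remaining pair first."""
    pairs = sorted(
        (d, case, ctrl) for case, row in dist.items() for ctrl, d in row.items()
    )
    matched, used = {}, set()
    for d, case, ctrl in pairs:
        if case not in matched and ctrl not in used:
            matched[case] = ctrl
            used.add(ctrl)
    return matched

matches = greedy_match(dist)
print(matches)
```

In this toy example the naive rule assigns control C1 to both P1 and P2, while the greedy pass gives every case a distinct control.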

Good luck,

Jordan

3 REPLIES


Dec 14, 2014 10:00 PM

Great, thanks Jordan.

Best,

Shahram


Dec 16, 2014 2:20 PM

Jordan,

I ran into a little glitch - when I tried the cluster analysis, it runs the initial analysis but when I try to 'Save Distance Matrix' I get a message that says 'Unable to allocate enough memory'.

My table is larger than I expected - closer to 30,000 rows. Perhaps this is the problem?

Thanks,

Shahram
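
A rough back-of-the-envelope check suggests the table size could well be the cause. The Python sketch below assumes a full square matrix with 8 bytes per value (how JMP actually stores the matrix is not stated in this thread), and it also shows a hypothetical helper that finds the nearest control for one case without ever building the whole matrix.

```python
def full_matrix_bytes(n_rows, bytes_per_value=8):
    """Size of a full n x n distance matrix, assuming 8-byte values."""
    return n_rows * n_rows * bytes_per_value

def nearest_control(case_vec, controls):
    """Scan controls one by one, remembering only the closest so far."""
    best_id, best_d2 = None, float("inf")
    for control_id, vec in controls:
        d2 = sum((a - b) ** 2 for a, b in zip(case_vec, vec))
        if d2 < best_d2:
            best_id, best_d2 = control_id, d2
    return best_id

# Hypothetical standardized matching variables for three controls.
controls = [
    ("C1", (0.1, -0.5, 0.3)),
    ("C2", (1.2, 0.4, -0.9)),
    ("C3", (-0.8, 0.0, 0.2)),
]

print(full_matrix_bytes(30_000) / 1e9, "GB for a full 30,000-row matrix")
print(nearest_control((1.0, 0.5, -1.0), controls))
```

Under these assumptions, 30,000 rows need about 7.2 GB for the full matrix. Only the case-to-control distances are needed for matching, and scanning the controls one case at a time keeps the memory use small at the cost of more computation.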