How to create a matched cohort?

shahram

Community Trekker

Joined:

Nov 29, 2014

I am conducting a matched cohort study and would like to use JMP to create my control cohort.  I have approximately 2000 cases and 10000 potential controls.  I would like to match the cases to controls in a 1 to 1 ratio.  How can I use JMP to select from the 10000 potential controls by matching those who are the most similar to the cases on three factors - date of admission to the hospital, comorbidities, and injury score?

Thank you for any guidance you can provide.

3 REPLIES
Jordan_Hiller

Joined:

Jun 23, 2011

Here's one approach.

1) Create a data set that has all 12,000 patients, both cases and potential controls. Include a column that indicates group membership. Also include continuous variables that you wish to match on. (Not sure if comorbidities can be used this way -- perhaps you have some way of converting to a continuous score).

2) Run a hierarchical cluster analysis on the matching variables. Use a unique patient identifier as the "Label." Check the "Standardize Data" checkbox so that your matching variables contribute equally to the analysis; otherwise, variables with greater variance will have greater influence.

3) From the red triangle hotspot, choose "Save Distance Matrix". The distance matrix can be used to find a match for each case. You'll need to join the membership variable back onto the distance matrix, then delete the rows for the cases. Now for each column that represents a case, the row with the smallest value in that column is the closest match among the potential controls.

One pitfall is that a single control might be chosen for multiple cases -- you'll have to check for this manually or via scripting.
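If scripting outside JMP is an option, the same idea (standardize, then take the nearest available control for each case) can be sketched in a few lines of Python with NumPy. This is only a rough illustration, not a JMP feature: the function name `greedy_match` is made up, it assumes every matching variable has already been converted to a number (for example, admission date as days since some reference date), and it uses plain Euclidean distance on standardized values. It marks each control as used once it is picked, so no control is matched twice, and it only computes distances from one case to the controls at a time rather than building a full patient-by-patient matrix.

```python
import numpy as np

def greedy_match(case_X, control_X):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    case_X:    (n_cases, n_vars) numeric matching variables for the cases
    control_X: (n_controls, n_vars) same variables for potential controls
    Returns an integer array giving the control row matched to each case.
    (Hypothetical helper for illustration only.)
    """
    case_X = np.asarray(case_X, dtype=float)
    control_X = np.asarray(control_X, dtype=float)
    if len(control_X) < len(case_X):
        raise ValueError("need at least as many controls as cases")

    # Standardize with the pooled mean and SD so each variable
    # contributes equally to the distance.
    pooled = np.vstack([case_X, control_X])
    mean = pooled.mean(axis=0)
    sd = pooled.std(axis=0)
    sd[sd == 0] = 1.0  # constant columns carry no information
    z_cases = (case_X - mean) / sd
    z_controls = (control_X - mean) / sd

    available = np.ones(len(z_controls), dtype=bool)
    matches = np.empty(len(z_cases), dtype=int)
    for i, row in enumerate(z_cases):
        # Distances from this one case to every control.
        d = np.sqrt(((z_controls - row) ** 2).sum(axis=1))
        d[~available] = np.inf      # skip controls already used
        j = int(np.argmin(d))
        matches[i] = j
        available[j] = False        # each control is used at most once
    return matches
```

Note that a greedy pass like this depends on the order of the cases: an early case can take a control that would have been a better match for a later one. Shuffling the cases first, or using an optimal assignment method instead, are possible refinements.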

Good luck,

Jordan

shahram

Great, thanks Jordan.

Best,

Shahram

shahram

Jordan,

I ran into a little glitch - when I tried the cluster analysis, it runs the initial analysis, but when I try to 'Save Distance Matrix' I get a message that says 'Unable to allocate enough memory'.

My table is larger than I expected - closer to 30,000 rows.  Perhaps this is the problem?

Thanks,

Shahram