How to create a matched cohort?

shahram · Dec 13, 2014 02:59 AM

I am conducting a matched cohort study and would like to use JMP to create my control cohort. I have approximately 2000 cases and 10000 potential controls. I would like to match the cases to controls in a 1 to 1 ratio. How can I use JMP to select from the 10000 potential controls by matching those who are the most similar to the cases on three factors - date of admission to the hospital, comorbidities, and injury score?

Thank you for any guidance you can provide.

Jordan_Hiller · Dec 13, 2014 07:49 PM

Here's one approach.

1) Create a data set that has all 12,000 patients, both cases and potential controls. Include a column that indicates group membership. Also include continuous variables that you wish to match on. (Not sure if comorbidities can be used this way -- perhaps you have some way of converting to a continuous score).

2) Run a hierarchical cluster analysis on the matching variables. Use a unique patient identifier as "Label." Check the "Standardize Data" checkbox so that your matching variables contribute equally to the analysis; otherwise, variables with greater variance will have greater influence on the distances.

3) From the red triangle hotspot, choose "Save Distance Matrix". The distance matrix can be used to find a match for each case. You'll need to join the membership variable back onto the distance matrix, then delete the rows for the cases. Now for each column that represents a case, the row with the smallest value in that column is the closest match among the potential controls.

One pitfall is that a single control might be chosen for multiple cases -- you'll have to check for this manually or via scripting.
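Outside JMP, the same idea (standardize, measure distance, take the nearest control, never reuse a control) can be sketched in a few lines of Python. This is only an illustration under assumptions, not the JMP procedure itself: the function name `greedy_match` is hypothetical, the matching variables are assumed to already be numeric (for example, admission date as a day count and comorbidities as a score), and the matching is greedy, so the result depends on the order of the cases.

```python
import numpy as np

def greedy_match(X, is_case):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    X       : (n_patients, n_variables) array of numeric matching variables
    is_case : boolean array, True for cases, False for potential controls
    Returns a dict mapping each case row index to its matched control row index.
    """
    X = np.asarray(X, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)

    # Standardize each column so that no variable dominates the distance
    # (assumes no column is constant, which would give a zero std).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    case_idx = np.flatnonzero(is_case)
    ctrl_idx = np.flatnonzero(~is_case)
    available = np.ones(ctrl_idx.size, dtype=bool)  # controls not yet used

    matches = {}
    for c in case_idx:
        # Euclidean distance from this case to every control: one row at a
        # time, so the full distance matrix is never held in memory.
        d = np.sqrt(((Z[ctrl_idx] - Z[c]) ** 2).sum(axis=1))
        d[~available] = np.inf          # exclude controls already matched
        j = int(np.argmin(d))
        if not np.isfinite(d[j]):
            break                       # ran out of controls
        matches[int(c)] = int(ctrl_idx[j])
        available[j] = False
    return matches
```

Marking each chosen control as unavailable is one simple way to handle the duplicate-control pitfall. An optimal (non-greedy) assignment would give a smaller total distance but is more work to compute.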

Good luck,

Jordan

shahram · Dec 15, 2014 01:00 AM

Great, thanks Jordan.

Best,

Shahram

shahram · Dec 16, 2014 05:20 PM

Jordan,

I ran into a little glitch - when I tried the cluster analysis, it runs the initial analysis but when I try to 'Save Distance Matrix' I get a message that says 'Unable to allocate enough memory'.

My table is larger than I expected - closer to 30,000 rows. Perhaps this is the problem?

Thanks,

Shahram
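For a sense of scale on the memory message, here is a rough back-of-envelope calculation. It assumes the saved distance matrix is a full square table with one 8-byte value per pair of rows; that storage layout is an assumption, not something confirmed in this thread, and `distance_matrix_bytes` is a hypothetical helper.

```python
def distance_matrix_bytes(n_rows, bytes_per_value=8):
    """Size of a full n x n distance matrix (assumed 8 bytes per entry)."""
    return n_rows * n_rows * bytes_per_value

for n in (12_000, 30_000):
    print(n, round(distance_matrix_bytes(n) / 1e9, 2), "GB")
# prints:
# 12000 1.15 GB
# 30000 7.2 GB
```

Under that assumption, going from 12,000 to 30,000 rows multiplies the matrix size by about six, so the table size is a plausible cause.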

ABModelingFish6 · Jul 27, 2023 12:50 PM

Would you be able to attach screenshots to this post? I got a little confused while following the instructions.

Thank you very much
