Here's one approach.

1) Create a data set that has all 12,000 patients, both cases and potential controls. Include a column that indicates group membership (case or potential control). Also include the continuous variables that you wish to match on. (I'm not sure whether comorbidities can be used this way; perhaps you have some way of converting them to a continuous score.)

2) Run a hierarchical cluster analysis on the matching variables. Use a unique identifier for the patient as "Label." Check the "Standardize Data" checkbox so that your matching variables contribute equally to the analysis; otherwise, variables with greater variance will have greater influence.

3) From the red triangle hotspot, choose "Save Distance Matrix". The distance matrix has one row and one column per patient, and it can be used to find a match for each case. You'll need to join the membership variable back onto the distance matrix, then delete the rows for the cases so that only potential controls remain as rows. Now, for each column that represents a case, the row with the smallest value in that column is the closest match among the potential controls.
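If you'd rather check the logic outside JMP, here is a minimal Python sketch of the same idea. It is not JMP code: it assumes plain Euclidean distance on standardized columns, which may not be identical to the distances JMP saves, and the function name nearest_controls and the toy data are made up for illustration.

```python
import numpy as np

def nearest_controls(X, is_case):
    """Map each case's row index to the row index of its closest control.

    X       : (n_patients, n_variables) array of continuous matching variables
    is_case : boolean array, True for cases, False for potential controls
    """
    X = np.asarray(X, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)

    # Standardize each column so every variable contributes equally.
    # (Assumes no column is constant; a constant column would divide by zero.)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    case_idx = np.flatnonzero(is_case)
    control_idx = np.flatnonzero(~is_case)
    controls = Z[control_idx]

    matches = {}
    for i in case_idx:
        # Euclidean distance from this case to every potential control.
        d = np.sqrt(((controls - Z[i]) ** 2).sum(axis=1))
        matches[int(i)] = int(control_idx[d.argmin()])
    return matches

# Toy data: rows 0 and 1 are cases, rows 2 to 4 are potential controls.
X = [[0, 0], [10, 10], [1, 1], [9, 9], [5, 5]]
is_case = [True, True, False, False, False]
print(nearest_controls(X, is_case))
```

Looping over the cases one at a time avoids building the full patient-by-patient distance matrix, which matters with 12,000 patients.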

One pitfall is that a single control might be chosen as the closest match for more than one case. You'll have to check for this manually or via scripting.
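As one possible way to script around that, here is a minimal Python sketch of greedy matching without replacement. It assumes you already have a case-by-control distance table (rows are cases, columns are potential controls); the function name greedy_unique_match and the example numbers are hypothetical. Greedy matching is simple but not guaranteed to give the smallest total distance; an optimal assignment routine such as scipy.optimize.linear_sum_assignment is an alternative if that matters.

```python
import numpy as np

def greedy_unique_match(dist):
    """Assign each case a distinct control, taking the closest pairs first.

    dist : (n_cases, n_controls) array of distances
    Returns an array where entry i is the control column matched to case i,
    or -1 if no unused control was left for that case.
    """
    dist = np.asarray(dist, dtype=float)
    n_cases, n_controls = dist.shape
    match = np.full(n_cases, -1)
    used = np.zeros(n_controls, dtype=bool)
    remaining = min(n_cases, n_controls)

    # Visit (case, control) pairs from the smallest distance to the largest.
    for flat in np.argsort(dist, axis=None, kind="stable"):
        i, j = divmod(int(flat), n_controls)
        if match[i] == -1 and not used[j]:
            match[i] = j
            used[j] = True
            remaining -= 1
            if remaining == 0:
                break
    return match

# Cases 0 and 1 are both closest to control 0, so plain nearest-neighbour
# matching would reuse it; the greedy pass gives case 1 its next-best control.
dist = [[1.0, 2.0, 9.0],
        [1.5, 3.0, 9.0],
        [8.0, 8.0, 0.5]]
print(greedy_unique_match(dist))
```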

Good luck,

Jordan