BookmarkSubscribe
Choose Language Hide Translation Bar
Highlighted
sagrim
Community Member

Joining Two Datasets Using Closest Date Match

Hello all.

 

I am attempting to join two datasets in JMP using a unique identifier plus a closest date match.  I have some code here that runs and is pretty close to what I need but I am having trouble with it keeping many to one matches.  One unique identifier "EMP_ID_UID" could (or could not) have several "Transaction Dates" and multiple "MED_SCAN_DTIMES" will match one "Transaction Date" (because Transaction Date is a date that someone clocked in for their shift and naturally a busy provider will likely administer more than one medication during a 12 hour shift).  So what I need is for the join to FIRST match on "EMP_ID_UID" and THEN match the "MED_SCAN_DTIME" to the most recent "Transacation Date" and for it to allow many "MED_SCAN_DTIME" to match a single "Transaction Date."  Right now, the code below matches the EMP_ID_UID and it does allow many transaction dates to match one EMP_ID_UID.  It also matches the closest dates BUT it will only allow one to one match of the dates.  It keeps the closest match only.  Is there a modification I can make on the join code that produce what I need?  Any thoughts or assistance will be greatly appreciated !   Thank you! 

 

SELECT *
FROM "Random Sample Trial" t1
LEFT OUTER JOIN "Subset of Relevent Unit Org Codes ONLY" t2
ON t1.EMP_ID_UID = t2.EMP_ID_UID AND t1.MED_SCAN_DTIME>t2.TransactionDate
WHERE t1.MED_SCAN_DTIME > t2.TransactionDate
GROUP BY t1.EMP_ID_UID,t1.MED_SCAN_DTIME
HAVING abs(t1.MED_SCAN_DTIME-t2.TransactionDate)=min(abs(t1.MED_SCAN_DTIME-t2.TransactionDate))
ORDER BY t1.EMP_ID_UID
;
quit;

 

I used this for a jumping off point on the code:

https://communities.sas.com/t5/SAS-Data-Management/Merging-two-tables-by-choosing-the-CLOSEST-dates/...

0 Kudos
1 REPLY 1
Byron_JMP
Staff

Re: Joining Two Datasets Using Closest Date Match

Just checking the details. 

It sounds like you have:

one table of clock in and out times for all the providers where the ID is the unique identifier

another table with ID's and SCAN times.

 

In the end It looks like you want to be able to see which (or how many) SCAN times happened during each clock in-out interval for each ID.

 

Is that close?

JMP Systems Engineer, Pharm and BioPharm Sciences
0 Kudos