cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
mann
Level III

Exact match and Nearest Match for updating a table with another table

I have a big data table (1M rows).  One column is serialNumber, one column is dateTime, one column is elapsedTime.  Another much smaller table also has these columns, but the dateTime is earlier by an small amount (5-20 minutes) and the elapsedTime is a much more accurate value.  The small table also has lots of extra rows associated with other events at other dateTimes (I do not have an 'event' column).  I want to update the big table with the accurate elapsedTimes by matching the serialNumbers and selecting the largest dateTime less than or equal to the big table's dateTime.  

This will be part of a bigger script that is run daily, updating each serialNumber each day.  

Right now, my best solution is to compromise on the accuracy and just select the average elapsedTime for the entire day, and using an exact match for the dateTime column.  This will introduce an inaccuracy when the small table includes multiple events at multiple times within a given day, but "within 12 hours" is still much more accurate than the big table is presently.

There is another solutions using a pair of nested foreachrows loops, but that will be very slow.

 

If you have an idea, please just suggest it, don't worry about writing the code (yet).

 

Thanks, 

 

Jonathan

3 REPLIES 3
txnelson
Super User

Re: Exact match and Nearest Match for updating a table with another table

I would suggest that you use something like;

theRows = dt2 << select rows where(  dt2:serialNumber == dt21:serialNumber[i] &
dt2:dateTime <= dt1:dateTime );

if( N Rows( theRows ) > 0, targetRow = theRows[N Rows( theRows )] );

 

Jim
mann
Level III

Re: Exact match and Nearest Match for updating a table with another table

TX, I am amazed by your speed and comprehension.  Your answer subtly fails to mention a slow for each SN loop. 

 

As I was finishing my post I had the beginings of this idea:

 

Subset the big table, selecting only the date window of the small table.

Concatenate the subset table onto the small table excluding the elapsedTime column.

Sort the concatenated table by SN & dateTime

Fill in the missing elapsedTimes with the previous row.

Update the big table with the concatenated table.

txnelson
Super User

Re: Exact match and Nearest Match for updating a table with another table

That is a method that I have used in the past, and it will run faster. But, look out for records that do not have any of the interleaved records following or preceeding them.

Jim