- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Remove Duplicate rows using Timestamp
Hello,
I have a sample data table attached wherein I am looking to remove duplicate rows within the data set using the TimeStamp,Tool columns.
Criteria for duplication:
Timestamp that are within 6mins on the same tool are considered as duplicates and those rows have to be removed.
Can someone help how this can be achieved in JSL?
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
I believe this will give you an example on how to proceed with your issue
Names Default To Here( 1 );
dt = Current Data Table();
// Loop through the data table deleting all duplicate rows
i = 1;
While( i <= N Rows( dt ),
// Find the upper and lower bounds to match on
lower = :timeStamp[i] - In Minutes( 6 );
upper = :timeStamp[i] + In Minutes( 6 );
// Find the rows that match
theRows = dt << get rows where( :timeStamp > lower & :timeStamp < upper );
// Remove the current row from the list. We do not want to delete it
Try( theRows[1] = [] );
// Delete the remaining rows
Try( dt << delete rows( theRows ) );
// Increment the row to work on
i++;
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
Hi @psundar6 ,
@txnelson provides a very good solution.
An alternative approach using table sort and column formula could be done interactively or by scripting as follows:
dt = current data table();
dt << Sort(
By( :TimeStamp, :Tool ),
Replace Table,
Order( Ascending, Ascending )
);
dt << New Column( "to delete",
Numeric,
"Ordinal",
Format( "Best", 12 ),
Formula( If( :Tool == Lag( :Tool, 1 ) & Date Difference( Lag( :TimeStamp, 1 ), :TimeStamp, "Minute" ) < 6, 1, 0 ), eval formula )
);
dt:to delete << suppress eval(true);
dt<< delete rows (dt<< get rows where (:to delete == 1));
let us know if it works for you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
I believe this will give you an example on how to proceed with your issue
Names Default To Here( 1 );
dt = Current Data Table();
// Loop through the data table deleting all duplicate rows
i = 1;
While( i <= N Rows( dt ),
// Find the upper and lower bounds to match on
lower = :timeStamp[i] - In Minutes( 6 );
upper = :timeStamp[i] + In Minutes( 6 );
// Find the rows that match
theRows = dt << get rows where( :timeStamp > lower & :timeStamp < upper );
// Remove the current row from the list. We do not want to delete it
Try( theRows[1] = [] );
// Delete the remaining rows
Try( dt << delete rows( theRows ) );
// Increment the row to work on
i++;
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
This works, Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
Hi @psundar6 ,
@txnelson provides a very good solution.
An alternative approach using table sort and column formula could be done interactively or by scripting as follows:
dt = current data table();
dt << Sort(
By( :TimeStamp, :Tool ),
Replace Table,
Order( Ascending, Ascending )
);
dt << New Column( "to delete",
Numeric,
"Ordinal",
Format( "Best", 12 ),
Formula( If( :Tool == Lag( :Tool, 1 ) & Date Difference( Lag( :TimeStamp, 1 ), :TimeStamp, "Minute" ) < 6, 1, 0 ), eval formula )
);
dt:to delete << suppress eval(true);
dt<< delete rows (dt<< get rows where (:to delete == 1));
let us know if it works for you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
Hello Ron,
I tried your solution and I see an issue with the result.
I have attached the result from running your code. It seems like it is not catching a couple of duplicate entries for some reason.
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Remove Duplicate rows using Timestamp
Thanks @psundar6 for letting me know. there was a typo now I have edited the script..