cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
psundar6
Level III

Remove Duplicate rows using Timestamp

Hello,

 

I have a sample data table attached wherein I am looking to remove duplicate rows within the data set using the TimeStamp,Tool columns.

Criteria for duplication:

Timestamp that are within 6mins on the same tool are considered as duplicates and those rows have to be removed.

Can someone help how this can be achieved in JSL?

 

Thanks!

 

@txnelson 

2 ACCEPTED SOLUTIONS

Accepted Solutions
txnelson
Super User

Re: Remove Duplicate rows using Timestamp

I believe this will give you an example on how to proceed with your issue

Names Default To Here( 1 );
dt = Current Data Table();

// Loop through the data table deleting all duplicate rows
i = 1;
While( i <= N Rows( dt ),
	
	// Find the upper and lower bounds to match on
	lower = :timeStamp[i] - In Minutes( 6 );
	upper = :timeStamp[i] + In Minutes( 6 );
	
	// Find the rows that match
	theRows = dt << get rows where( :timeStamp > lower & :timeStamp < upper );
	
	// Remove the current row from the list.  We do not want to delete it
	Try( theRows[1] = [] );
	
	// Delete the remaining rows
	Try( dt << delete rows( theRows ) );
	
	// Increment the row to work on
	i++;
);
Jim

View solution in original post

ron_horne
Super User (Alumni)

Re: Remove Duplicate rows using Timestamp

Hi @psundar6 ,

@txnelson provides a very good solution.

An alternative approach using table sort and column formula could be done interactively or by scripting as follows:

dt = current data table();

dt << Sort(
	By( :TimeStamp, :Tool ),
	Replace Table,
	Order( Ascending, Ascending )
);

dt << New Column( "to delete",
	Numeric,
	"Ordinal",
	Format( "Best", 12 ),
	Formula( If( :Tool == Lag( :Tool, 1 ) & Date Difference( Lag( :TimeStamp, 1 ), :TimeStamp, "Minute" ) < 6, 1, 0 ), eval formula )
);
dt:to delete << suppress eval(true);

dt<< delete rows (dt<< get rows where (:to delete == 1));

let us know if it works for you.

View solution in original post

5 REPLIES 5
txnelson
Super User

Re: Remove Duplicate rows using Timestamp

I believe this will give you an example on how to proceed with your issue

Names Default To Here( 1 );
dt = Current Data Table();

// Loop through the data table deleting all duplicate rows
i = 1;
While( i <= N Rows( dt ),
	
	// Find the upper and lower bounds to match on
	lower = :timeStamp[i] - In Minutes( 6 );
	upper = :timeStamp[i] + In Minutes( 6 );
	
	// Find the rows that match
	theRows = dt << get rows where( :timeStamp > lower & :timeStamp < upper );
	
	// Remove the current row from the list.  We do not want to delete it
	Try( theRows[1] = [] );
	
	// Delete the remaining rows
	Try( dt << delete rows( theRows ) );
	
	// Increment the row to work on
	i++;
);
Jim
psundar6
Level III

Re: Remove Duplicate rows using Timestamp

This works, Thank you!

ron_horne
Super User (Alumni)

Re: Remove Duplicate rows using Timestamp

Hi @psundar6 ,

@txnelson provides a very good solution.

An alternative approach using table sort and column formula could be done interactively or by scripting as follows:

dt = current data table();

dt << Sort(
	By( :TimeStamp, :Tool ),
	Replace Table,
	Order( Ascending, Ascending )
);

dt << New Column( "to delete",
	Numeric,
	"Ordinal",
	Format( "Best", 12 ),
	Formula( If( :Tool == Lag( :Tool, 1 ) & Date Difference( Lag( :TimeStamp, 1 ), :TimeStamp, "Minute" ) < 6, 1, 0 ), eval formula )
);
dt:to delete << suppress eval(true);

dt<< delete rows (dt<< get rows where (:to delete == 1));

let us know if it works for you.

psundar6
Level III

Re: Remove Duplicate rows using Timestamp

Hello Ron,

I tried your solution and I see an issue with the result.

I have attached the result from running your code. It seems like it is not catching a couple of duplicate entries for some reason.

Thanks. 

ron_horne
Super User (Alumni)

Re: Remove Duplicate rows using Timestamp

Thanks @psundar6 for letting me know. there was a typo now I have edited the script..