cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Ohad_s
Level III

find duplicates and drop by state

Hi,

I have a data table that contains duplicates.

I want to find the duplicates.

duplicates are rows with the same, Col, Row, Scan number, and ID values.

then, for duplicates with the same col, row, scan number, and ID values I want to keep only those with max abs offsetX (i only have the OffsetX values, need to run Abs formula )

each Col, Row, and Scan number should have 4 rows with ID 0-3.

my final table will include all unique results+ results after removing duplicates.

 

2 ACCEPTED SOLUTIONS

Accepted Solutions
jthi
Super User

Re: find duplicates and drop by state

I'm not 100% sure if this will work or not as I didn't test it

  1. Create new column for Abs(OffsetX). You should be able to do this by selecting the column and right clicking
  2. Sort descending by the new column
  3. Select Col, Row, Scan and ID columns
  4. Go to Rows / Row Selection and Select Duplicate Rows
  5. You should have now the rows selected which you wish to delete
-Jarmo

View solution in original post

txnelson
Super User

Re: find duplicates and drop by state

Here is a script that uses the Summary Platform to find the unique combinations of Col, Row, ID and Scan Number.  It also uses a little trick to find the Maximum deviation from zero value for the OffssetX column.  However, it is not finding a consistent number of rows for the unique combinations.

 

txnelson_0-1690138874973.png

Names Default To Here( 1 );
dt = Data Table( "raw data-start" );
Data Table( "raw data-start" ) << Summary(
	Group( :Col, :Row, :ID, :Scan Number ),
	Min( :OffsetX ),
	Max( Transform Column( "Abs OffsetX", Formula( Abs( :OffsetX ) ) ) ),
	Freq( "None" ),
	Weight( "None" ),
	Link to Original Data Table( 0 ),
	statistics column name format( "column" )
);

For Each Row(
	If( :OffsetX > 0 & Abs( :OffsetX ) < :Abs OffsetX,
		:OffsetX = :Abs OffsetX
	)
);

// dt << delete columns( :Abs OffsetX );
Jim

View solution in original post

3 REPLIES 3
jthi
Super User

Re: find duplicates and drop by state

I'm not 100% sure if this will work or not as I didn't test it

  1. Create new column for Abs(OffsetX). You should be able to do this by selecting the column and right clicking
  2. Sort descending by the new column
  3. Select Col, Row, Scan and ID columns
  4. Go to Rows / Row Selection and Select Duplicate Rows
  5. You should have now the rows selected which you wish to delete
-Jarmo
txnelson
Super User

Re: find duplicates and drop by state

Here is a script that uses the Summary Platform to find the unique combinations of Col, Row, ID and Scan Number.  It also uses a little trick to find the Maximum deviation from zero value for the OffssetX column.  However, it is not finding a consistent number of rows for the unique combinations.

 

txnelson_0-1690138874973.png

Names Default To Here( 1 );
dt = Data Table( "raw data-start" );
Data Table( "raw data-start" ) << Summary(
	Group( :Col, :Row, :ID, :Scan Number ),
	Min( :OffsetX ),
	Max( Transform Column( "Abs OffsetX", Formula( Abs( :OffsetX ) ) ) ),
	Freq( "None" ),
	Weight( "None" ),
	Link to Original Data Table( 0 ),
	statistics column name format( "column" )
);

For Each Row(
	If( :OffsetX > 0 & Abs( :OffsetX ) < :Abs OffsetX,
		:OffsetX = :Abs OffsetX
	)
);

// dt << delete columns( :Abs OffsetX );
Jim
Ohad_s
Level III

Re: find duplicates and drop by state

Thank you, this works fine!