- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
find duplicates and drop by state
Hi,
I have a data table that contains duplicates.
I want to find the duplicates.
duplicates are rows with the same, Col, Row, Scan number, and ID values.
then, for duplicates with the same col, row, scan number, and ID values I want to keep only those with max abs offsetX (i only have the OffsetX values, need to run Abs formula )
each Col, Row, and Scan number should have 4 rows with ID 0-3.
my final table will include all unique results+ results after removing duplicates.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: find duplicates and drop by state
I'm not 100% sure if this will work or not as I didn't test it
- Create new column for Abs(OffsetX). You should be able to do this by selecting the column and right clicking
- Sort descending by the new column
- Select Col, Row, Scan and ID columns
- Go to Rows / Row Selection and Select Duplicate Rows
- You should have now the rows selected which you wish to delete
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: find duplicates and drop by state
Here is a script that uses the Summary Platform to find the unique combinations of Col, Row, ID and Scan Number. It also uses a little trick to find the Maximum deviation from zero value for the OffssetX column. However, it is not finding a consistent number of rows for the unique combinations.
Names Default To Here( 1 );
dt = Data Table( "raw data-start" );
Data Table( "raw data-start" ) << Summary(
Group( :Col, :Row, :ID, :Scan Number ),
Min( :OffsetX ),
Max( Transform Column( "Abs OffsetX", Formula( Abs( :OffsetX ) ) ) ),
Freq( "None" ),
Weight( "None" ),
Link to Original Data Table( 0 ),
statistics column name format( "column" )
);
For Each Row(
If( :OffsetX > 0 & Abs( :OffsetX ) < :Abs OffsetX,
:OffsetX = :Abs OffsetX
)
);
// dt << delete columns( :Abs OffsetX );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: find duplicates and drop by state
I'm not 100% sure if this will work or not as I didn't test it
- Create new column for Abs(OffsetX). You should be able to do this by selecting the column and right clicking
- Sort descending by the new column
- Select Col, Row, Scan and ID columns
- Go to Rows / Row Selection and Select Duplicate Rows
- You should have now the rows selected which you wish to delete
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: find duplicates and drop by state
Here is a script that uses the Summary Platform to find the unique combinations of Col, Row, ID and Scan Number. It also uses a little trick to find the Maximum deviation from zero value for the OffssetX column. However, it is not finding a consistent number of rows for the unique combinations.
Names Default To Here( 1 );
dt = Data Table( "raw data-start" );
Data Table( "raw data-start" ) << Summary(
Group( :Col, :Row, :ID, :Scan Number ),
Min( :OffsetX ),
Max( Transform Column( "Abs OffsetX", Formula( Abs( :OffsetX ) ) ) ),
Freq( "None" ),
Weight( "None" ),
Link to Original Data Table( 0 ),
statistics column name format( "column" )
);
For Each Row(
If( :OffsetX > 0 & Abs( :OffsetX ) < :Abs OffsetX,
:OffsetX = :Abs OffsetX
)
);
// dt << delete columns( :Abs OffsetX );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: find duplicates and drop by state
Thank you, this works fine!