cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
DTDummy99
Level I

How do I delete duplicate rows?

The deletion of duplicate rows is starting to frustrate me! I'm on version 18.

In the following script I have Sorted the table on SN (asc), Date & Time (desc). I then create a New Column "DuplicateFlag" and then apply the Formula If( Lag(:SN, 1) == :SN, 1, 0 ). This adds 0 for no duplicate and 1 for duplicate. We then Select the rows = to 1 and Delete the Selected Rows.

This appears to work great when the Date and Time are of different values for the associated SN. However, when the Date & Time are the same for the associated SN it does not delete the duplicate SN.

What suggestions to ensure that all duplicate SN's are successfully deleted.


// Sort the SIDE_1_PRE_MTG table by SN (ascending), Date (descending), and Time (descending)
dt_sort = dt_erroneousdelete << Sort(
    By( :SN, :Date, :Time ),
    Order( Ascending, Descending, Descending ),
    Replace Table
);

// Add a DuplicateFlag column to mark duplicate SN entries
dt_sort << New Column("DuplicateFlag",
    Numeric,
    "Continuous",
    Formula(
        If( Lag(:SN, 1) == :SN, 1, 0 )
    )
);

// Select rows where DuplicateFlag is 1 and delete them
dt_sort << Select Where( :DuplicateFlag == 1 );
dt_sort << Delete Rows; // Deletes the selected rows

// Remove the DuplicateFlag column after deleting duplicates
dt_sort << Delete Columns("DuplicateFlag");
4 REPLIES 4

Re: How do I delete duplicate rows?

I'm not sure I'm following exactly what you're looking for. Would something like this work, but modified for your data table?

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select duplicate rows( Match( :age, :height ) );
dt << Delete rows ();
DTDummy99
Level I

Re: How do I delete duplicate rows?

I did try the simplistic suggestion you offered but it didn't work either.

The problem seems to when the 3 columns that I sort by.....
SN in Ascending order, Date in Descending order and Time in Descending order (this puts the most recent record by SN on top or first in-line), are all the same values. Per the attached screen shot you can see the SN, Date & Time are all the same values per associated SN (I color coded each matching set)..

jthi
Super User

Re: How do I delete duplicate rows?

If I understand correctly what you are trying to do, Select Duplicate Rows should work https://www.jmp.com/support/help/en/18.0/#page/jmp/select-rows.shtml#. Can you provide a dataset (mockup is most likely fine) where you mark which rows you wish to delete.

-Jarmo
txnelson
Super User

Re: How do I delete duplicate rows?

Are the Date and Time columns JMP numeric Date/Time columns?  If so, the displayed values for the Date column may display the same, but the actual numeric values(number of seconds since 01JAN1904) may be different.  Change the display format for the Date column to Fixed Dec, 20, 4 and see if there are differences. 

Jim