Level II

How to eliminate the first duplicate row?

Dear community, I have a question: how can I eliminate the first row of each set of duplicates?

Example

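| Nr | Date | Count |
|---|---|---|
| 4567 | 2017-01-21 | 1 |
| **5555** | **2017-02-03** | **1** |
| 5555 | 2017-02-10 | 2 |
| 8745 | 2015-03-10 | 1 |
| 8345 | 2016-05-01 | 1 |
| **9563** | **2016-01-02** | **1** |
| 9563 | 2016-01-10 | 2 |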

Is there a script that can find the duplicates I have marked in bold in this example, and either delete those rows directly or create an additional column that labels each row, for example, Keep or Eliminate?

I am using JMP Pro 14.0.0 (64-bit)

Sincerely yours

Lars Enochsson, M.D., Ph.D.

Professor of Surgery

Department of Surgical and Perioperative Sciences

Umeå University

Head of the Swedish Registry of Gallstone Surgery and ERCP, GallRiks

Scientific Secretary of the Swedish Surgical Society

E-mail: lars.enochsson@umu.se

Accepted Solutions
Super User

Re: How to eliminate the first duplicate row?

This script should work too:

```jsl
// Delete all duplicate rows except the one with the latest date (row order does not matter).
dt = Current Data Table();
dt << Select Where( Col Max( :Date, :Nr ) > :Date );
dt << Delete Rows;
```

Edit: If duplicates can share the same date, this version works better, but it requires the table to be sorted by date.

```jsl
dt << Select Where( Col Max( Row(), :Nr ) > Row() );
dt << Delete Rows;
```
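The original question also asked for a column that labels rows instead of deleting them. The same `Col Max` comparison can probably be reused for that. This is only a minimal sketch: it assumes the table is sorted by date within each Nr (as in the example), and the column name "Action" is made up for illustration.

```jsl
// Sketch only: label rows instead of deleting them.
// Assumes rows are in date order, so the last row of each :Nr group is the one to keep.
dt = Current Data Table();
dt << New Column( "Action", // hypothetical column name
	Character,
	"Nominal",
	Formula( If( Col Max( Row(), :Nr ) > Row(), "Eliminate", "Keep" ) )
);
```

On the example table this should label rows 2 and 6 (the earlier 5555 and 9563 rows) as Eliminate and all other rows as Keep. The labels can then be checked by eye before any rows are deleted.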

7 REPLIES
Super User

Re: How to eliminate the first duplicate row?

Hi Lars,

Will duplicates always be sequentially ordered, or could there be other records between them? If they are always sequential, as in your example, you could do something like this:

```jsl
dt = Current Data Table();
del_rows = {}; // list of row numbers that are duplicates

// Stop at the second-to-last row so that :Nr[i + 1] never reads past the last row
For( i = 1, i < N Row( dt ), i++,
	// If Nr in the current row matches the next row, mark the current (earlier) row
	If( :Nr[i] == :Nr[i + 1], Insert Into( del_rows, i ) )
);

// Delete all rows in del_rows, but only if any duplicates were found
If( N Items( del_rows ) > 0, dt << Delete Rows( del_rows ) );
```

-- Cameron Willden
Super User

Re: How to eliminate the first duplicate row?

Here is a script that will get the job done:

```jsl
Names Default To Here( 1 );

// Build the example table from the question
dt = New Table( "Example",
	New Script(
		"Source",
		Data Table( "Transpose of Untitled 17" ) <<
		Subset( All rows, Selected columns only( 0 ) )
	),
	New Column( "Nr",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [4567, 5555, 5555, 8745, 8345, 9563, 9563] )
	),
	New Column( "Date",
		Numeric,
		"Continuous",
		Format( "yyyy-mm-dd", 12 ),
		Input Format( "yyyy-mm-dd" ),
		Set Values(
			[3567801600, 3568924800, 3569529600, 3508790400, 3544905600, 3534537600,
			3535228800]
		)
	),
	New Column( "Count",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 1, 2, 1, 1, 1, 2] )
	)
);

// Select the repeated occurrences of each Nr
dt << Select Duplicate Rows( Match( :Nr ) );
Wait( 5 ); // pause so the selection can be seen
// Copy the selected rows to a hidden table, then delete them from the original
dt2 = dt << Subset( Invisible, Selected Rows( 1 ), Selected Columns( 0 ) );
dt << Delete Rows;
Wait( 5 );
// Overwrite the remaining rows with the values from the copied rows, matched on Nr
dt = dt << Update( With( dt2 ), Match Columns( :Nr = :Nr ) );

Close( dt2, NoSave );
```
Jim
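If the goal is only to drop the earlier row of each Nr, the same `Select Duplicate Rows` message might be usable without the subset and update steps. The following is a sketch, not a tested replacement for the script above. It assumes that `Select Duplicate Rows` leaves the first occurrence of each Nr unselected, and that sorting the table in place with `Replace Table` is acceptable (it reorders the rows and may not be allowed if reports are linked to the table).

```jsl
// Sketch only: keep the latest row per Nr using Select Duplicate Rows.
dt = Current Data Table();

// Put the latest date first within each Nr, so that it counts as the first occurrence
dt << Sort( By( :Nr, :Date ), Order( Ascending, Descending ), Replace Table );

// Assumption: this selects every occurrence of an Nr after the first one
dt << Select Duplicate Rows( Match( :Nr ) );
dt << Delete Rows;
```

Like the script above, this relies on `Select Duplicate Rows`, which the original poster's JMP 14 supports.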
Level II

Re: How to eliminate the first duplicate row?

Thanks for the quick reply. This script really did the trick. There are not so many people in Sweden using JMP, at least not in the academic world. Usually they use SPSS or Stata. However, with this active community there is no reason to change. Even our statistician up here at Umeå University is impressed. /Lars
Super User

Re: How to eliminate the first duplicate row?

Well, I appreciate the "Thanks". I feel even better about responding to you, now that I know you are a fellow Scandinavian. My great-grandfather, Nels Knutson, immigrated to America from Bergen, Norway.
Jim
Super User

Re: How to eliminate the first duplicate row?

I actually first encountered JMP in 1994 at Uppsala University, Sweden, and have used it ever since. Even if JMP is still not very commonly used in Swedish academia, I have "converted" quite a few SPSS users along the way.

Agree, this community is great.

Super User

Re: How to eliminate the first duplicate row?

@ms,

Your years of experience and knowledge with JMP are obvious in your Community Discussion responses. I really appreciate your involvement in the Community.

Jim