LarsBirger
Level III

How to eliminate the first duplicate row?

Dear community, I have a question: how can I eliminate the first row of each pair of duplicates?

 

Example

 

Nr    Date        Count
4567  2017-01-21  1
5555  2017-02-03  1
5555  2017-02-10  2
8745  2015-03-10  1
8345  2016-05-01  1
9563  2016-01-02  1
9563  2016-01-10  2

 

Is there a script that, in this example, finds the duplicates that I have marked here with bold text and either eliminates those rows directly or creates an additional column stating, for example, Keep or Eliminate?

 

I am using JMP Pro 14.0.0 (64-bit)

 

Sincerely yours

 

 

Lars Enochsson, M.D., Ph.D.

Professor of Surgery

Department of Surgical and Perioperative Sciences

Umeå University

Head of the Swedish Registry of Gallstone Surgery and ERCP, GallRiks

Scientific Secretary of the Swedish Surgical Society

E-mail: lars.enochsson@umu.se

 

1 ACCEPTED SOLUTION

ms
Super User (Alumni)

Re: How to eliminate the first duplicate row?

This script should work too:

 

// Delete all duplicate rows except the one with the latest date (row order does not matter).
dt = Current Data Table();
dt << Select Where(Col Max(:Date, :Nr) > :Date);
dt << Delete rows;

Edit: If duplicates can have the same date, this works better, but the table must first be sorted by date.

dt << Select Where(Col Max(Row(), :Nr) > Row());
dt << Delete Rows;
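
The original question also asked for a way to flag rows as Keep or Eliminate instead of deleting them directly. A minimal sketch of that variant, reusing the same Col Max() by-group comparison as the first script above, could look like this. The column name "Action" is an assumption chosen for illustration, and the sketch has the same limitation with tied dates as the first script: rows in a group that share the latest date are all marked Keep.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();

// Flag rows instead of deleting them. Within each :Nr group, a row whose
// :Date is earlier than the group's latest date is marked "Eliminate";
// the row(s) holding the latest date are marked "Keep".
// "Action" is a hypothetical column name, not part of the original table.
dt << New Column( "Action",
	Character,
	"Nominal",
	Formula( If( Col Max( :Date, :Nr ) > :Date, "Eliminate", "Keep" ) )
);

// Optional second step, to be run only after reviewing the flags:
// dt << Select Where( :Action == "Eliminate" );
// dt << Delete Rows;
```

On the example table this should flag the 5555 row dated 2017-02-03 and the 9563 row dated 2016-01-02 as Eliminate, which are the same rows the first script would delete.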

 


7 REPLIES
cwillden
Super User (Alumni)

Re: How to eliminate the first duplicate row?

Hi Lars,

Will duplicates always be sequentially ordered, or could there be other records between them? If they are always sequential like in your example, you could do something like this:

dt = Current Data Table();
del_rows = {}; // initialize the list of duplicate rows to delete

// Stop at the second-to-last row so that i+1 never points past the end of the table
for(i = 1, i < N Row(dt), i++,
	if(:Nr[i] == :Nr[i+1], insert into(del_rows,i)) // if Nr for the current row is the same as for the next row, put the current row in del_rows
);

dt << Delete Rows(del_rows); //delete all rows in del_rows

 

-- Cameron Willden
txnelson
Super User

Re: How to eliminate the first duplicate row?

Here is a script that will get the job done:

Names Default To Here( 1 );
dt = New Table( "Example",
	Add Rows( 7 ),
	New Script(
		"Source",
		Data Table( "Transpose of Untitled 17" ) <<
		Subset( All rows, Selected columns only( 0 ) )
	),
	New Column( "Nr",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [4567, 5555, 5555, 8745, 8345, 9563, 9563] )
	),
	New Column( "Date",
		Numeric,
		"Continuous",
		Format( "yyyy-mm-dd", 12 ),
		Input Format( "yyyy-mm-dd" ),
		Set Values(
			[3567801600, 3568924800, 3569529600, 3508790400, 3544905600, 3534537600,
			3535228800]
		)
	),
	New Column( "Count",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 1, 2, 1, 1, 1, 2] )
	)
);

dt << select duplicate rows( Match( :Nr ) );
wait(5);
dt2 = dt << subset( invisible, selected rows( 1 ), selected columns( 0 ) );
dt << delete rows;
Wait(5);
dt = dt << Update( With( dt2 ), Match Columns( :Nr = :Nr ) );

Close( dt2, nosave );
Jim

LarsBirger
Level III

Re: How to eliminate the first duplicate row?

Thanks for the quick reply. This script really did the trick. There are not so many people in Sweden using JMP, at least not in the academic world. Usually they use SPSS or Stata. However, with this active community there is no reason to change. Even our statistician up here at Umeå University is impressed. /Lars
txnelson
Super User

Re: How to eliminate the first duplicate row?

Well, I appreciate the "Thanks". I feel even better about responding to you, now that I know you are a fellow Scandinavian. My great-grandfather, Nels Knutson, immigrated to America from Bergen, Norway.
Jim
ms
Super User (Alumni)

Re: How to eliminate the first duplicate row?

I actually first encountered JMP in 1994 at Uppsala University, Sweden, and have used it ever since. Even if JMP is still not very commonly used in Swedish academia, I have "converted" quite a few SPSS users along the way.

 

Agree, this community is great.

txnelson
Super User

Re: How to eliminate the first duplicate row?

@ms ,

Your years of experience and knowledge with JMP are obvious in your Community Discussion responses. I really appreciate your involvement in the Community.

 

Jim
