gail_massari

Community Manager

Joined:

Feb 27, 2013

Is it possible to find and get rid of duplicate rows easily?

Olivia Lippincott shares a tip from Byron Wingerd for an easy way to find and remove duplicate rows: join the table to itself (Tables > Join), match all columns, drop multiples for both the Main and With tables, and then save the joined table under a new name. (Saving under a new name is almost always better than renaming or changing the table in place, because it lets you examine the changes and ensure they reflect what you really wanted to do.)
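For readers who want to see the end result of the self-join described above, here is a minimal sketch of the equivalent logic. It is plain Python, not JMP or JSL, and the function name `drop_duplicate_rows` and the sample table are hypothetical, made up for illustration. It assumes a "duplicate" means a row whose values match another row in every column, and it keeps the first occurrence of each; the row order JMP produces after a join may differ.

```python
# Illustration only: plain Python, not JMP/JSL.
# drop_duplicate_rows and the sample data are hypothetical.

def drop_duplicate_rows(rows):
    """Return a new list with exact duplicate rows removed.

    Rows are compared on all columns, which mirrors matching
    every column in the self-join. The input list is not modified,
    in the same spirit as saving the result under a new name.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # every column takes part in the match
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


table = [
    ("Alice", 34, "NY"),
    ("Bob", 27, "CA"),
    ("Alice", 34, "NY"),  # exact duplicate of the first row
    ("Bob", 28, "CA"),    # differs in one column, so it is kept
]

deduped = drop_duplicate_rows(table)
print(len(table), "rows before,", len(deduped), "rows after")
```

Because the result is a separate list, the original table can still be compared against it to confirm that only the intended rows were dropped.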


Comments
WHTseng

Easy, efficient, and clear.

Thank you, gail_massari.

vhuac

Hi Gail,

 

This method is nice, but it may not work for a large data table with more than 50 million rows or more than 10 GB in file size. The reason is that it creates an additional instance of the data table, which can use up all the memory and cause the computer to become very slow or even hang and crash.

 

Is there a more efficient way to remove duplicates that does not create an additional table and does not take too long?
