Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Weeding out duplicates 1

Status
Not open for further replies.

mbaranski

Programmer
Nov 3, 2000
421
US
I have a table in mysql that has about 60 columns. It has about 50,000 entries in it. For each entry, there are 3 almost duplicate records. By almost duplicate I mean that 57 out of the 60 columns are the same for the 3 rows. I want to be able (with temporary tables and select statements) to only have 1 row for each "almost duplicate" entry. That is, a record is considered a duplicate if and only if all of the fields except the access_date, source_host, and access_time are the same. Also, I want to keep the record (1 out of 3) with the largest access_date.

I am not so much an sql person, and would greatly appreciate any help you can give. I would think that this is doable fairly easily, I just don't know how and need it more quickly than I can learn it

Thanks!
Mike B. As always, I hope that helped!

Disclaimer:
Beware: Studies have shown that research causes cancer in lab rats.
 
I don't know mysql, but I know 2 ways of deleting duplicate rows. First (in Oracle), if you have allowed subqueries in mysql:
DELETE FROM mytable a
WHERE
EXISTS(SELECT * FROM mytable b
WHERE a.field1=b.field1
AND a.field2=b.field2
.............................
AND a.field57=b.field57
AND a.access_date<b.access_date
;

Second, using an ordered cursor (order by access_date), you have to compare current record with next record and, if they're &quot;almost duplicate&quot;, to delete the one with the lower access_date. And again, till the end of the cursor.

Note: You have to use the time component of the date datatype.


Eduard Stoleru
e_stoleru@yahoo.com

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top