Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Deleting duplicate records from large table

Status
Not open for further replies.

scabral79

MIS
May 16, 2007
25
US
Hi,

i have a very large table (over 200 million rows) that i need to delete duplicate records from. Here a sample of the table:

ID CustNum QuestionID Response SystemID
--- ------ ---------- -------- --------
1 100 5 Yes 87
2 100 5 Yes 87
3 100 6 Yes 89
4 200 3 No 87
5 200 3 No 87

From the above sample, i would need to delete record ID's 2 and 5, since these are duplicate records of ID's 1 and 4. Since this is a very large table, anything i try to do with it takes forever and eats up transaction log space.

Does anyone have an idea on the most efficient way to find and remove these duplicate records?

Thanks
scott
 
partition your deletes, e.g. delete all duplicates where id between 1 and 1000000, then 1000001 and 2000000...

at the end of each delete, you can clear down the log, or set your log level to simple or bulk...

--------------------
Procrastinate Now!
 
Here's an FAQ to help

It is most important that you set a unique index on those fields once you are done cleaning the data, so that the problem will not re-occur.

"NOTHING is more important in a database than integrity." ESquared
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top