Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

delete duplicate rows from large table

Status
Not open for further replies.

MikeM2468

IS-IT--Management
Apr 5, 2011
100
US
I have two tables table1 and table2. Each is over 1 million rows. Table2 has some records that are also in table1. I need to delete the rows from table2 that are also in table1. I've tried this:

Code:
delete * from table2 where record in (select * from table1 where record like "1%");

Both tables only have one column called record. If I run the above command it just chugs for hours and never finishes.
 
Hi

Avoid [tt]in[/tt] with sub-[tt]select[/tt]. Take advantage of MySQL's [tt]delete[/tt] syntax which allows [tt]join[/tt] with other tables :
Code:
[b]delete[/b] [b]from[/b] table2

[b]using[/b] table1
[b]inner[/b] [b]join[/b] table2

[b]where[/b] table1[teal].[/teal]record[teal]=[/teal]table2[teal].[/teal]record
[b]and[/b] table1[teal].[/teal]record [b]like[/b] [green][i]'1%'[/i][/green][teal];[/teal]

Feherke.
[link feherke.github.com/][/url]
 
I get the same results with this query. Just runs forever and never does anything.
 
Hi

There is another way, in two steps. Should also be faster, if the previous one not helped, probably neither this one will. This supposes that table2.record has no primary or unique key set :
Code:
[b]update[/b] table2[teal],[/teal]table1

[b]set[/b] table2[teal].[/teal]record[teal]=[/teal][b]null[/b]

[b]where[/b] table1[teal].[/teal]record[teal]=[/teal]table2[teal].[/teal]record
[b]and[/b] table1[teal].[/teal]record [b]like[/b] [green][i]"1%"[/i][/green][teal];[/teal]

[b]delete[/b] [b]from[/b] table2

[b]where[/b] record [b]is[/b] [b]null[/b][teal];[/teal]

Feherke.
[link feherke.github.com/][/url]
 
I think it's doing the same thing. What can I do with my data to make it work? Any tuning I can do the the server? Indexing? I'm running this in dev on Windows 7 64-bit 8GB RAM, MySQL 5.5.12. The two tables have just one column containing a list of words with a maximum length of 20 characters. Table2 has a bunch of entries that should already be in table1. I need to delete the duplicates from table2. I'm running the queries from the mysql CLI.
 
Hi

MySQL knowledge is mostly theoretical as I use it rarely. So this may be plain stupid idea, but first I would try to check the database for damages. ( I think [tt]mysqlcheck[/tt] serves that purpose. )

Feherke.
[link feherke.github.com/][/url]
 
Try this:

Code:
delete from table2, table1 where exists 
  (select 1 from table1 where table2.record = table1.record);

Regards,

Ed
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top