Hi All,
I came across this code to delete duplicate rows from a table in SQL 2000. I have tested it and it seems to do the trick, but I don't understand how it does it.
Basically there is a table with a unique field, in this case [EmployeeID], then we have numerous other columns that we want to check for duplicates and then delete the entire rows/s.
Can anyone explain how it does it? As I mentioned I have tested it on small sample data, but now I would like to implement it on live data (circa 30 million rows)
Thanks in advance.
I came across this code to delete duplicate rows from a table in SQL 2000. I have tested it and it seems to do the trick, but I don't understand how it does it.
Basically there is a table with a unique field, in this case [EmployeeID], then we have numerous other columns that we want to check for duplicates and then delete the entire rows/s.
Code:
Delete table_name
where [EmployeeID] not in
(select min([EmployeeID]) from table_name
group by column1, column2, etc);
Can anyone explain how it does it? As I mentioned I have tested it on small sample data, but now I would like to implement it on live data (circa 30 million rows)
Thanks in advance.