Technique to remove duplicates 1

bholm · Dec 4, 2001

What tecnique do you use to remove duplicates?
Let's say I have data set made up of the following columns:
id subid userid
1 27 3872
2 32 3872
3 51 3872
4 16 4218
5 19 5211

The subid is an id for the magazine subscription for that user.
But now let's say that duplicates have somehow gotten into the data...

id subid userid
1 27 3872
2 27 3872
3 32 3872
4 32 3872
5 51 3872
6 16 4218
7 16 4218
8 19 5211
9 19 5211
10 19 5211

What kind of SQL query would select just the extras (not the originals) so I can remove the dups using a DELETE command?

I can identify the userid's and subid's that have dupes but using that query data set to remove would remove everything.
Also note that there may be more than 1 duplicate as in the case of userid=5211 subid=19.

karluk · Dec 4, 2001

It looks as if the id column is a unique identifier for the table. If so, I think something like the following should work:

delete from your_table
where id not in
(select original_id from
(select min(id) as original_id, subid, userid from your_table
group by subid, userid)
)
;

Please keep in mind that deleting from a table is potentially dangerous. If you decide to try this suggestion, please make sure to back up your data and test thoroughly.

bholm · Dec 4, 2001

Thanks so much, Karluk.
I tried it and it eventually came up with the right results but it took forever. Someone on another forum gave me this script which works pretty fast and also gets the right results. It's very similar. But THANKS anyway!!!

delete from my_table
where subid || userid in
(select subid || userid from my_table
group by subid || userid
having count(*) > 1)
and id not in
(select min(id) from my_table
group by subid || userid
having count(*) > 1)

KALYANIDRV · Jan 2, 2002

Hi bholm,
Please find herewith attaching my query with solution. It is the best one.

DELETE FROM my_table WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM my_table GROUP BY SUBID,USERID);

Please let me know on the same and please let me the feedback to the following id.

ravich_74@hotmail.com

Thanks,
Ravichandran,India.

nikhilwani · Jan 2, 2002

Type the query
Select row_id ,stud_id
from table name

Then get the row id of that row which you want to delete
then
do
delete table name
where row_id =
hope this will solve your problem

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Technique to remove duplicates 1

bholm

Programmer

karluk

MIS

bholm

Programmer

KALYANIDRV

Programmer

nikhilwani

Technical User

Similar threads

Part and Inventory Search

Sponsor