Deleting all but one of duplicate records

AccessAddict · Aug 20, 2003

Hi

I've just looked at a db built by a family friend, and discovered many duplicate entries in a table (amongst other things lol). I've created a Find Duplicate query and then a delete query based on the Find Duplicate query, but that will obviously delete ALL occurences of the duplicate data when I need to retain just ONE instance of the record. The table contains a lot of records so going through it manually would be quite a task.

If the Find Duplicate query finds eg ID no's 34, 451, 934 as records containing the duplicate data, how can I delete ONLY record numbers 34 and 451 leaving 934 intact?

The queries at the moment are

Code:

SELECT tblMailShot.Surname, tblMailShot.HouseNo, tblMailShot.Postcode, tblMailShot.MailShotID 
FROM tblMailShot 
WHERE (((tblMailShot.Surname) In (SELECT [Surname] FROM [tblMailShot] As Tmp GROUP BY [Surname],[HouseNo],[Postcode] HAVING Count(*)>1 And [HouseNo] = [tblMailShot].[HouseNo] And [Postcode] = [tblMailShot].[Postcode]))) 
ORDER BY tblMailShot.Surname, tblMailShot.HouseNo, tblMailShot.Postcode;

And

Code:

DELETE qryFindDupForMailShot.MailShotID 
FROM qryFindDupForMailShot;

Of course, fixing the problem as to how the duplicates got there in the first place is another story - or headache lol.

Thanks for your time

Cheers

Alan

Rolliee · Aug 20, 2003

Just open a recordset and go thru it and delete the dupes. Do you wish the earliest, the laterest or does not matter?

This code would do it.

private sub command0_click()
Dim rs as dao.recordset, ar(10) as variant, i as integer
DIM okay as boolean

set rs = currentdb.openrecordset("SELECT * From My Table ORDER BY [LNmae],[FName],[Addr];&quot

rs.movelast
rs movefirst
okay = true
for i = 0 to 9
ar(i) = rs.fields(i)
next
rs.movenext
do while not rs.eof
for i = 0 to 9
okay = okay & (rs.fields(i) = ar(i))
next i
if okay rs.delete
for i = 0 to 9
ar(i) = rs.fields(i)
next i
rs.movenext
loop
rs.close
set rs = nothing
exit sub

Rolliee · Aug 20, 2003

Just open a recordset and go thru it and delete the dupes. Do you wish the earliest, the laterest or does not matter?

This code would do it.

private sub command0_click()
Dim rs as dao.recordset, ar(10) as variant, i as integer
DIM okay as boolean

set rs = currentdb.openrecordset("SELECT * From My Table ORDER BY [LNmae],[FName],[Addr];&quot

rs.movelast
rs movefirst
okay = true
for i = 0 to 9
ar(i) = rs.fields(i)
next
rs.movenext
do while not rs.eof
okay = true
for i = 0 to 9
okay = okay & (rs.fields(i) = ar(i))
next i
if okay rs.delete
for i = 0 to 9
ar(i) = rs.fields(i)
next i
rs.movenext
loop
rs.close
set rs = nothing
exit sub

garfunkel41 · Aug 20, 2003

Hi Alan,

This seems like a really non-programming and non-thinking answer, but why don't you just make the find duplicates query into a make table query and change the criteria for counting the duplicate fields to having count > 0?

This would create a duplicate table with all of your records in the original table listed only once.

Just an idea.

garfunkel41

AccessAddict · Aug 20, 2003

Hi

Many thanks for the replies and pointing me in the right direction. I would never have thought of using an array

Cheers

Alan

S01paul · Aug 20, 2003

Hi AccessAddict:

I am doing a similar function as this and was wondering
if you got this to work. Did you use the code or the finddup >count 0 method? I tried the >count 0 method, but
could not get it to work.

garfunkel41 · Aug 21, 2003

Okay, well, I realize that you didn't go this way, AccessAddict, but just in case others were interested in what I was talking about . . .

Say I have a table called Table1 that looks like this:

Id Last Name First Name
1 Johnson Harrold
5 Arnold Benedict
3 Martin Dan
4 Smith John
1 Johnson Harrold

Id 1 is listed twice, which I don't want.

So, I create a query with the SQL statement:

SELECT First(Table1.Id) AS [Id Field], Table1.[Last Name], Table1.[First Name] INTO Table2
FROM Table1
GROUP BY Table1.Id, Table1.[Last Name], Table1.[First Name]
HAVING (((Count(Table1.Id))>0));

which creates a new table (Table2):
Id Field Last Name First Name
1 Johnson Harrold
3 Martin Dan
4 Smith John
5 Arnold Benedict

that has all the records listed once.

I only had it look for duplicates in the id field since the other information is all the same and I want the Id field to be my primary key, but if you wanted a multiple field primary key, then just change the Having statement to something like this:

HAVING (((Count(Table1.Id))>0) AND ((Count(Table1.[Last Name]))>0));

"I think there is a world market for maybe five computers." -Thomas Watson, chairman of IBM

pweegar · Aug 21, 2003

Also take a look at the Access help file and do a search for duplicate record query. It gives a couple of ways to deal with duplicates and deleting them.

AccessAddict · Aug 21, 2003

Hi All

I've been trying to use the code supplied by Rolliee (in my case it doesn't matter whether it's the earliest or latest record thats retained) but up till now haven't managed to get it to work (type mismatch errors etc particularly the line okay = okay & (rs.fields(i) = ar(i))) although the array idea seems like the way to do it?

I'm still working on a solution and if I come up with something workable I'll gladly post it here, or will be very pleased to see anyone else's solution to what seems a very basic problem <hmmph>

Cheers

Alan

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Deleting all but one of duplicate records

AccessAddict

Technical User

Rolliee

Programmer

Rolliee

Programmer

garfunkel41

Technical User

AccessAddict

Technical User

S01paul

Programmer

garfunkel41

Technical User

pweegar

Programmer

AccessAddict

Technical User

Similar threads

Part and Inventory Search

Sponsor