Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Deleting all but one of duplicate records

Status
Not open for further replies.

AccessAddict

Technical User
Apr 30, 2003
32
0
0
GB
Hi

I've just looked at a db built by a family friend, and discovered many duplicate entries in a table (amongst other things lol). I've created a Find Duplicate query and then a delete query based on the Find Duplicate query, but that will obviously delete ALL occurences of the duplicate data when I need to retain just ONE instance of the record. The table contains a lot of records so going through it manually would be quite a task.

If the Find Duplicate query finds eg ID no's 34, 451, 934 as records containing the duplicate data, how can I delete ONLY record numbers 34 and 451 leaving 934 intact?

The queries at the moment are
Code:
SELECT tblMailShot.Surname, tblMailShot.HouseNo, tblMailShot.Postcode, tblMailShot.MailShotID 
FROM tblMailShot 
WHERE (((tblMailShot.Surname) In (SELECT [Surname] FROM [tblMailShot] As Tmp GROUP BY [Surname],[HouseNo],[Postcode] HAVING Count(*)>1 And [HouseNo] = [tblMailShot].[HouseNo] And [Postcode] = [tblMailShot].[Postcode]))) 
ORDER BY tblMailShot.Surname, tblMailShot.HouseNo, tblMailShot.Postcode;

And
Code:
DELETE qryFindDupForMailShot.MailShotID 
FROM qryFindDupForMailShot;


Of course, fixing the problem as to how the duplicates got there in the first place is another story - or headache lol.

Thanks for your time

Cheers

Alan
 
Just open a recordset and go thru it and delete the dupes. Do you wish the earliest, the laterest or does not matter?

This code would do it.


private sub command0_click()
Dim rs as dao.recordset, ar(10) as variant, i as integer
DIM okay as boolean

set rs = currentdb.openrecordset("SELECT * From My Table ORDER BY [LNmae],[FName],[Addr];")

rs.movelast
rs movefirst
okay = true
for i = 0 to 9
ar(i) = rs.fields(i)
next
rs.movenext
do while not rs.eof
for i = 0 to 9
okay = okay & (rs.fields(i) = ar(i))
next i
if okay rs.delete
for i = 0 to 9
ar(i) = rs.fields(i)
next i
rs.movenext
loop
rs.close
set rs = nothing
exit sub


 
Just open a recordset and go thru it and delete the dupes. Do you wish the earliest, the laterest or does not matter?

This code would do it.


private sub command0_click()
Dim rs as dao.recordset, ar(10) as variant, i as integer
DIM okay as boolean

set rs = currentdb.openrecordset("SELECT * From My Table ORDER BY [LNmae],[FName],[Addr];")

rs.movelast
rs movefirst
okay = true
for i = 0 to 9
ar(i) = rs.fields(i)
next
rs.movenext
do while not rs.eof
okay = true
for i = 0 to 9
okay = okay & (rs.fields(i) = ar(i))
next i
if okay rs.delete
for i = 0 to 9
ar(i) = rs.fields(i)
next i
rs.movenext
loop
rs.close
set rs = nothing
exit sub


 
Hi Alan,

This seems like a really non-programming and non-thinking answer, but why don't you just make the find duplicates query into a make table query and change the criteria for counting the duplicate fields to having count > 0?

This would create a duplicate table with all of your records in the original table listed only once.

Just an idea.

garfunkel41
 
Hi

Many thanks for the replies and pointing me in the right direction. I would never have thought of using an array

Cheers

Alan
 
Hi AccessAddict:

I am doing a similar function as this and was wondering
if you got this to work. Did you use the code or the finddup >count 0 method? I tried the >count 0 method, but
could not get it to work.
 
Okay, well, I realize that you didn't go this way, AccessAddict, but just in case others were interested in what I was talking about . . .

Say I have a table called Table1 that looks like this:

Id Last Name First Name
1 Johnson Harrold
5 Arnold Benedict
3 Martin Dan
4 Smith John
1 Johnson Harrold

Id 1 is listed twice, which I don't want.

So, I create a query with the SQL statement:

SELECT First(Table1.Id) AS [Id Field], Table1.[Last Name], Table1.[First Name] INTO Table2
FROM Table1
GROUP BY Table1.Id, Table1.[Last Name], Table1.[First Name]
HAVING (((Count(Table1.Id))>0));

which creates a new table (Table2):
Id Field Last Name First Name
1 Johnson Harrold
3 Martin Dan
4 Smith John
5 Arnold Benedict

that has all the records listed once.

I only had it look for duplicates in the id field since the other information is all the same and I want the Id field to be my primary key, but if you wanted a multiple field primary key, then just change the Having statement to something like this:

HAVING (((Count(Table1.Id))>0) AND ((Count(Table1.[Last Name]))>0));


"I think there is a world market for maybe five computers." -Thomas Watson, chairman of IBM
 
Also take a look at the Access help file and do a search for duplicate record query. It gives a couple of ways to deal with duplicates and deleting them.
 
Hi All

I've been trying to use the code supplied by Rolliee (in my case it doesn't matter whether it's the earliest or latest record thats retained) but up till now haven't managed to get it to work (type mismatch errors etc particularly the line okay = okay & (rs.fields(i) = ar(i))) although the array idea seems like the way to do it?

I'm still working on a solution and if I come up with something workable I'll gladly post it here, or will be very pleased to see anyone else's solution to what seems a very basic problem <hmmph>

Cheers

Alan
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top