Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to find duplicate records in a table?

Status
Not open for further replies.

VBUser77

MIS
Jan 19, 2005
95
0
0
US
I like to find duplicate records in an access table and also want to delete the duplicate records once I find them. How could I do it in Access Query or Access VBA?

Thanks in advance for your help.

Jay
 
As a starting point you may follow the find duplicates query wizard.

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Thanks, That wizard helped. Once I know what my duplicate records are, how could I go about deleteing them?

Thanks!
 
I tried something new the other day and the simplisity of the technique surprised me. The idea just sort of popped into my head. Not sure how you might be able to 'automate' this though. Here's what I did:
1. Create a copy of the table using copy/paste.
2. Name the copied table something like 'tblNew1' where the original table name would be 'tblNew'.
3. Delete all the records from tblNew1
4. Create a primary key in tblNew1 that would not allow any duplicate records.
5. Use copy/paste to append data from the original tblNew into tblNew1.
You will get an error message stating x number of records were lost due to key violations. Those were your duplicate records! They are now gone. Confirm no valuable records were lost during this process (unlikely) before deleting the original tblNew and renaiming tblNew1 to the original table name.
I'm sure this could all be done using code, in which case this process could be automated. Still, it beats the heck out of running a 'find duplicates' query and deleting the duplicate records one by one.
I am assuming the records are you are working with are being imported from another source and are not being erroneously created by users on your live database. Otherwise, you could prevent users from creating duplicate records to begin with, through the proper application of a primary key.

Hope this helps.

Dry :)
 
create a query using the find duplicates query wizard. create it that way, that all fields of the originating table (and so all duplicate records) are listed.
if you are checking all fields for duplicates, you will need an additional "dummy" field in your table, that you have to exclude from the "duplicate-value-fields" and to add in the third step of the wizard ("additional query fields").
please note, that the query generated by the wizard will not list records, where there is any empty field in the "duplicate-value-fields" section. to also list these records, you have to modify the query as follows (example of an appropriate find duplicates query on a table with 8 fields plus one "Dummy" field as described above). the modifications to include records with empty fields are in [red]red[/red]:

Code:
SELECT Table1.FieldA, Table1.FieldB, Table1.FieldC, Table1.FieldD, Table1.FieldE, Table1.FieldF, Table1.FieldG, Table1.FieldH, Table1.Dummy
FROM Table1
WHERE (((Table1.FieldA) In (SELECT [FieldA] FROM [Table1] As Tmp GROUP BY [FieldA],[FieldB],[FieldC],[FieldD],[FieldE],[FieldF],[FieldG],[FieldH] 
HAVING Count(*)>1  And [red]([/red][FieldB] = [Table1].[FieldB] [red]or [FieldB] IS NULL)[/red] And [red]([/red][FieldC] = [Table1].[FieldC] [red]or[/red] [FieldC] IS NULL) And [red]([/red][FieldD] = [Table1].[FieldD] [red]or [FieldD] IS NULL)[/red] And [red]([/red][FieldE] = [Table1].[FieldE] [red]or [FieldE] IS NULL)[/red] And [red]([/red][FieldF] = [Table1].[FieldF] [red]or [FieldF] IS NULL)[/red] And [red]([/red][FieldG] = [Table1].[FieldG] [red]or [FieldG] IS NULL)[/red] And [red]([/red][FieldH] = [Table1].[FieldH] [red]or [FieldH] IS NULL)[/red])))
ORDER BY Table1.FieldA, Table1.FieldB, Table1.FieldC, Table1.FieldD, Table1.FieldE, Table1.FieldF, Table1.FieldG, Table1.FieldH;

name the query "qry_Duplicates".

run the following function to delete the duplicates:

Code:
Function DeleteDuplicates()

Dim db As DAO.Database
Dim rs As DAO.Recordset
Set db = Application.CurrentDb
Set rs = db.OpenRecordset("qry_Duplicates")


Do Until rs.EOF
rs.MoveFirst
rs.delete
rs.Close
Set rs = Nothing
Set rs = db.OpenRecordset("qry_Duplicates")
Loop


End Function

HTH,
fly

[blue]Typos, that don't affect the functionality of code, will not be corrected.[/blue]

Martin Serra Jr.
[blue]Database_Systems and _Applications shared across all Business_Areas[/blue]
 
flyover789,
I haven't tested your code but it appears to me this code would delete every single record returned by the query. I may be wrong but I thought VBUser77's intention was to preserve the records and only delete one of each duplicate record?
Example:
Record 1
Record 1 **Delete**
Record 2
Record 2 **Delete**

Dry :)
 
through closing and reopening the recordset after each delete,the function deletes only the duplicate records.

Greetings,
fly

[blue]Typos, that don't affect the functionality of code, will not be corrected.[/blue]

Martin Serra Jr.
[blue]Database_Systems and _Applications shared across all Business_Areas[/blue]
 
PHV you are the bomb! But I have another question...if I want to delete duplicates based on matches of two fields, would I just add that field to the SQL as such?:

DELETE [DF]And[DF2]
FROM [TN]
WHERE [ID] Not In (SELECT Max([ID]) FROM [TN] GROUP BY [DF] and [DF2]);

 
duplicates based on matches of two fields
DELETE FROM yourTable
WHERE ID Not In (SELECT Max(ID) FROM yourTable GROUP BY DupField1, DupField2)

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Hi again, for some reason Access is freezing up every time I try to view the datasheet or run the query. Have I done something wrong?

DELETE [Vendor Number]
FROM Vendor_ListingDupeTest
WHERE (((Vendor_ListingDupeTest.ID) Not In (SELECT Max([ID]) FROM [Vendor_ListingDupeTest] GROUP BY [Vendor Number],[Vendor Name],[Address 1])));

Thanks!!!

Dawn

 
Is Vendor_ListingDupeTest a query or a table ?

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top