Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete Duplicates - Leaving The Latest Record

Status
Not open for further replies.

PWD

Technical User
Jul 12, 2002
823
GB
Good afternoon. I have inherited some code that identifies duplicate records but, at the moment, we have to delete the unwanted lines manually. I'd like to modify this code so that it deletes duplicates, leaving the latest records in place.

Code:
SELECT MPI.NHSNO, MPI.ORG, MPI.DOB, MPI.SEX, MPI.ETHNIC, MPI.VETERAN, MPI.GPCD, MPI.POSTCODE, MPI.Date
FROM MPI

WHERE (((MPI.NHSNO) In (SELECT [NHSNO] FROM [MPI] As Tmp GROUP BY [NHSNO],[ORG] HAVING Count(*)>1  And [ORG] = [MPI].[ORG])))

ORDER BY MPI.NHSNO, MPI.ORG, MPI.Date DESC;

Any ideas please?

Many thanks,
D€$
 
Is the 'latest record' indentified by MPI.Date field?
And what is the PK field in MPI table?

Have fun.

---- Andy

A bus station is where a bus stops. A train station is where a train stops. On my desk, I have a work station.
 
Hi Andy.

1) Yes, the 'latest' is the newest date in MPI.Date.

2) This table doesn't have a Primary Key.

I have just tried appending the MPI table into and identically structured table that does have NHSNO as the PK but got an error message:

Microsoft Access can't append all the records in the append query. .... and it didn't add 408 record(s) to the table due to key violations......

Many thanks,
D€$
 
I think I may be able to do it in two stages:

Code:
SELECT MPI.NHSNO, MPI.DOB, MPI.SEX, MPI.ETHNIC, MPI.VETERAN, MPI.GPCD, MPI.POSTCODE, MPI.ORG, Max(MPI.Date) AS MaxOfDate INTO [New MPI]
FROM MPI
GROUP BY MPI.NHSNO, MPI.DOB, MPI.SEX, MPI.ETHNIC, MPI.VETERAN, MPI.GPCD, MPI.POSTCODE, MPI.ORG;

And

Code:
SELECT [New MPI].NHSNO, [New MPI].DOB, [New MPI].SEX, [New MPI].ETHNIC, [New MPI].VETERAN, [New MPI].GPCD, [New MPI].POSTCODE, [New MPI].ORG, [New MPI].MaxOfDate
FROM [New MPI]
WHERE ((([New MPI].NHSNO) In (SELECT [NHSNO] FROM [New MPI] As Tmp GROUP BY [NHSNO],[ORG] HAVING Count(*)>1  And [ORG] = [New MPI].[ORG])))
ORDER BY [New MPI].NHSNO, [New MPI].MaxOfDate DESC;

This just leave a handful that have to be deleted manually.

Not ideal, but the best I could do today.

Many thanks,
D€$
 
Table with no PK, that’s the problem in my opinion.
Could you introduce a (ID) field with AutoNumber so you can have a PK on that table? If so, you could ask for all records to keep:
[tt][blue]
Select Max(Date), ID
From MyTable
Group By ID[/blue]
[/tt]
so you can Delete all that you don’t want:
[tt]
Delete From MyTable
Where ID [red]NOT[/red] IN
([blue]Select Max(Date), ID
From MyTable
Group By ID[/blue])
[/tt]

Have fun.

---- Andy

A bus station is where a bus stops. A train station is where a train stops. On my desk, I have a work station.
 
Hi Andy, I have managed to renamne "New MPI" as "MPI" with a PK so I think I should be able to get out of jail today. I thought I understood the idea behind adding this Autonumber field but now I've added it I can't see how that will help as each duplicate date will have a different ID number. It is still early in the day though! :)

Many thanks,
D€$
 
I thought, "S*d it! I'm just going to copy this out into Excel and run a macro!"

Code:
Sub Delete_Duplicates()
'
'
Dim EndRow As Long

'use cells.rows.count rather than a hard row count value
EndRow = Cells(Cells.Rows.Count, 1).End(xlUp).Row

    For x = 3 To EndRow

        If Cells(x, 1) = Cells(x + 1, 1) And Cells(x + 1, 1) <> "" Then
        Rows(x + 1).EntireRow.Delete
        x = x - 1
        EndRow = EndRow - 1
        End If

    Next x
        
End Sub

Many thanks,
D€$
 
I just accomplished this yesterday in a table called tblHardware by temporarily using a table called tblHardwareNoDups. In Access VBA I did an insert into a new table, thereby defining it's column to match the original. Cleared that table and added a primary key to the field I wanted to eliminate duplicates on, inserted into it from the original table again which kept only the first instance, then cleared and wrote back to the original table, then deleted the temporary table. Feel free to use this if you'd like. I left comments in and it starts at 1.3 cause there's other stuff going on.

Code:
        '1.3 Run insert into into tblHardwareNoDups to set columns. This creates tblHardwareNoDups
        Set qd = db.CreateQueryDef("", "SELECT * INTO tblHardwareNoDups" _
                                    & " from tblHardware;")
        qd.Execute
        
        '1.4 Now that its columns are set, clear tblHardwareNoDups, set primary key, select non-nulls into it
        Set qd = db.CreateQueryDef("", "DELETE * FROM tblHardwareNoDups;")
        qd.Execute

        Set qd = db.CreateQueryDef("", "ALTER TABLE tblHardwareNoDups ADD PRIMARY KEY([Serial Number]) ;")
        qd.Execute

        Set qd = db.CreateQueryDef("", "INSERT INTO tblHardwareNoDups " _
                                     & "SELECT * from tblHardware WHERE tblHardware.[Serial Number] IS NOT NULL " _
                                     & "AND tblHardware.[Serial Number] NOT LIKE """";")
        qd.Execute
        
        '1.4 Delete from tblHardware
        Set qd = db.CreateQueryDef("", "DELETE tblHardware.* " _
        & "FROM tblHardware;")
        qd.Execute
        
        '1.5 Append all rows from tblHardwareNoDups to tblHardware
        Set qd = db.CreateQueryDef("", "INSERT INTO tblHardware SELECT tblHardwareNoDups.*" _
        & "FROM tblHardwareNoDups;")
        qd.Execute
        
        '1.6 Delete tblHardwareNoDups
        Set db = CurrentDb 'This line refreshes the list of DB objects, preventing error
        db.TableDefs.Delete "tblHardwareNoDups"
 
You may eliminate 1.4 if you add to 1.3:

Code:
[green]'1.3 Run insert into into tblHardwareNoDups to set columns. This creates EMPTY  tblHardwareNoDups[/green]
        Set qd = db.CreateQueryDef("", "SELECT * INTO tblHardwareNoDups" _
                                    & " from tblHardware [blue]WHERE 1 = 2[/blue];")
        qd.Execute

Have fun.

---- Andy

A bus station is where a bus stops. A train station is where a train stops. On my desk, I have a work station.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top