Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

VBA Delete Dupes ---SLOW

Status
Not open for further replies.

OMG_VBA_IS_GREAT

Technical User
Dec 1, 2017
16
US
Hi All,
The below code finds dupes and deletes the dupe(s) with the lowest number located in column "BL". The code works great for small set of data (8k rows). I am trying to use the code for over a 500,000 cells takes over three hours to run. Any help would be greatly appreciated.

Sub DeleteDups()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False

Dim sh As Worksheet, lr As Long
Sheets("A").Select
Set sh = Sheets("A")
lr = Cells.Find(What:="*", After:=[A1], SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row


ColLoc = "F"
y = "BL"


For i = lr To 2 Step -1
With sh
Set fnRng = .Range(ColLoc & 2, .Cells(i - 1, y)).Find(.Cells(i, y).Value, , xlValues, xlWhole)
If Not fnRng Is Nothing Then
If fnRng.Offset(0, 58) > .Cells(i, y).Offset(0, 58) Then
.Rows(i).Delete
ElseIf fnRng.Offset(0, 58) < .Cells(i, y).Offset(0, 58) Then
fnRng.EntireRow.Delete
End If
End If
End With
Next
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
Application.EnableEvents = True
End Sub
 
Hi,

SORT your table on column BL.

Create a helper column with the formula, assuming that your helper column is in ZZ...
[tt]
=IF(BL2=BL1,ZZ1+1,1)
[/tt]

Filter your table on the helper column on any value except 1.

Delete each visible row other than the heading row.

Remove the filter.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Thanks Skip. I sorted column F and BL. Column F contains my IDs (which can be many from 2 to 15 duplicates) and column BL contains row number using (ROW() function.) The goal of the macro is to only keep only one unique ID with the largest row number. Unfortunately, the macro above can not handle large data set of 500,000 rows. It takes almost 2 hours to run.
 
The goal of the macro is to only keep only one unique ID with the largest row number.

Why the largest?

Why not keep the FIRST? Again using Auto Filter hide 1. Delete the rest!

Two steps! Just a few seconds!

Why not???

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
BTW that peocess can be automated. Just turn on your macro recorder and recordthose steps.

Post back with your recorded code to customize selecting the visible rows.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Hi Skip,

Thanks for your help. Sorry for not being clear. I get a daily download of IDs and other details. I need to take the most recent ID. The IDs are incremented with the latest version is in the largest row number. What you provided takes the older version of the ID instead the latest version. There is no version number in the file I receive. Hoping you can help.
 
=Version=COUNTA(ID)

When the count of the ID equals the version, the expression will be TRUE.

Delete all FALSE rows.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Hi Skip,

I applied what you suggested and it worked. Thank you for your help! Simpler is sometimes better.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top