Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Excel Macro to find and delete duplicate records 1

Status
Not open for further replies.

gramaw

Technical User
Aug 14, 2002
9
US
Can somebody help me to write a macro for excel 2000 that will search a column of choice for duplicate records, then delete all duplicates it finds, including the original, and the corresponding information in their rows. For example, if the records in a row were a, b, c, c, c, d, e, once the macro was finished the only records left would be a, b, d, and e. it will need to delete all instances of c, and the corresponding information the row.

Ryan
 
Ryan,

Try the following procedure:

Code:
Sub DeleteDupes()
Dim Col As Long
Dim FirstRow As Long
Dim LastUsedRow As Long
Dim CurrentRow As Long
Dim SearchRange As Range
Dim FRng As Range
Dim OneCell As Range
Dim SearchString As String
Dim DupeFound As Boolean
Dim msg As String

  If Selection.Cells.Count > 1 Then
    msg = "Error: Multiple cells are selected."
    msg = msg & vbCrLf & vbCrLf & "Select a single cell only.  "
    msg = msg & "This cell should represent the starting row of data to operate on, in the column of interest."
    MsgBox msg, vbExclamation + vbOKOnly, "Delete Duplicated Items"
    Exit Sub
  End If
  Col = ActiveCell.Column
  FirstRow = ActiveCell.Row
  LastUsedRow = Cells(65536, Col).End(xlUp).Row
  If (LastUsedRow = 1) And (IsEmpty(Cells(1, Col).Value)) Then
    msg = "No items exist in this column; nothing deleted."
    MsgBox msg, vbInformation + vbOKOnly, "Delete Duplicated Items"
    Exit Sub
  ElseIf LastUsedRow = FirstRow Then
    msg = "Only a single item found; nothing deleted."
    MsgBox msg, vbInformation + vbOKOnly, "Delete Duplicated Items"
    Exit Sub
  ElseIf (LastUsedRow < FirstRow) Then
    msg = &quot;No items exist beyond the selected cell; nothing deleted.&quot;
    MsgBox msg, vbInformation + vbOKOnly, &quot;Delete Duplicated Items&quot;
    Exit Sub
  End If
  
  Application.ScreenUpdating = False
  With ActiveSheet
    CurrentRow = FirstRow
    Set SearchRange = .Range(.Cells(FirstRow, Col), .Cells(LastUsedRow, Col))
    Do
      Set OneCell = .Cells(CurrentRow, Col)
      DupeFound = False
      SearchString = OneCell.Text
      Set FRng = SearchRange.Find(what:=SearchString, LookIn:=xlValues, LookAt:=xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False)
      If Not FRng Is Nothing Then
        If FRng.Address <> OneCell.Address Then
          DupeFound = True
          Do
            FRng.EntireRow.Delete
            Set FRng = SearchRange.FindNext()
            If FRng Is Nothing Then Exit Do
          Loop While FRng.Row <> SearchRange.Row
        End If
      End If
      If Not DupeFound Then
        CurrentRow = CurrentRow + 1
      End If
    Loop Until CurrentRow > SearchRange.End(xlDown).Row
  End With
  Application.ScreenUpdating = True

  msg = &quot;Duplicated items removed.&quot;
  MsgBox msg, vbInformation + vbOKOnly, &quot;Delete Duplicated Items&quot;

End Sub

To use this, select the cell at the beginning of the data you want to eliminate duplicated items from. Then run the procedure. It will determine the end of the data, find all duplicated items and delete the entire row where they are found. There is also some initial testing to ensure the proper cell selection is made. I don't know how many rows of data you will typically operate on, but if this is large, it could take a while for this procedure to run. On my 1.2GHz pc, running this on 2000 rows of data takes approx. 3.7 seconds.

If you would like a demo workbook incorporating this, post your email address.

Regards,
M. Smith
 
I have done a little debugging with this, it works when I put values of a,b,c,c,c,d,e,e,e,f in the first column and enter column 1 as my search column. If you would like to see the data in a spreadsheet, post your email and I will send the file to you. Hope this does what your are looking for. Uncommenting the Application.ScreenUpdating statements will make the program run faster if you have a lot of data, all the updating will be done when the program ends. Some feedback on whether this works or not and what might be wrong would be nice to see as I spent about an hour doing this.

Sub DelDups()

Dim jRange As Range, Col As Integer

'Application.ScreenUpdating = False
Col = InputBox(&quot;Enter Column Number&quot;, &quot;Column Selector&quot;)
i = 1: k = 0
ActiveSheet.Columns(Col).Select
With Selection
Do
strTemp = ActiveSheet.Cells(i, Col)
Set c = .Find(strTemp)
If Not c Is Nothing Then
firstaddress = c.Address
Do
Set c = .FindNext(c)
If k = 1 Or firstaddress <> c.Address Then
Range(firstaddress).EntireRow.Select
Selection.Delete Shift:=xlUp
If .Find(strTemp) Is Nothing Then i = i - 1: GoTo exitloop
firstaddress = c.Address
k = 1
End If
Loop While Not c Is Nothing And k = 1 Or firstaddress <> c.Address
End If
exitloop:
i = i + 1
k = 0
Loop Until ActiveSheet.Cells(i, 1) = &quot;&quot;
End With

'Application.ScreenUpdating = True
End Sub


Regards,
Dave
 
I was plying with this a little more and found that if you have blanks in the column you are searching an error occured, so here is the fix. Again, any feedback would be nice.

Sub DelDups()

Dim jRange As Range, Col As Integer, rowLast As Integer

'Application.ScreenUpdating = False
ActiveSheet.UsedRange
rowLast = ActiveSheet.UsedRange.Rows.Count

Col = InputBox(&quot;Enter Column Number&quot;, &quot;Column Selector&quot;)
i = 1: k = 0
ActiveSheet.Columns(Col).Select
With Selection
Do
strTemp = ActiveSheet.Cells(i, Col)
If IsEmpty(strTemp) Then GoTo exitloop
Set c = .Find(strTemp)
If Not c Is Nothing Then
firstaddress = c.Address
Do
Set c = .FindNext(c)
If k = 1 Or firstaddress <> c.Address Then
Range(firstaddress).EntireRow.Select
Selection.Delete Shift:=xlUp
If .Find(strTemp) Is Nothing Then i = i - 1: GoTo exitloop
firstaddress = c.Address
k = 1
End If
Loop While Not c Is Nothing And k = 1 Or firstaddress <> c.Address
End If
exitloop:
i = i + 1
k = 0
Loop Until i > rowLast
End With

'Application.ScreenUpdating = True
End Sub
 
Dave,

I tried your procedure and it seems to get stuck in an infinite loop, if duplicates are present, preventing normal exiting. Breaking out manually does show the routine has completed its work of deleting duplicated entries, however.


Regards,
Mike
 
Mike,
Could you send me the file you are trying to use this on. I have tried this on a very short, simple set of data and the program ends properly. Please send file to dave.wilson@asl-tk.com.

Thanks,

Dave
 
dave and mike,
you guys are awesome. thanks for responding to my post. i haven't checked this in awhile because i didn't get any responses for the first couple of days. however, i will try it out first thing on monday morning. i will be able to send you the file then, as it is on my work computer. thanks a lot guys, sorry about the late response.

ryan
 
dave and mike,
i have made a test file of my own and have been testing both of your macros. since i am sending this to a more non-technical user, i will probably use dave's macro because there is less code, and it lets you enter the column number. however, it turns out i also need this macro to find blanks and delete those too. the other thing is, on my test data i'm using, it is deleting everything on the page when i pick column 2, and it shouldn't be. my email is gramaw@yahoo.com. if you guys want to email me i can send you the test file i'm using. i have both of your macros in the file so make sure you pick the correct one. mikes is called deletedupes and dave's is called deldups.

thanks again guys,

ryan
 
This looks like the code that had enough tweaks to make it do everything you needed. I'm glad you got what you needed.

This code sorts the data, is not case sensitive, trims blanks off the end of cells that have spaces on the end of the text and deletes duplicates including the original.

Code:
Sub SortDel()

Dim isort As Integer, FirstRow As Integer, RowCount As Integer
Dim DupFound As Boolean, cell As Range

DupFound = False
isort = InputBox(&quot;Enter sort column&quot;, &quot;Sort Column&quot;)
i = 0

ActiveSheet.UsedRange
ActiveSheet.UsedRange.Select
Selection.Sort Key1:=ActiveSheet.Columns(isort)

For Each cell In ActiveSheet.Columns(isort).Cells
skipNext:
  FirstRow = cell.Row
  Do
    If Right$(cell, 1) = &quot; &quot; Then cell.Value = RTrim(cell)
    If Right$(cell.Offset(1, 0), 1) = &quot; &quot; Then cell.Offset(1, 0).Value = RTrim(cell.Offset(1, 0))
    If LCase(cell.Offset(1, 0)) = LCase(cell) Then
      RowCount = RowCount + 1
      Set cell = cell.Offset(1, 0)
      If Right$(cell.Offset(1, 0), 1) = &quot; &quot; Then cell.Offset(1, 0).Value = RTrim(cell.Offset(1, 0))
      DupFound = True
    End If
  Loop Until FirstRow + RowCount > ActiveSheet.UsedRange.Columns(isort).Rows.Count Or LCase(cell.Offset(1, 0)) <> LCase(cell)
  If DupFound Then
    Range(Rows(FirstRow), Rows(FirstRow + RowCount)).Select
    Selection.Delete Shift:=xlUp
    DupFound = False
    RowCount = 0
    Cells(FirstRow, isort).Select
    Set cell = ActiveCell
    If FirstRow + RowCount >= ActiveSheet.UsedRange.Columns(isort).Rows.Count Then Exit Sub
    GoTo skipNext
  End If
Next

End Sub
 
when I run this superb code and the file is all duplicates, I get a set of duplicates left, if I run it again they disappear. If there are non duplictes in there too it works fine. Can you help?
But a very very useful bit of code
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top