Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

scan for repeating values and then delete 1

Status
Not open for further replies.

peach255

Programmer
Jan 20, 2003
29
US
I would like to loop through Column B, and if I find the same repeating value, then I would like to delete the rows and keep only one. I would always have 4 replicates, but would only like to keep one. For ex., Loop through Column B, and if found "A2" multiple times, then delete 3 out of the 4 rows. The values would vary.

I have the following so far, and don't know how to check to see if the values are repeating.

For lRow = Range(Cells(1, 2), Cells(Cells.Rows.Count, 2).End(xlUp)).Rows.Count To 1 Step -1
......
Next

Thank you!
 
Assuming you have your data sorted on column B, then this macro will delete all of the duplicates (based on Column B).
[blue]
Code:
Sub RemoveDups()
Dim nRow As Long
  Application.ScreenUpdating = False
  With Intersect(ActiveSheet.UsedRange, Range("B:B"))
    For nRow = .Rows.Count To 2 Step -1
      If .Cells(nRow, 1).Value = .Cells(nRow - 1).Value Then
        .Cells(nRow, 1).EntireRow.Delete
      End If
    Next nRow
  End With
  Application.ScreenUpdating = True
End Sub
[/color]

 
Thanks for your reply. Is there a way I can check this w/out sorting Column B first? The reason I do not want to sort it b/c there are some rows that I do not want to sort.

Thank you so much!
 
Yes, but that would make the macro more complex. Much simpler to add a temporary column to your data area and fill it with row numbers. Then after sorting and running the macro, you can sort on that column to put the sheet back in the same order that it was, and then finally delete the temporary column.

Is this a one-time thing or something you are going to need repeatedly? If you need to do this often, then I can put together a more complex macro to do it without sorting.

 
This is something that I would need to do often. I thought of another method that I can determine which cells to sort though. On Column A, I would have cells w/ values ranging from 3 to 382 (w/ some skip #s in btwn). Then there would be some miscellaneous rows of other data, and then another range of values from 3 to 382 in Column A. Is there a way that I can have it read Column A first and look for the 1st "3" till the 1st "382" and sort cells next to them in Column B, and then repeat this for the next set of "3" till "382"?
After I sort those, then I can use your previous code to delete the duplicates.
Thank you so much!
 
It would help if you could be a little more precise with specifying how your data looks. I get the impression that it's something like this:
Code:
     A      B     
1    3   Things
2    4   Gadgets
3   10   Things
4   11   Items
5   12   Gadgets
6
7    3   Gadgets
8   10   Things
9   11   Things
10  12   Items
11  13   Things
12
13   4   Gadgets
14   5   Things
15  10   Things
And you want it to look like this when you are done:
Code:
     A      B     
1    3   Things
2    4   Gadgets
3   11   Items
4
5    3   Gadgets
6   10   Things
7   12   Items
8 
9    4   Gadgets
10   5   Things
Although I could imagine a couple of different interpretions of your post. Is this something like what you want?

If I were going by your original post: ...if I find the same repeating value, then I would like to delete the rows and keep only one... I would have guessed you wanted something like this:
Code:
     A      B     
1    3   Things
2    4   Gadgets
3   11   Items
What do you really want?
 
Sorry for not being as precise. My data look something like this:

A B
1 3 Gadgets
2 4 Gadgets
3 5 Things
4 6 Things
5 7 Items
6 12 Gadgets
7 13 Gadgets
.....

387 3 Gadgets
388 4 Gadgets
389 5 Things
390 6 Things

and I would like to delete repeating rows with same values in Column B... I would like the following results:

A B
1 3 Gadgets
3 5 Things
5 7 Items

387 3 Gadgets
389 5 Things

Thank you!
 
I assume you meant:
Code:
    A      B
1   3      Gadgets
2   5      Things
3   7      Items
4
5   3      Gadgets
6   5      Things
But how do you define the groups?
A. Is there always a blank line between?
B. Do the numbers in column A always increase within groups?
(If so, what about the pathological case where one group only has 3,5 and 7 and the next group begins with 9?)

C. You said ...there would be some miscellaneous rows of other data... -- Are the cells from column A always blank for the interval rows?

As a programmer, I'm sure you can appreciate the need for precision in the problem statement to be able to develop a fool-proof algorithm.
 
Zathras,

a) Yes, there is always a blank line in btwn the 2 groups.

b) The #s in Column A do not always increase within the groups. But the #s for the 2 groups always start with 3 and end with 382 ... some #s may be missing in btwn.

c) There would be the miscellaneous rows of data, then a blank row, and then the next set of data but w/ the heading first.

A more accurate depiction would be:
A B
Well Sample
1 3 Gadgets
2 4 Gadgets
3 5 Things
4 6 Things
5 7 Items
6 12 Gadgets
7 13 Gadgets
Miscellaenous stuff, blah blah
blah blah

Well Sample
387 3 Gadgets
388 4 Gadgets
389 5 Things
390 6 Things
 
Ok. I guessed as much, so while I was waiting for your post I worked out the following:
[blue]
Code:
Option Explicit
Const COL_TESTGROUP = 1
[green]
Code:
'Column "A"
[/color]
Code:
Const COL_TESTDUPLICATE = 2
[green]
Code:
'Column "B"
[/color]
Code:
Const ROW_FIRSTGROUP = 2
[green]
Code:
'Row 2
[/color]
Code:
Sub DeleteDuplicates()
[green]
Code:
' Deletes duplicate items within each group on the sheet
' A group is a collection of rows that are separated
' by one or more blank cells in column "A"
' Duplicates are determined by examining the
' contents of column "B"
[/color]
Code:
Dim nGroupLastRow As Long
Dim rng As Range

  nGroupLastRow = Cells(ROW_FIRSTGROUP, COL_TESTGROUP).End(xlDown).Row
  Set rng = Range(Cells(ROW_FIRSTGROUP, COL_TESTGROUP), _
                  Cells(nGroupLastRow, COL_TESTGROUP))
  Call RemoveDups(rng.Offset(0, COL_TESTDUPLICATE - COL_TESTGROUP))
  While NextGroup(rng)
      Call RemoveDups(rng.Offset(0, COL_TESTDUPLICATE - COL_TESTGROUP))
  Wend
  
End Sub

Sub RemoveDups(AGroup As Range)
[green]
Code:
' AGroup is a single column range.
' Rows are deleted when the same value appears in multiple rows.
' The last (highest row number) instance is retained.
[/color]
Code:
Dim nRow As Long
Dim r As Range
  If AGroup.Rows.Count > 1 Then
    Application.ScreenUpdating = False
    With AGroup
      For nRow = .Rows.Count To 2 Step -1
        Set r = Range(.Cells(1, 1), .Cells(nRow - 1, 1)).Find(.Cells(nRow).Value, _
                    .Cells(nRow - 1, 1), xlValues, xlWhole, xlByColumns, xlPrevious)
        If Not r Is Nothing Then
          .Cells(nRow).EntireRow.Delete
        End If
      Next nRow
    End With
    Application.ScreenUpdating = True
  End If
End Sub

Private Function NextGroup(ByRef AGroup As Range) As Boolean
[green]
Code:
' Finds the "next" group below a given group
' A "Group" is a single column range.
' The "next" group starts with the first non-blank
' cell following all blank cells below the
' given range.
[/color]
Code:
Dim nFirstRow As Long
Dim nGroupLastRow As Long
  With AGroup
    nFirstRow = .Cells(.Rows.Count, 1).End(xlDown).Row
    If nFirstRow >= 65536 Then
      Set AGroup = Nothing
    Else
      If IsEmpty(.Cells(nFirstRow + 1, 1)) Then
        Set AGroup = .Cells(nFirstRow, 1)
      Else
        nGroupLastRow = .Cells(nFirstRow, 1).End(xlDown).Row
        Set AGroup = Range(.Cells(nFirstRow, 1), _
                           .Cells(nGroupLastRow, 1))
      End If
    End If
  End With
  If AGroup Is Nothing Then
    NextGroup = False
  Else
    If IsEmpty(AGroup.Cells(1, 1)) Then
      NextGroup = False
    Else
      NextGroup = True
    End If
  End If
End Function
[/color]

 
Zathras,
Thank you so much for your help! I really appreciate it. I ran your code on the file, and it did not seem to scan the entire sheet to delete duplicates in Column B. It seemed to only start at Row 2 only and until it reads the 1st blank, and then it does not seem to read the next group after the blank line.
I try to step through the code, but could not figure out where the problem is.
Thank you again!
 
The constant declarations may need to be adjusted for your situation:
[blue]
Code:
Const COL_TESTGROUP = 1
[green]
Code:
'Column "A"
[/color]
Code:
Const COL_TESTDUPLICATE = 2
[green]
Code:
'Column "B"
[/color]
Code:
Const ROW_FIRSTGROUP = 2
[green]
Code:
'Row 2
[/color]
Code:
[/color]

The third one in particular tells the macro to start on row 2. I did it that way since your data indicated column headings in row 1. ("Well" and "Sample")

It works fine with my test data. Please provide a precise set of test data that you used for which this isn't working so I can see what needs to be changed.

 
Zathras,
I retested it again on the sheet and adjusted the data around, and it did scan the next group! The problem was that there are a couple of headings before the 1st group of data w/ blank lines in btwn, so I believe I have to have a Loop to do the scanning and deleting until there are empty cells. There are a couple of blank lines in the heading before the 1st group of data. Where can I add the Looping for it to scan the whole sheet?

The data looked something like this w/ the heading at first:
1 Filename
2 Bloop
3
4 Bloop
5 Bloop
6
7 Well Sample
8 1 A2
9 2 A2
10 3 A3

Thank you very much!
 
Ah. I remember as I wrote it I saw that there would be a problem if you had a single line group as the first group. But I was hoping that the data really looked like what you had posted. That's what I get for taking a short cut.

Give me some time. I'll post a correction tonight. In the meantime, the workaround is to be sure that the FIRST group has at least 2 lines. (Single line groups further down are handled correctly.)
 
Ok, here is the corrected code. Sorry for the confusion.
[blue]
Code:
Option Explicit
Const COL_TESTGROUP = 1
[green]
Code:
'Column "A"
[/color]
Code:
Const COL_TESTDUPLICATE = 2
[green]
Code:
'Column "B"
[/color]
Code:
Const ROW_FIRSTGROUP = 2
[green]
Code:
'Row 2
[/color]
Code:
Sub DeleteDuplicates()
[green]
Code:
' Deletes duplicate items within each group on the sheet
' A group is a collection of rows that are separated
' by one or more blank cells in column "A"
' Duplicates are determined by examining the
' contents of column "B"
[/color]
Code:
Dim nGroupLastRow As Long
Dim rng As Range

  Set rng = Range(Cells(ROW_FIRSTGROUP, COL_TESTGROUP), _
                  Cells(ROW_FIRSTGROUP, COL_TESTGROUP))
  If Not IsEmpty(rng.Offset(1, 0)) Then
    nGroupLastRow = rng.Cells(1, 1).End(xlDown).Row
    Set rng = Range(Cells(ROW_FIRSTGROUP, COL_TESTGROUP), _
                    Cells(nGroupLastRow, COL_TESTGROUP))
  End If
  MsgBox rng.Address
  Call RemoveDups(rng.Offset(0, COL_TESTDUPLICATE - COL_TESTGROUP))
  While NextGroup(rng)
      MsgBox rng.Address
      Call RemoveDups(rng.Offset(0, COL_TESTDUPLICATE - COL_TESTGROUP))
  Wend
  
End Sub

Sub RemoveDups(AGroup As Range)
[green]
Code:
' AGroup is a single column range.
' Rows are deleted when the same value appears in multiple rows.
' The last (highest row number) instance is retained.
[/color]
Code:
Dim nRow As Long
Dim r As Range
  If AGroup.Rows.Count > 1 Then
    Application.ScreenUpdating = False
    With AGroup
      For nRow = .Rows.Count To 2 Step -1
        Set r = Range(.Cells(1, 1), .Cells(nRow - 1, 1)).Find(.Cells(nRow).Value, _
                    .Cells(nRow - 1, 1), xlValues, xlWhole, xlByColumns, xlPrevious)
        If Not r Is Nothing Then
          .Cells(nRow).EntireRow.Delete
        End If
      Next nRow
    End With
    Application.ScreenUpdating = True
  End If
End Sub

Private Function NextGroup(ByRef AGroup As Range) As Boolean
[green]
Code:
' Finds the "next" group below a given group
' A "Group" is a single column range.
' The "next" group starts with the first non-blank
' cell following all blank cells below the
' given range.
[/color]
Code:
Dim nFirstRow As Long
Dim nGroupLastRow As Long
  With AGroup
    nFirstRow = .Cells(.Rows.Count, 1).End(xlDown).Row
  End With
  If nFirstRow >= 65536 Then
    Set AGroup = Nothing
  Else
    If IsEmpty(Cells(nFirstRow + 1, 1)) Then
      Set AGroup = Cells(nFirstRow, 1)
    Else
      nGroupLastRow = Cells(nFirstRow, 1).End(xlDown).Row
      Set AGroup = Range(Cells(nFirstRow, 1), _
                         Cells(nGroupLastRow, 1))
    End If
  End If
  If AGroup Is Nothing Then
    NextGroup = False
  Else
    If IsEmpty(AGroup.Cells(1, 1)) Then
      NextGroup = False
    Else
      NextGroup = True
    End If
  End If
End Function
[/color]

 
Oops. Please remove the two MsgBox lines from the main macro. They are there only for debugging. Thanks.
 
Thanks so much Zathras! It works great!!! Thank you so much again! I really appreciate it!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top