Tek-Tips is the largest IT community on the Internet today!

deletion of a duplicating substring 1

Status
Not open for further replies.

biry

Technical User
Nov 5, 2004
127
0
0
CA
hi,

I have a string containing HTML code. Throughout the string, in no particular tag order, there are duplicated HTML lines, and I would like to eliminate the second copy of each: find the places where a line repeats right after itself and delete the repeat. The timestamp in these lines is different each time, so I'm not sure whether wildcards can be used. Either way, the duplicated line is always identical to the one before it; I just don't know where in the string the duplicates occur or what the timestamp in each line will be, since it is always changing.

Basically, I would like the second of the two lines to be deleted throughout the string. Is this possible?


<td valign="top" nowrap="nowrap" align="left">2005-02-01</td>
<td valign="top" nowrap="nowrap" align="left">2005-02-01</td>
 
Are you always dealing with data in a table like you presented? Is the duplicate always immediately after the original?

If so, you could use Split() on the data and then loop through the array. Each time an element matches the previous element, remove it from the array. Once done, use Join() on the array to get your string back.

You might need to do a little extra coding for the beginning and end - they may not split quite as easily.
 
Can you help me get started on this? My VB skills are weak, as I'm just learning. How would I use Split on the data and loop through the array?

thanks

The answer to both your questions is yes.
 
Here's a function that should do what you want. Just pass your string to this function and it should remove the duplicates and pass it back.

Code:
Private Function removedups(instring As String) As String

Dim holder() As String 'just a temp array to hold each line of text
Dim result As String
Dim i As Long

holder = Split(instring, vbCrLf) 'breaks your string into individual lines
If UBound(holder) < 0 Then Exit Function 'empty input: nothing to do, returns ""

result = holder(0) 'the first line is always kept
For i = 1 To UBound(holder)
    'keep a line only if it differs from the line right before it;
    'this also collapses runs of three or more identical lines
    If holder(i) <> holder(i - 1) Then
        result = result & vbCrLf & holder(i)
    End If
Next i

removedups = result

End Function

Good luck,
Ryan
 
Thanks! That was a bigger help than expected. I will use it and credit Motor11 as the developer. Thanks again!
 
Long-time members of this forum just know that there's going to be a RegExp solution to this, so here is the RegExp equivalent to Motor11's solution:

Code:
Private Function removedups(instring As String) As String
    With CreateObject("vbscript.regexp")
        .Global = True
        .MultiLine = True
        .Pattern = "(^[\s\S]+$)\1"
        removedups = .Replace(instring, "$1")
    End With
End Function
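
For comparison, the Split / loop / Join idea suggested earlier in the thread could also be written so that the array is compacted in place and rebuilt with Join(). This is only a sketch under assumptions: the name RemoveAdjacentDups is made up for illustration, the input is assumed to use vbCrLf line breaks, and lines are compared exactly (no trimming of spaces).

```vb
' Sketch only: RemoveAdjacentDups is a hypothetical name, and the input
' is assumed to use vbCrLf line breaks.
Public Function RemoveAdjacentDups(ByVal inString As String) As String
    Dim lines() As String
    Dim i As Long
    Dim kept As Long            ' index of the last line kept so far

    If Len(inString) = 0 Then Exit Function   ' nothing to do, returns ""

    lines = Split(inString, vbCrLf)
    kept = 0                    ' the first line is always kept
    For i = 1 To UBound(lines)
        ' Compare against the last KEPT line so that a run of three or
        ' more identical lines collapses to a single copy.
        If lines(i) <> lines(kept) Then
            kept = kept + 1
            lines(kept) = lines(i)
        End If
    Next i

    ReDim Preserve lines(0 To kept)   ' drop the unused tail of the array
    RemoveAdjacentDups = Join(lines, vbCrLf)
End Function
```

Passing it the two identical <td> lines from the original question, joined by vbCrLf, should give back a single copy of the line. Unlike the string-concatenation version, it does not add a trailing line break and it leaves blank lines that were not duplicates alone.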

 