String filtering

Mengbillar

Programmer
Jan 13, 2003
32
DE
I have a slight speed problem. I have one very long string, and a single character that occurs quite often in it. Now I want to get these characters out. I used this piece of code:

Do Until InStr(1, temp, "#") = 1
temp2 = temp2 + Chr(Val(Mid(temp, 1, InStr(1, temp, "#") - 1)))
temp = Mid(temp, InStr(1, temp, "#") + 1, Len(temp) - InStr(1, temp, "#"))
Loop

In fact, the # separates ASCII codes that together form the string being recovered here: temp2 holds the final string, and temp holds the input made of ASCII codes and separators. The problem is that this code takes ages to run through, even when temp is 'only' 200,000 characters long, which is rather small. Any ideas how to speed this up? Are any faster alternatives available?
 
You may want to check out the InStr function.

InStr Function
Returns the position of the first occurrence of one string within another.

InStr([start, ]string1, string2[, compare])

Arguments
start

Optional. Numeric expression that sets the starting position for each search. If omitted, search begins at the first character position. If start contains Null, an error occurs. The start argument is required if compare is specified.

string1

Required. String expression being searched.

string2

Required. String expression searched for.

compare

Optional. Numeric value indicating the kind of comparison to use when evaluating substrings. See Settings section for values. If omitted, a binary comparison is performed.

The InStr function returns the following values:

If                                 InStr returns
string1 is zero-length             0
string1 is Null                    Null
string2 is zero-length             start
string2 is Null                    Null
string2 is not found               0
string2 is found within string1    Position at which match is found
start > Len(string1)               0
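
Since the code in the question already calls InStr, the part of this reference that probably matters for speed is the start argument: it lets the search continue from the previous match, so the long string never has to be cut down and copied. The following is only a sketch under assumptions, not a tested drop-in replacement. DecodeCodes is a hypothetical name, every code is assumed to fit in a byte (0 to 255), and the loop stops at the first empty field, which mirrors the termination test in the original loop.

```vb
' Hypothetical helper: decodes "65#66#67#" into "ABC".
' Assumes every code is in the range 0..255.
Public Function DecodeCodes(ByVal src As String) As String
    Dim out() As Byte
    Dim startPos As Long, hashPos As Long, n As Long

    If Len(src) = 0 Then Exit Function

    ' Each code needs at least one digit plus "#", so Len \ 2 bytes is enough
    ReDim out(0 To Len(src) \ 2)

    startPos = 1
    hashPos = InStr(startPos, src, "#")

    ' Stop when no "#" is left (0) or an empty field is reached
    Do While hashPos > startPos
        out(n) = CByte(Val(Mid$(src, startPos, hashPos - startPos)))
        n = n + 1
        startPos = hashPos + 1
        hashPos = InStr(startPos, src, "#")   ' resume after the last match
    Loop

    If n > 0 Then
        ReDim Preserve out(0 To n - 1)        ' trim the unused tail
        DecodeCodes = StrConv(out, vbUnicode) ' bytes back to a VB string
    End If
End Function
```

Calling DecodeCodes("65#66#67#") should give "ABC". The source string is only read, never reassigned, so the work per code stays roughly constant instead of growing with the length of the remaining text.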


Transcend
 

Give This A Try...
Option Explicit

Private Sub Command1_Click()

    Dim MyArray() As String, S As String
    Dim ByteArray() As Byte, Cntr As Long, MaxCntr As Long

    S = "65#66#67#"
    ' Split on the separator; the trailing "#" leaves one empty last element
    MyArray = Split(S, "#")
    MaxCntr = UBound(MyArray)
    ' One byte per code, skipping the empty last element
    ReDim ByteArray(MaxCntr - 1)

    For Cntr = 0 To MaxCntr - 1
        ByteArray(Cntr) = Val(MyArray(Cntr))
    Next Cntr

    ' Convert the byte array back into a string in one step
    S = StrConv(ByteArray, vbUnicode)

    MsgBox S

End Sub

and see if it is any faster

Good Luck

 
I think you just need to understand that VB processes large strings slowly, and that the string functions are slow too. Concatenating with ThisString + ThatString is slow as well. There are a lot of unnecessary operations going on there. When I use large strings on my 400 Celeron, I never make them larger than 10-15 thousand characters.

I'm guessing you are getting your string from a file. If these are ASCII characters, you can use a byte array and simple numerical operations to perform the task.

This code omitted 25,000 #s from 200,000 ASCII characters in 0.15 seconds. The algorithm runs faster the more #s there are. You just need to convert the final result back to a string, if necessary, when it's done, and redesign it to process chunks.

Private Type StrChunk
    Bytes(10000) As Byte            ' indices 0 to 10000
End Type

' Module-level buffer so that both OmitChr and GetChunk can use it
Private Chunk As StrChunk

Private Sub OmitChr()
    Dim OmitChar As Byte
    Dim tmpStr(10000) As Byte       ' filtered output
    Dim sDex As Long                ' read index
    Dim oDex As Long                ' outer loop counter
    Dim tmpDex As Long              ' write index
    Dim tClock As Double

    'make a random string
    For sDex = 1 To 10000
        Chunk.Bytes(sDex) = Int(Rnd * 255)
    Next

    'add up to 2500 pounds (random positions may repeat)
    For sDex = 1 To 2500
        Chunk.Bytes(Int(Rnd * 10000)) = Asc("#")
    Next

    OmitChar = Asc("#")

    'time 200,000 loops
    tClock = Timer

    For oDex = 1 To 20
        For sDex = 0 To 10000

            'copy every byte except the one to omit
            If Chunk.Bytes(sDex) <> OmitChar Then
                tmpStr(tmpDex) = Chunk.Bytes(sDex)
                tmpDex = tmpDex + 1
            End If

        Next

        tmpDex = 0

    Next

    Me.Print Timer - tClock

End Sub

Private Sub GetChunk()

    'read the next chunk; assumes file #1 is already open For Binary
    Get #1, , Chunk

End Sub
 
While you say that your string "only" contains 200,000 characters, you have to consider the fact that VB needs to reallocate your string each time it grows, which will take longer as the string grows. And since there will be a LOT of reallocating done in your case, things get rather slow.
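
One common way around the repeated reallocation is to allocate the result once at its worst-case size and overwrite it in place with the Mid$ statement. This is only a rough sketch to illustrate the idea: StripChar is a hypothetical name, and it removes a single character rather than decoding the ASCII codes from the original question.

```vb
' Hypothetical helper: returns src with every occurrence of the
' single character ch removed, without growing a string in a loop.
Public Function StripChar(ByVal src As String, ByVal ch As String) As String
    Dim buf As String
    Dim c As String
    Dim i As Long, n As Long

    buf = Space$(Len(src))          ' allocate the worst-case size once
    For i = 1 To Len(src)
        c = Mid$(src, i, 1)
        If c <> ch Then
            n = n + 1
            Mid$(buf, n, 1) = c     ' overwrite in place, buf never grows
        End If
    Next i

    StripChar = Left$(buf, n)       ' trim the unused tail
End Function
```

For example, StripChar("a#b#c", "#") should return "abc". The result buffer is allocated a single time, so the cost of each step no longer depends on how much output has been built so far.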


Greetings,
Rick
 
Did you try the Replace function?

Dim strLong As String   ' holds the long input string
Dim strnew As String
Dim strshort As String
strshort = "*"
strnew = Replace(strLong, strshort, vbNullString)

When I ran that routine against a 1.6 million character string containing 160,000 '*' characters, it removed them in less than 1 second.
________________________________________________________________
If you want to get the best response to a question, please check out FAQ222-2244 first

'People who live in windowed environments shouldn't cast pointers.'
 
Thanks to all of you. I have found it quite fast to Split it into an array and process the array afterwards.
 