Byte array, text scanning

cjelec (Programmer), Jan 5, 2007
VB.NET 2003 (.NET Framework 1.1)

Hi,

I'm trying to find keywords in an array of bytes as fast as possible.

I have an array of Strings containing the keywords (200-400 items) and an array of bytes (from a stream), which can be quite large.

I have been converting the byte array to a string and then scanning it for each item in the String array, but this can be quite slow.

Is there a faster way?

Any help would be appreciated

Thanks
 
You could try regular expressions:

Code:
Imports System.Text.RegularExpressions

Public Class yyyyyyy

    ' Alternation pattern: matches if any one of the keywords occurs.
    Private Const KEY_WORDS As String = "KEYWORD1|KEYWORD2|KEYWORD3"
....
....
    ' Returns True if the sample contains at least one keyword.
    Public Shared Function HasKeyWords(ByVal sample As String) As Boolean
        Dim re As New Regex(KEY_WORDS)
        Return re.IsMatch(sample)
    End Function
....
....
End Class

KEY_WORDS could be built up dynamically if you wanted to vary the words you are searching for.
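
As a rough sketch of building the pattern dynamically: the keyword array can be joined with "|" after passing each word through Regex.Escape, so that characters such as "+" or "." are matched literally. The module name, the BuildPattern helper and the sample keywords are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module KeywordPatternSketch

    ' Builds "word1|word2|..." from a keyword array (hypothetical helper).
    ' Regex.Escape makes regex metacharacters in a keyword match literally.
    Function BuildPattern(ByVal keywords() As String) As String
        Dim escaped(keywords.Length - 1) As String
        For i As Integer = 0 To keywords.Length - 1
            escaped(i) = Regex.Escape(keywords(i))
        Next
        Return String.Join("|", escaped)
    End Function

    Sub Main()
        ' Sample keywords, for illustration only.
        Dim keywords() As String = {"some", "words", "c++"}
        Dim re As New Regex(BuildPattern(keywords))

        Console.WriteLine(re.IsMatch("plain text with some content"))
        Console.WriteLine(re.IsMatch("nothing relevant here"))
    End Sub

End Module
```

Building the Regex once and reusing it for every buffer should avoid re-parsing the pattern on each call.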

Cheers
Snuv

"If it could have gone wrong earlier and it didn't, it ultimately would have been beneficial for it to have." : Murphy's Ultimate Corollary
 
Thanks for your suggestion Snuv.

I would still need to convert the array to a string before running the regex...
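
For reference, a minimal sketch of that conversion step, assuming the stream holds plain ASCII text (the real encoding is not stated, so Encoding.ASCII is only a guess). The sample data and module name are hypothetical. The whole buffer is converted once and then scanned in a single pass.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module ByteScanSketch

    Sub Main()
        ' Stand-in for the bytes read from the stream (hypothetical data).
        Dim data() As Byte = Encoding.ASCII.GetBytes("here are some words")

        ' One conversion for the whole buffer; assumes the data is ASCII text.
        Dim text As String = Encoding.ASCII.GetString(data)

        Dim re As New Regex("some|words")
        For Each m As Match In re.Matches(text)
            ' ASCII decodes one byte to one character, so m.Index is
            ' also the offset of the keyword in the byte array.
            Console.WriteLine(m.Value & " at " & m.Index.ToString())
        Next
    End Sub

End Module
```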

OK, so the next challenge: each keyword has a value associated with it (some:1, words:2). How could I use this with the regex?

I have just looked at Regex.Matches, which returns a collection of all the matches found. I could keep an array of the keywords alongside the pattern string ("some|words"), and look up each item from the MatchCollection in the keyword array to get its value...

Thanks again
 
I haven't looked into Matches.

You might use something along the lines of:

' Assumes keywordList is a Hashtable of keyword -> value and
' table is a collection of strings (both names are placeholders).
For Each pair As DictionaryEntry In keywordList
    Dim re As New Regex(CStr(pair.Key))
    For Each s As String In table
        If re.IsMatch(s) Then
            ' do some processing based on pair.Value
        End If
    Next
Next

You still have to convert to strings, but it might still be quicker than checking byte by byte.
 
I have found that Matches returns a collection of every match found by the regex. So in a way it performs the For Each ... Next for you, but it doesn't offer a way of getting the index of the keyword that matched.

I could use something like:
Code:
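Dim KeyArray() As String = {"some", "words"}
Dim ValArray() As Integer = {4, 9}

' Build the alternation pattern "some|words"
Dim RegString As String = Join(KeyArray, "|")

Dim reg As New Regex(RegString)

' Text is the string converted from the byte array
Dim result As MatchCollection = reg.Matches(Text)

Dim count As Integer

For Each m As Match In result

  ' Array.IndexOf is a Shared method, so the array is passed as the first argument
  count += ValArray(Array.IndexOf(KeyArray, m.Value))

Next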
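
With 200-400 keywords, the IndexOf lookup scans the key array once for every match. A possible alternative, sketched here under the assumption that matching is case-sensitive and each keyword appears only once, is to load the pairs into a Hashtable and look up each match directly. The module name and the input text are made up for illustration.

```vbnet
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Module KeywordValueSketch

    Sub Main()
        ' Sample keywords and values; the input text below is hypothetical.
        Dim KeyArray() As String = {"some", "words"}
        Dim ValArray() As Integer = {4, 9}

        ' Map each keyword to its value so a match can be looked up directly.
        Dim values As New Hashtable
        For i As Integer = 0 To KeyArray.Length - 1
            values(KeyArray(i)) = ValArray(i)
        Next

        Dim reg As New Regex(String.Join("|", KeyArray))

        Dim count As Integer = 0
        For Each m As Match In reg.Matches("some words and some more")
            ' m.Value is exactly one of the keywords, so the lookup succeeds.
            count += CInt(values(m.Value))
        Next

        Console.WriteLine(count)   ' 4 + 9 + 4 = 17
    End Sub

End Module
```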

Thanks again,
 