Regular expressions help...

Status
Not open for further replies.

MrTrue

Technical User
Jul 28, 2008
46
US
Ok, so I'm not the best with Regex and I need some help! I'm looking for this pattern:

MEMBER NAME: DOE*JOHN

My thought was 3 sub-strings because all I really want is the name portion... this is what I have so far...
Code:
searchValue = "(MEMBER\sNAME\:\s*)([A-Z]\*[A-Z])(\W)"
I decided first substring is MEMBER NAME: which is always the same, second substring is the name (Last*First) combination, and third would be a non-word character of some sort (space, number, bracket, etc.)

Note there could be multiple spaces between the colon and the beginning of the name. Can anyone help me figure out what's wrong with my syntax? Thanks!!!
 
Try this:
Code:
"(MEMBER NAME:\s*)([A-Z]*\*[A-Z]*)(\W)?"

The pattern you currently have is only looking for <one letter>*<one letter> for the name portion. You need to add some quantifiers (*, ?, {1,}, etc.) to get all the characters.
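To make the difference concrete, here is a minimal sketch using Python's `re` module purely for illustration (the thread itself uses the VBScript RegExp object; these constructs should behave the same way there). The sample line is an assumption built from the example in the question.

```python
import re

# Sample line assumed for illustration, based on the example in the question.
text = "MEMBER NAME:   DOE*JOHN "

# Original pattern: exactly one letter on each side of the asterisk.
original = re.compile(r"(MEMBER\sNAME\:\s*)([A-Z]\*[A-Z])(\W)")

# Suggested pattern: zero or more letters on each side of the asterisk.
suggested = re.compile(r"(MEMBER NAME:\s*)([A-Z]*\*[A-Z]*)(\W)?")

original_match = original.search(text)
suggested_match = suggested.search(text)

# The original finds nothing: after "D" it expects "*" but sees "O".
print(original_match)
# The suggested pattern captures the whole name in group 2.
print(suggested_match.group(2))
```

The original pattern only succeeds on input such as "D*J", where a single letter sits on each side of the asterisk.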
 
Depending on your data, you might also consider looking for other characters in names. The expression in the current form will not match a name such as "O'Donnell".

Here's an alternative that saves the last name to group 1 and the first name to group 2 (assuming "MEMBER NAME: " is not important).
Code:
"MEMBER NAME: ([A-Z']*)\*([A-Z]*)\W*"
The parentheses around MEMBER NAME and \W were eliminated so they would not be saved to groups. Also, \W? was changed to \W* to allow any number of optional non-word characters instead of just 1.
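A short sketch of how the two groups come out, again in Python's `re` for illustration only. The sample inputs are hypothetical. The second pattern is a variation that is not from the thread: it swaps the single literal space for `\s*`, on the assumption that several spaces may follow the colon, as the original question mentioned.

```python
import re

# Pattern as posted: last name in group 1, first name in group 2.
as_posted = re.compile(r"MEMBER NAME: ([A-Z']*)\*([A-Z]*)\W*")

# Variation (an assumption, not from the thread): \s* instead of one space.
with_flexible_space = re.compile(r"MEMBER NAME:\s*([A-Z']*)\*([A-Z]*)\W*")

# Hypothetical inputs with one and with three spaces after the colon.
one_space = "MEMBER NAME: O'DONNELL*MARY"
three_spaces = "MEMBER NAME:   O'DONNELL*MARY"

m = as_posted.search(one_space)
last_name, first_name = m.group(1), m.group(2)
print(last_name, first_name)

# The posted pattern expects exactly one space, so this search finds nothing.
posted_on_three = as_posted.search(three_spaces)
# The variation tolerates any amount of whitespace after the colon.
flexible_on_three = with_flexible_space.search(three_spaces)
print(posted_on_three, flexible_on_three.groups())
```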
 
Ok, it's not quite working but I think I may know why... The information I'm pulling in is coming in as HTMLString as follows...

Code:
style='font-size:8.0pt;font-family:"Courier New"'>MEMBER NAME:&nbsp;
DOE*JOHN<o:p></o:p></span></font></p>

or

style='font-size:8.0pt;font-family:"Courier New"'>MEMBER NAME:&nbsp;
DOE*JOHN D<o:p></o:p></span></font></p>

So I'm assuming the line break in the code is throwing it off (extra white space, maybe), as well as the HTML &nbsp; that is spelling out the space? What if there are multiple &nbsp; entities? Also, if the last name has two parts (VON DOE), is there any way to work around that? I'll keep experimenting, but I appreciate all help! :)
 
If all you really want is the name, try this:
Code:
"([A-Z' ]*)\*([A-Z]*)"
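Because this pattern does not depend on what precedes the name, the &nbsp; and the line break stop mattering. A minimal sketch in Python's `re` (for illustration only, and case-sensitive here, unlike a VBScript RegExp with IgnoreCase on). The HTML fragment is modeled on the samples above, and the two-part last name "VON DOE" is an assumed variation.

```python
import re

# HTML fragment modeled on the samples in the thread; "VON DOE" is assumed.
html = (
    "style='font-size:8.0pt;font-family:\"Courier New\"'>MEMBER NAME:&nbsp;\n"
    "VON DOE*JOHN D<o:p></o:p></span></font></p>"
)

# The space inside the first character class lets a two-part last name match.
name_pattern = re.compile(r"([A-Z' ]*)\*([A-Z]*)")

m = name_pattern.search(html)
last_name, first_name = m.group(1), m.group(2)
print(repr(last_name), repr(first_name))
```

Note that the trailing middle initial (" D") is not captured, since the second group stops at the first character that is not a letter.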
 
I suppose I might be overanalyzing by adding the other stuff. I thought I'd need to accommodate the situation where a random asterisk might be keyed in somewhere, and that's why I had planned on using "MEMBER NAME" as a kind of validator. I may need to fall back on the 99% rule though... One question: when I try "([A-Z' ]*)\*([A-Z]*)", it keeps returning only a single asterisk... Any thoughts?
 
What is the exact input you are giving it when it returns only a single asterisk?
 
I'm passing a string of HTML I pulled from an Outlook message into a function. I've done this successfully with other regex patterns before, so I'm just not sure what I'm doing wrong...

Code:
Do Until isFinished = True
    If re.Test(theBody) Then
        With re
            .Pattern = searchValue
            .IgnoreCase = True
            Set MyMatch = .Execute(CStr(theBody))
        End With
        If MyMatch.Count > 0 Then
            Set FirstMatch = MyMatch(0)
        Else
            isFinished = True
        End If
    Else
        isFinished = True
    End If
Loop
 
Given your 'John Doe' example above, do you get back only an asterisk, or are you giving it different input?

I'm curious why you don't loop through the matches?
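For reference, looping through every match might look like the sketch below, written in Python's `re` for illustration. In VBScript the rough equivalent would be setting the RegExp object's Global property to True and iterating over the collection returned by Execute. The body text is hypothetical, and the pattern is the grouped one suggested earlier in the thread.

```python
import re

# Grouped pattern from earlier in the thread: last name, then first name.
pattern = re.compile(r"MEMBER NAME: ([A-Z']*)\*([A-Z]*)\W*")

# Hypothetical body text containing more than one member line.
body = "MEMBER NAME: DOE*JOHN\nMEMBER NAME: SMITH*JANE\n"

# finditer yields each non-overlapping match in order of appearance.
names = [(m.group(1), m.group(2)) for m in pattern.finditer(body)]

for last_name, first_name in names:
    print(last_name, first_name)
```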
 
Thanks for all of your help! I wouldn't have gotten on the right path without it! I was able to make it work by using the + quantifier instead of *. The asterisk means "perhaps some" (zero or more), while the plus means "some" (one or more). What I used is pretty much exactly what you had, with pluses substituted for the asterisks.

searchValue = "([A-Z']+)\*([A-Z]+)"
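One possible explanation for the lone asterisk, offered as an assumption since the thread never shows the full input: if the HTML contains an asterisk before the name (for example inside a CSS comment), the * version can match it with both groups empty, while the + version needs at least one letter on each side. A small sketch in Python's `re`, for illustration only:

```python
import re

# Assumed input: a stray asterisk (here in a CSS comment) ahead of the name.
text = "/* Font Definitions */ MEMBER NAME: DOE*JOHN"

# "Zero or more" on both sides: a bare "*" is already a valid match.
star = re.compile(r"([A-Z' ]*)\*([A-Z]*)", re.IGNORECASE)

# "One or more" on both sides: letters are required around the asterisk.
plus = re.compile(r"([A-Z']+)\*([A-Z]+)", re.IGNORECASE)

star_match = star.search(text).group(0)
plus_match = plus.search(text).group(0)

print(repr(star_match))  # the first asterisk in the comment
print(repr(plus_match))  # the actual name
```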

Thanks again! %-)

 