How to find angle brackets in a string ">" and "<"

Status
Not open for further replies.

fdgsogc

Vendor
Feb 26, 2004
160
CA
I am trying to extract an email address from the following string: "Joe Smith" <joesmith@website.ca>

I am trying to use the FIND function in ColdFusion to find the first angle bracket. Here is my code:
Code:
<cfset startpos = #find("<",fromemailstring)#>
<cfset endpos = #find(">",fromemailstring)#>

Then I was going to use the MID function to extract the email address in between.

However, the FIND function cannot find the angle bracket.

Suggestions?
 
That should work fine. But because of the angle brackets, the browser treats the returned string as an HTML tag, so nothing visible is displayed. To see the value on the page, wrap it in HTMLEditFormat.

If you are using CF8 or later, consider using reMatch instead. It is more elegant.

Code:
<!--- option 1 --->
<cfset fromEmailString = '"Joe Smith" <joesmith@website.ca> sdfsdf'>
<cfset startPos = find("<", fromEmailString)>
<cfset endPos = find(">", fromEmailString)>
<cfif startPos and endPos>
	<cfoutput>
	#HTMLEditFormat(mid(fromEmailString, startPos, endPos-startPos+1))#
	</cfoutput>
</cfif>

<!--- option 2 --->
<cfset result = reMatch("<[^>]+>", fromEmailString)>
<cfdump var="#result#">

----------------------------------
 
You could simply do a replace() to remove the angle brackets. Just replace them with an empty string.
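
For example, a rough sketch of that idea, assuming the same sample string as above (the variable names matches and email are only illustrative). It uses reReplace with a character class so both brackets are stripped in one call; two nested replace() calls would do the same job.

Code:
<!--- sketch only: take the first <...> match and strip both brackets --->
<cfset fromEmailString = '"Joe Smith" <joesmith@website.ca>'>
<cfset matches = reMatch("<[^>]+>", fromEmailString)>
<cfif arrayLen(matches)>
	<!--- [<>] matches either bracket; "all" replaces every occurrence --->
	<cfset email = reReplace(matches[1], "[<>]", "", "all")>
	<cfoutput>#HTMLEditFormat(email)#</cfoutput>
<cfelse>
	no address found
</cfif>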

You could also try some reverse logic using reReplace. Remove everything up to the opening bracket "<" and after the closing one ">".

Code:
<cfset result = reReplaceNoCase(fromEmailString, "^([^<]*<)|(>[^>]*)$", "", "all")>
<cfdump var="#result#">

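If you want to go the Java route, try something like this:
(?<=<) - i.e. a lookbehind: the match must be preceded by an opening bracket
[^>]+ - i.e. one or more characters that are not a closing bracket
(?=>) - i.e. a lookahead: the match must be followed by a closing bracket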

Code:
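<cfset patternString = "(?i)(?<=<)[^>]+(?=>)">
<cfset objPattern = CreateObject("java", "java.util.regex.Pattern").compile(patternString)>
<cfset objMatcher = objPattern.matcher(fromEmailString)>
<cfif objMatcher.find()><cfdump var="#objMatcher.group()#"></cfif>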



----------------------------------
 
Thanks very much. I used this version of your solution:

<cfset result = reReplaceNoCase(fromEmailString, "^([^<]*<)|(>[^>]*)$", "", "all")>
 
Hi cfSearching,

May I ask for one more favor. I am receiving emails into a database. But I want to peel away everything after and including the "original message" text:

-----Original Message-----

Sometimes there are more or less dashes from a user depending what email system they are sending from. Can you create a regular expression using reReplaceNoCase that would do this? I am having a hard time understanding how to create regular expressions.

Thanks a ton in advance.
 
Well, regular expressions are definitely something you want to learn :) Try giving this one a shot on your own first. Then post back if you have problems.

Hints:
1. You have already defined the pattern. You just need to convert that into a regex pattern.

Pattern in english:
{series of dashes}{Original[spaces]Message}{series of dashes}

2. Look at the special characters that will help you create that regular expression:

* :zero or more occurrences of previous
+ :one or more occurrences of previous
\s :white space characters
( ) :group characters

3. Then pair the special characters with the regular characters you want to find to create the expression.

-+ :one or more instances of "-"
\s+ :one or more instances of whitespace characters
(whole expression) :group the characters

4. Start small. Search for multiple dashes first. Then modify the expression to find multiple dashes followed by "Original Message", etcetera.
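
As a rough illustration of the "start small" step only (the sample text and the variable names sample and dashes are made up for this example), you could first check that the dash pattern alone matches where you expect, and then grow the expression from there.

Code:
<!--- hypothetical sample text, for illustration only --->
<cfset sample = "Thanks, Joe#chr(10)#-----Original Message-----#chr(10)#From: someone">
<!--- first step: find one or more dashes; "true" returns a struct with pos and len arrays --->
<cfset dashes = reFind("-+", sample, 1, true)>
<cfif dashes.pos[1] gt 0>
	<cfoutput>dashes start at #dashes.pos[1]#, length #dashes.len[1]#</cfoutput>
<cfelse>
	no dashes found
</cfif>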

Let us know if you run into problems.



----------------------------------
 
OK. I'm part of the way there. I can take the body of an email and remove the "--- Original Message ---" text whether it has 2 or more dashes.

My next challenge is that I want to remove all the characters after the "--- Original Message ---" text.

I can't work out how to tell the expression to select all the text after my match.

Ideas?
 
I'm able to get rid of all the text with the following expression, but how do I limit it to only the text AFTER my "----Original Message----" text?

<cfset cleanbodytext = reReplaceNoCase(bodytext, "[a-z]*", "", "all")>
 
So here's what I ended up with. I think this will work. I'll end up with a few dashes because I'm not using a regular expression to find the number of dashes but rather the words, "Original Message".

Thanks for all your help.

Code:
<cfset bodytext = htmlcodeformat(getmail.body)>
<cfset regex = "(?=Original Message)">
<cfset result = reFind(regex, bodytext, 1, "yes")>
<cfoutput>Mid of bodytext = #mid(bodytext, 1, result.pos[1]-5)#</cfoutput>
 
Yes, that would work. To play it safe, I would use a case-insensitive find (i.e. reFindNoCase), though it is probably not needed.

Another option is to include the dashes. Not as elegant as a lookahead, but it should work too.

Code:
<cfset regex = "(-+\s*Original\s+Message\s*-+)">
<cfset result = reFindNoCase(regex, bodyText, 1, "yes")>
<cfif result.pos[1] gt 0>
	<cfoutput>#mid(bodyText, 1, result.pos[1]-1)#</cfoutput>
<cfelse>
	pattern not found in bodyText
</cfif>
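
Since the original question asked about reReplaceNoCase, here is a hedged sketch of that route as well. The sample bodyText and the name cleanBodyText are made up for the example. It uses [\s\S]* to mean "everything that follows, line breaks included", so it does not rely on how "." treats newlines. If the divider is not found, reReplaceNoCase simply returns the text unchanged.

Code:
<!--- hypothetical sample body, for illustration only --->
<cfset bodyText = "Thanks, Joe#chr(10)#---Original Message---#chr(10)#From: someone">
<!--- the divider (any number of dashes) plus everything after it --->
<cfset regex = "-+\s*Original\s+Message\s*-+[\s\S]*">
<cfset cleanBodyText = reReplaceNoCase(bodyText, regex, "", "one")>
<cfoutput>#HTMLEditFormat(cleanBodyText)#</cfoutput>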

----------------------------------
 