clean a txt or html code with vbs

Status
Not open for further replies.

Yli

Technical User
Sep 22, 2009
11
ES
Hi all, I need help with VBS. I don't have any idea about coding.
The script must do the following:
- search for email addresses in txt or HTML code
- then clean the text and leave only the email addresses.
I need this script to insert it into a program.

Please help.

 
I'd say that the best way to handle this would be using Regular Expressions.

If you'd like a more specific code related answer you might have to tell us exactly where the text to search is coming from and what exactly you want to do with the text you find. Actually extracting the e-mail address(es) from the text is probably one of the easier parts of this problem.

Regards



HarleyQuinn
---------------------------------
Carter, hand me my thinking grenades!

You can hang outside in the sun all day tossing a ball around, or you can sit at your computer and do something that matters. - Eric Cartman

Get the most out of Tek-Tips, read FAQ222-2244: How to get the best answers before post
 
Search for the @ character and you have the email address(es).
All you have to do is open the file, read a line from it, search the line for @, write the email address to another text file, close the files when done.
 
ettienne, that would work assuming the email address is alone on the line. If there is other text with the address, it would pull too much.
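
To illustrate the difference, here is a minimal JavaScript sketch. The function names, the sample text and the simple pattern are made up for illustration and are not taken from anyone's actual file. Keeping every line that contains @ returns the whole line, while a regular expression returns only the address-shaped part:

```javascript
// Hypothetical sample text, invented for this example.
const sampleText = [
  "Contact sales at sales@example.com for a quote.",
  "No address on this line.",
  "support@example.org",
].join("\n");

// Naive approach: keep every line that contains "@".
function linesContainingAt(text) {
  return text.split(/\r?\n/).filter(function (line) {
    return line.indexOf("@") !== -1;
  });
}

// Regex approach: pull out only the address-shaped substrings.
// The pattern is a deliberately simple assumption, not a full validator.
function extractEmails(text) {
  const pattern = /[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z]{2,5}/gi;
  return text.match(pattern) || [];
}

console.log(linesContainingAt(sampleText)); // whole lines, surrounding text included
console.log(extractEmails(sampleText));     // only the two addresses
```

The first call returns the full first line along with the address, which is the "pulls too much" case described above.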

RegEx is definitely the way to go. I have never done it in VBS (only in Perl) but I can't imagine it's much different.

Thanks,
Andrew

[smarty] Hard work often pays off over time, but procrastination pays off right now!
 
Thanks all for reading my post, but I already have the solution.
Here it is.


Function Main(strHTMLText)

    Dim RegularExpressionObject, objMatches, objMatch
    Dim strSearchFor, strResult

    Set RegularExpressionObject = New RegExp

    With RegularExpressionObject
        ' local part @ domain . top-level domain of 2 to 5 letters
        .Pattern = "([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})"
        .IgnoreCase = True
        .Global = True   ' find every match, not just the first
    End With

    Set objMatches = RegularExpressionObject.Execute(strHTMLText)

    ' The leading space lets the duplicate check below match whole entries
    strResult = " "

    For Each objMatch In objMatches
        strSearchFor = " " & objMatch.Value & "; "
        ' Append the address only if it is not already in the result
        If InStr(strResult, strSearchFor) = 0 Then
            strResult = strResult & objMatch.Value & "; "
        End If
    Next

    Set RegularExpressionObject = Nothing

    Main = Trim(strResult)

End Function
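
For comparison, the same idea can be sketched in JavaScript. This is only a rough rendering under assumptions: the function name `extractUniqueEmails` and the sample string are hypothetical, and the result is joined with "; " without the trailing semicolon that the VBScript version leaves.

```javascript
// Collect each address-shaped match once, in order of first appearance.
function extractUniqueEmails(htmlText) {
  // Same character classes as the VBScript pattern above, without the groups.
  const pattern = /[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z]{2,5}/gi;
  const seen = new Set();
  const unique = [];
  for (const address of htmlText.match(pattern) || []) {
    // Case-sensitive comparison, like InStr with its default binary compare.
    if (!seen.has(address)) {
      seen.add(address);
      unique.push(address);
    }
  }
  return unique.join("; ");
}

// Hypothetical input, invented for this example.
const html = '<p>a@x.com, <a href="mailto:b@y.org">b@y.org</a> and a@x.com</p>';
console.log(extractUniqueEmails(html)); // → "a@x.com; b@y.org"
```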
 
Like I said, I am not too familiar with VBS RegEx syntax, but it looks like your search string could miss a lot of special characters:

Code:
[a-zA-Z0-9_\-\.]+

What I think this is saying is....

Match 1 or more of the following characters:
a-z
A-Z
0-9
- (hyphen)
_ (underscore)
. (period)

What if someone's email address has a special character in it, like

john.doe^23@company.com

Wouldn't the code above miss the ^ ?

Thanks,
Andrew

[smarty] Hard work often pays off over time, but procrastination pays off right now!
 
Yes, it would miss

! # $ % & ' * + / = ? ^ ` { | } ~

all of which are theoretically valid characters in the local part of an email address. The pattern presented looks to be a validator for Hotmail addresses, which do not allow all of the characters designated in the relevant RFC (RFC 5322).
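
A small JavaScript sketch of the effect, under assumptions: `narrowPattern` mirrors the character class posted above, `widePattern` is one possible way to admit the extra characters, and the names and sample text are invented for illustration.

```javascript
// Character class from the thread: letters, digits, underscore, hyphen, period.
const narrowPattern = /[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z]{2,5}/gi;

// Assumed wider local part: also allows ! # $ % & ' * + / = ? ^ ` { | } ~
const widePattern = /[a-z0-9!#$%&'*+\/=?^`{|}~_.-]+@[a-z0-9_.-]+\.[a-z]{2,5}/gi;

function findAll(pattern, text) {
  return text.match(pattern) || [];
}

const sample = "Write to john.doe^23@company.com today.";
console.log(findAll(narrowPattern, sample)); // → ["23@company.com"]
console.log(findAll(widePattern, sample));   // → ["john.doe^23@company.com"]
```

The narrow pattern does not skip the address; it silently truncates it at the ^. The wider class has its own cost when scanning HTML: because ' and = are now allowed, a fragment such as value='a@b.com' would be picked up as value='a@b.com, so the trade-off depends on the input.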
 
Email addresses do not have spaces in them right? So if you search for @ and then find the space before and after the @ you will be able to get the email address from the text.
 
>Email addresses do not have spaces in them right?

Well, actually the RFCs allow a quoted string for the local part of the address, and quoted strings can contain spaces. But it is admittedly rare.

>So if you search for @

There is an assumption there that @ will only be present in the file in an email address. This might not be the case.
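
A quick JavaScript sketch of the space-delimited approach shows both problems. The function name and the sample sentence are hypothetical, chosen only to trigger the failure cases:

```javascript
// Split on whitespace and keep every token that contains "@".
function tokensWithAt(text) {
  return text.split(/\s+/).filter(function (token) {
    return token.indexOf("@") !== -1;
  });
}

// Invented sample: one real address plus two unrelated uses of "@".
const sample = "Mail bob@example.com, or buy 5 apples @ $2 each (see @home).";
console.log(tokensWithAt(sample)); // → ["bob@example.com,", "@", "@home)."]
```

The real address comes back with a trailing comma attached, and the other two tokens are not addresses at all.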
 
Hmm. Can anybody help me modify the previous code to clean all link tags <a href:*>INEEDONLYTHIS</a>?

I have tried something, but as I said, I have no idea about regex and VBS. Just intuition. Thanks.
 
Function Main_2(strHTMLText) 'clean a-tag only

    Dim RegularExpressionObject, strResult

    Set RegularExpressionObject = New RegExp

    With RegularExpressionObject
        ' Match a whole <a ...>...</a> element; group 1 captures the link text
        .Pattern = "<a[^>]*>([\s\S]*?)</a>"
        .IgnoreCase = True
        .Global = True
    End With

    ' Replace each link with its text only
    strResult = RegularExpressionObject.Replace(strHTMLText, " $1 ")

    Set RegularExpressionObject = Nothing

    'strResult = Trim ( strResult )

    Main_2 = strResult

End Function
 
Thanks... I will test and report back. Ugh, it is so bad not to know many things... :)
 
Hmm. I don't know why, but when I test it, it doesn't work...
Just an error without a message. Is something missing?
I saw that the regexp is working,

but after this... I think something is wrong.
So
I tested replacing <a href="">Google Link</a> with Google Link.
 
Sincerely, I don't know what type of function I need to use.
My idea is that I have a lot of links in my HTML code and I want to clean them all, leaving only the linked text.
 
This is a complete VBS script on its own.

Function Main_2(strHTMLText) 'clean a-tag only

    Dim RegularExpressionObject, strResult

    Set RegularExpressionObject = New RegExp

    With RegularExpressionObject
        ' Match a whole <a ...>...</a> element; group 1 captures the link text
        .Pattern = "<a[^>]*>([\s\S]*?)</a>"
        .IgnoreCase = True
        .Global = True
    End With

    ' Replace each link with its text only
    strResult = RegularExpressionObject.Replace(strHTMLText, " $1 ")

    Set RegularExpressionObject = Nothing

    'strResult = Trim ( strResult )

    Main_2 = strResult

End Function

' Test call; run the file with wscript or cscript
s = "i tested to replace <a href="""">Google Link</a> to Google Link."
WScript.Echo Main_2(s)
 
I get this when I use the Parse button...

Error Source: Microsoft Jscript compilation error
Error description: Expected ';'
Error on line 1
Error on Column: 9
Error in String "Function Main_2(strHTMLText) 'clean a-tag only"


From your post I removed the last 2 lines because I think they are not necessary.
I use this in a program which has a transformation script wizard.
 
You're using JScript? Then why did you present "a solution" to the forum in your previous post in VBScript?
 
:( Because, as I said, I don't know what I use.
In the program I want to use, I have a drop-down box which offers Java Script or VBScript....
So because I don't know... I think that's why.
 
 http://paraglidetv.com/clip.jpg
function main_2(strHTMLText) { //clean a-tag only
    // "g" replaces every link, "i" ignores case; group 1 is the link text
    var RegularExpressionObject = new RegExp("<a[^>]*>([\\s\\S]*?)</a>", "gi");
    var strResult = strHTMLText.replace(RegularExpressionObject, " $1 ");
    return strResult;
}

var s = "i tested to replace <a href=\"\">Google Link</a> to Google Link.";
alert(main_2(s));
But then you still need to know quite a bit about how it fits into a browser and its event handling. I can't help with that from this distance.
 