VBS Script - Text File

Status
Not open for further replies.

NickC111

Technical User
Sep 23, 2010
44
GB
Afternoon folks,

I am a novice with scripts. I have a text file that contains standard text, but it also has HTML tags within it.

I want to remove the HTML tags from the text file. Is there a script I can use to do this?
 
'a little tongue in cheek?
strTemp = objTS.ReadAll
strTemp = Replace(strTemp, "<", "")
strTemp = Replace(strTemp, ">", "")
objTSNew.WriteLine strTemp
objTSNew.Close()
 
Thanks

The input file is c:\Temp.txt. I want to write a new text file, without the HTML tags, as the output file.

 
'it is rather lacking in audit information and a certain amount of error checking?

Dim FSO, objTS, strInputFile, strOutputFile, strTemp
Set FSO = CreateObject("Scripting.FileSystemObject")
strInputFile = "c:\temp.txt"
strOutputFile = "c:\tagless.txt"
If FSO.FileExists(strInputFile) Then
  Set objTS = FSO.OpenTextFile(strInputFile, 1, False) '?
  strTemp = objTS.ReadAll
  strTemp = ClearHTMLTags(strTemp, 1)
  objTS.Close
  Set objTS = Nothing
  Set objTS = FSO.CreateTextFile(strOutFile, True)
  objTS.WriteLine strTemp
  objTS.Close
  Set objTS = Nothing
Else
  '?
End If
Set FSO = Nothing

Function ClearHTMLTags(strHTML, intWorkFlow)
  'grab the function written by Johann
End Function



 
Thanks

I have tried your script on my example, but I get the following error message:

Script Error

Line 11
Char 3
Error: Invalid procedure call or argument
800A0005
Microsoft VBScript runtime error

Any ideas?
 
Replace
Code:
Set objTS = FSO.CreateTextFile(strOutFile, True)
with
Code:
Set objTS = FSO.CreateTextFile(strOutputFile, True)
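A misspelled variable name like this is easy to miss, because VBScript silently creates an undeclared variable as Empty the first time it is used. As a minimal sketch (not part of the script posted above), putting Option Explicit at the top of the file makes the script stop with an undefined-variable error at the misspelled name instead of passing an empty path to CreateTextFile. The file name is the one used earlier in this thread, and the text written is only a placeholder.

```vbscript
Option Explicit   ' must be the first statement in the script

Dim FSO, objTS, strOutputFile
Set FSO = CreateObject("Scripting.FileSystemObject")
strOutputFile = "c:\tagless.txt"

' With Option Explicit in effect, writing strOutFile here by mistake
' would be reported as an undefined variable when this line runs.
Set objTS = FSO.CreateTextFile(strOutputFile, True)
objTS.WriteLine "placeholder text"
objTS.Close
Set objTS = Nothing
Set FSO = Nothing
```

Every variable then has to be listed in a Dim statement, which is why the Dim line above names all three.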
 
Thanks, the script works, but the output file is blank.

It looks like it removes everything from the input file.

I enclose the input file for you. I want to be left with just the text of the document.
 
Temp File

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.0.6487.1">
<TITLE>FW: VACATION FROM HOSTEL (TEMPORARY ACCOMMODATION):</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/rtf format -->

<P DIR=LTR><SPAN LANG="en-gb"></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT COLOR="#000080" FACE="Arial">System Support &amp; Data Control Officer</FONT></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT COLOR="#000080" FACE="Arial">&nbsp;</FONT></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT COLOR="#000080" FACE="Arial">Portsmouth, Hants PO6 3AU</FONT></SPAN><SPAN LANG="en-gb"></SPAN><SPAN LANG="en-us"></SPAN></P>

<P DIR=LTR><SPAN LANG="en-gb"></SPAN></P>

<P DIR=LTR><SPAN LANG="en-gb"></SPAN></P>

<P DIR=LTR><SPAN LANG="en-gb"><B></B></SPAN><SPAN LANG="en-gb"><B></B></SPAN><B><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">VACATION FROM HOSTEL </FONT></SPAN></B></P>

<P DIR=LTR><SPAN LANG="en-gb"></SPAN><SPAN LANG="en-gb"></SPAN></P>

<P DIR=LTR><SPAN LANG="en-gb"></SPAN><SPAN LANG="en-gb"></SPAN><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">Name:&nbsp; MISS K M TEST</FONT></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">Reference: 200011913</FONT></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">Address of Temporary Accommodation:&nbsp; TEMP ADDRESS, TEMP STREET</FONT></SPAN></P>

<P DIR=LTR><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">Date moved in: 02/06/2010</FONT></SPAN></P>
<BR>

<P DIR=LTR><SPAN LANG="en-us"><FONT FACE="Comic Sans MS">Date vacated:&nbsp; 07/08/2010</FONT></SPAN></P>


<P DIR=LTR><SPAN LANG="en-gb"></SPAN></P>
<BR>
<BR>
<BR>
<BR>
<BR>

<P DIR=LTR><SPAN LANG="en-gb"></SPAN></P>

</BODY>
</HTML>

As an example, the standard text I want to keep includes: VACATION FROM HOSTEL
 
A starting point:
Code:
Dim fso, f, RE, str
Const ForReading = 1, ForWriting = 2
Set fso = CreateObject("Scripting.FileSystemObject")
' Read the whole input file into memory
Set f = fso.OpenTextFile("c:\temp.txt", ForReading)
str = f.ReadAll
f.Close
' Delete everything from "<" up to the next ">"
Set RE = New RegExp
RE.Pattern = "<[^>]+>"
RE.Global = True
str = RE.Replace(str, "")
' Note: this overwrites the input file with the stripped text
Set f = fso.OpenTextFile("c:\temp.txt", ForWriting, True)
f.Write str
f.Close

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
It works fine for me, except that the parameter for ClearHTMLTags is supposed to be 0, not 1. See the documentation in the link for details.

strTemp = ClearHTMLTags(strTemp, 0)

So if you make this change and it's still not working, you will have to show the code that you are using.
 
NickC111 said:
...the script works but the output file is blank.

Because the function is empty, so it returns nothing:
Code:
Function ClearHTMLTags(strHTML, intWorkFlow)
  'grab the function written by Johann
End Function
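For anyone who cannot get hold of the original function, here is a rough stand-in, offered only as a sketch. StripHtmlTags is a hypothetical name, it takes no intWorkFlow parameter, and it is not Johann's function. It assumes the input is simple HTML like the sample posted above: it removes comments, then removes tags with a regular expression, then decodes the few entities that appear in that sample.

```vbscript
Option Explicit

' Hypothetical stand-in for the empty ClearHTMLTags stub.
Function StripHtmlTags(strHTML)
    Dim re, strOut
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True

    ' Remove comments first, since a comment may contain ">" characters.
    re.Pattern = "<!--[\s\S]*?-->"
    strOut = re.Replace(strHTML, "")

    ' Remove every remaining tag: "<" up to the next ">".
    re.Pattern = "<[^>]+>"
    strOut = re.Replace(strOut, "")

    ' Decode a few common entities; &amp; goes last so that
    ' text such as "&amp;lt;" is not decoded twice.
    strOut = Replace(strOut, "&nbsp;", " ")
    strOut = Replace(strOut, "&lt;", "<")
    strOut = Replace(strOut, "&gt;", ">")
    strOut = Replace(strOut, "&amp;", "&")

    StripHtmlTags = strOut
End Function

' Quick check with one line taken from the sample file.
WScript.Echo StripHtmlTags("<P DIR=LTR><B>VACATION FROM HOSTEL </B></P>")
```

In the script posted earlier, the call would become strTemp = StripHtmlTags(strTemp). Text inside the TITLE element and the blank lines between paragraphs are left in place, so the output may still need tidying.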
 