
Trying to parse out the tags in an HTML file without M$'s Browser control

Status
Not open for further replies.

Karl Blessing

Programmer
Feb 25, 2000
2,936
US
At the moment I have:

    Private Sub Command1_Click()
        ' FileSystemObject and TextStream need a reference to Microsoft Scripting Runtime.
        Dim cmdObj As Object
        Set cmdObj = CommonDialogAPI
        With cmdObj
            .DefaultExt = "*.htm*"
            .DialogTitle = "Find an HTML"
            .Filter = "HTML Files(*.htm, *.html)|*.htm;*.html|"
            .InitDir = "C:\Windows\"
            .ShowOpen
        End With
        ' Note: on the standard CommonDialog control, CancelError is a setting, not a result,
        ' so checking that FileName is non-empty may be a more reliable test for a cancel.
        If cmdObj.CancelError = False Then
            Dim fs As New FileSystemObject
            Dim HtmlF As TextStream
            List1.Clear
            Set HtmlF = fs.OpenTextFile(cmdObj.FileName, ForReading)
            ' Add each line of the file to the list box.
            While Not HtmlF.AtEndOfStream
                List1.AddItem HtmlF.ReadLine
            Wend
            HtmlF.Close
        End If
    End Sub

This basically just shows me all the lines of the HTML file. What I want to get at is grabbing every tag, its start and end, and putting the material inside each tag in between. The reason is that I'm often updating one of the client's websites, and whoever had the job before me used an editor that didn't save the source formatting, so what I get is code running end to end, with lines that exceed 2048 characters and so forth. So I spend about 30 minutes (or more, depending on the size of the file) just indenting the entire file before doing my update. The good thing is that once I do it, I don't have to do it again. But there are always some new files I haven't gotten to before, and whenever one of those needs an update I have to parse it out by hand.

So I want to make a VB app that'll take an *.htm file, load it into the text stream, and spit it back out with vbCrLf and vbTab, indenting it. I'll probably make a set of tags that get indented or followed by a carriage return, and I'm assuming I may have to use a custom collection. Any ideas?

Karl
kb244@kb244.8m.com
Experienced in, or have messed with: VC++, Borland C++ Builder, VJ++6 (starting), VB-Dos, VB1 thru VB6, Delphi 3 pro, Borland C++ 3 (DOS), Borland C++ 4.5, HTML, Visual InterDev 6, ASP (WebProgramming), QBasic (at least I didn't start with COBOL)
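One way the indenting pass described above could look, sketched in Python rather than VB so the logic is easy to try out. Everything here is an assumption for illustration: the `BLOCK_TAGS` set stands in for the "set of tags that get indented", `indent_html` is a made-up name, and the regular expression is a rough splitter, not a real HTML parser. It will mis-split comments or scripts that contain ">", and it collapses whitespace, which would damage `<pre>` content.

```python
import re

# Hypothetical set of tags whose contents get indented one level deeper.
BLOCK_TAGS = {"html", "head", "body", "table", "tr", "td", "ul", "ol", "div", "form"}

# A token is either one tag ("<" up to the next ">") or a run of text with no "<".
TOKEN_RE = re.compile(r"<[^>]*>|[^<]+")
# Pulls the tag name out of "<name ...>" or "</name>"; comments and doctypes do not match.
TAG_NAME_RE = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)")


def indent_html(source, indent="\t", newline="\r\n"):
    """Return source with one token per line, indented by block-tag depth."""
    depth = 0
    lines = []
    for token in TOKEN_RE.findall(source):
        # Collapse all whitespace, including old line breaks, to single spaces.
        text = " ".join(token.split())
        if not text:
            continue
        match = TAG_NAME_RE.match(text)
        name = match.group(1).lower() if match else None
        is_block = name in BLOCK_TAGS
        is_close = text.startswith("</")
        if is_block and is_close:
            # A closing block tag sits one level shallower than its contents.
            depth = max(depth - 1, 0)
        lines.append(indent * depth + text)
        if is_block and not is_close:
            depth += 1
    return newline.join(lines)


if __name__ == "__main__":
    print(indent_html("<html><body><p>Hi</p></body></html>", newline="\n"))
```

The defaults mirror vbTab and vbCrLf. In VB the same shape would probably be a loop over the tokens with a depth counter and `String$(depth, vbTab)` for the prefix; a plain array or Collection of tag names may be enough in place of a custom collection.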
 
So far I have this. It grabs each "<" -> ">" and adds any line between them. Now I need to be able to get any lines that come before the first tag (assuming that not everyone puts an <HTML> or some other tag at the very beginning), and to be able to grab a tag whose ">" falls on the next line.

After all this I'm going to loop through and find <...> and </...> so I can start indenting.

Karl
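Both problems mentioned above (text before the first tag, and a tag whose ">" is on a later line) tend to go away if the file is scanned as one string instead of line by line. Below is a minimal sketch of that idea in Python, not the code from the post; `split_tags` is a hypothetical name. `str.find` plays the role that `InStr` would in VB, and reading the file in one go would correspond to `TextStream.ReadAll`.

```python
def split_tags(source):
    """Split source into ("text", s) and ("tag", s) pieces, in document order."""
    pieces = []
    pos = 0
    while pos < len(source):
        start = source.find("<", pos)
        if start == -1:
            # No more tags: whatever is left is trailing text.
            pieces.append(("text", source[pos:]))
            break
        if start > pos:
            # Text before this tag, including anything ahead of the first tag.
            pieces.append(("text", source[pos:start]))
        end = source.find(">", start)
        if end == -1:
            # Unterminated "<": keep the remainder as text instead of dropping it.
            pieces.append(("text", source[start:]))
            break
        # The slice may contain line breaks, so a tag split across lines stays whole.
        pieces.append(("tag", source[start:end + 1]))
        pos = end + 1
    return pieces


if __name__ == "__main__":
    for kind, value in split_tags('intro\n<a\nhref="x">link</a>'):
        print(kind, repr(value))
```

The resulting list can then be walked once to match <...> against </...> for indenting. Like any scan for the next ">", this will be fooled by a ">" inside an attribute value or a comment.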
 