Reformatting HTML document

Status
Not open for further replies.

Guest_imported

I would like to write a program that reads an HTML file and produces an edited version of it with my own tags inserted. Can anyone point me in a direction to start this project? Basically, the HTML will always come in the same <PRE> format. The program should check whether a file matching a wildcard pattern exists; if so, it should read the file, format it, and write a new HTML file in its place, leaving a copy of the old one under another extension.
 
Do you need (or want) to write this in C? The kind of text processing involved in that would be much easier to accomplish in other languages (like Perl).

The hardest part will be the parsing. If you want to do some fairly complex searching and replacing, and you want to do it in C, look into Lex or Flex (the free implementation) and possibly Yacc or Bison (the GNU version), though it sounds like Lex alone will be sufficient for what you're doing.

If you want a challenge, or the parsing is pretty straightforward, you can write a state machine of some sort by hand. There are many examples of how to do this in tutorials around the net and in C books.
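
To give an idea of what a hand-written state machine could look like, here is a minimal sketch in C. It assumes the "own tags" are a wrapper around each <PRE> block; the markers OPEN_MARK and CLOSE_MARK and the function name wrap_pre are made up for illustration and are not from the original question. It also takes shortcuts: a stray '<' in the text or a '>' inside a comment or attribute value will confuse it, and output that does not fit the buffer is truncated.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical markers: replace with whatever tags you want to insert. */
#define OPEN_MARK  "<div class=\"report\">"
#define CLOSE_MARK "</div>"

/* The scanner is either in ordinary text or between '<' and '>'. */
enum state { IN_TEXT, IN_TAG };

/* Append n bytes of src to out (capacity cap, current length len),
   truncating rather than overflowing. Returns the new length. */
static size_t put(char *out, size_t len, size_t cap, const char *src, size_t n)
{
    while (n-- > 0 && len + 1 < cap)
        out[len++] = *src++;
    out[len] = '\0';
    return len;
}

/* tag points at '<' and n is the tag length including the final '>'.
   Returns 1 for <pre ...>, -1 for </pre>, 0 for any other tag. */
static int classify(const char *tag, size_t n)
{
    size_t i = 1;
    int closing = 0;

    if (i < n && tag[i] == '/') { closing = 1; i++; }
    if (n - i < 4)                       /* need "pre" plus a terminator */
        return 0;
    if (tolower((unsigned char)tag[i])     != 'p' ||
        tolower((unsigned char)tag[i + 1]) != 'r' ||
        tolower((unsigned char)tag[i + 2]) != 'e' ||
        isalnum((unsigned char)tag[i + 3]))
        return 0;
    return closing ? -1 : 1;
}

/* Copy in to out, emitting OPEN_MARK before each <pre> tag and
   CLOSE_MARK after each </pre> tag. Returns the output length. */
size_t wrap_pre(const char *in, char *out, size_t cap)
{
    enum state st = IN_TEXT;
    const char *tag = NULL;              /* start of the tag being scanned */
    size_t len = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';

    for (; *in != '\0'; in++) {
        switch (st) {
        case IN_TEXT:
            if (*in == '<') { tag = in; st = IN_TAG; }
            else len = put(out, len, cap, in, 1);
            break;
        case IN_TAG:
            if (*in == '>') {            /* tag complete: decide what it is */
                size_t n = (size_t)(in - tag) + 1;
                int kind = classify(tag, n);
                if (kind == 1)
                    len = put(out, len, cap, OPEN_MARK, strlen(OPEN_MARK));
                len = put(out, len, cap, tag, n);
                if (kind == -1)
                    len = put(out, len, cap, CLOSE_MARK, strlen(CLOSE_MARK));
                st = IN_TEXT;
            }
            break;
        }
    }
    if (st == IN_TAG)                    /* unterminated tag: copy it as-is */
        len = put(out, len, cap, tag, strlen(tag));
    return len;
}
```

Calling wrap_pre on a buffer holding the file contents leaves everything outside the <PRE> tags untouched. A real version would probably add more states (comments, quoted attribute values) once the exact input format is known.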

The rest of the stuff should be pretty easy (opening files, closing files, creating copies of files etc.), though I don't know how complex the patterns are in the file names you're looking for.
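
As a rough sketch of the file handling, something along these lines might do, using only standard C. The names rewrite_with_backup, line_filter and copy_line are invented for this example, and it assumes the backup is made by appending an extension (report.html becomes report.html.bak) rather than replacing the existing one. Finding the files that match a wildcard is platform-specific (on POSIX systems glob() from <glob.h> is one option) and is not shown here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical hook: writes the edited form of one chunk of input to out. */
typedef void (*line_filter)(const char *line, FILE *out);

/* Example filter that copies its input unchanged; the real formatting
   logic would go in a function with the same signature. */
void copy_line(const char *line, FILE *out)
{
    fputs(line, out);
}

/* Renames path to path + backup_ext, then regenerates path from the
   backup, passing the text through filter. Returns 0 on success and
   -1 on failure. Lines longer than the buffer reach the filter in
   several pieces. */
int rewrite_with_backup(const char *path, const char *backup_ext,
                        line_filter filter)
{
    char backup[1024];
    char line[4096];
    FILE *in, *out;
    int n;

    n = snprintf(backup, sizeof backup, "%s%s", path, backup_ext);
    if (n < 0 || (size_t)n >= sizeof backup)
        return -1;                       /* backup name would not fit */

    /* The old file becomes the backup. Whether rename() replaces an
       existing backup depends on the platform. */
    if (rename(path, backup) != 0)
        return -1;

    in = fopen(backup, "r");
    if (in == NULL)
        return -1;
    out = fopen(path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    while (fgets(line, sizeof line, in) != NULL)
        filter(line, out);

    fclose(in);
    return fclose(out) == 0 ? 0 : -1;    /* report write errors on close */
}
```

A call such as rewrite_with_backup("report.html", ".bak", copy_line) would leave the original content in report.html.bak and a fresh report.html next to it. Renaming first means the original is never overwritten, so a failed run can be undone by renaming the backup back.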

Regards,
Russ
bobbitts@hotmail.com
 