Exporting Formatting Code from Word

Status
Not open for further replies.

BlueScr33n

Instructor
Feb 10, 2003
78
US
I'm hoping to create a format translation program that converts a "clean" Word document (i.e., one with no named styles) into a plain text document containing the document's text along with certain formatting tags.

Example before translation:
Heading
Sample paragraph test text.

Example after translation:
{h2}Heading
{para}Sample paragraph test text.

I have tried exporting from Word to HTML and to RTF, but this has grown into a very complex way of doing things.

My main question: would this be called a format conversion, a translation, or something different? I've tried searching for a third-party application that would help with this, but I have a feeling I've been asking Google the wrong questions.

Any pointers towards a better direction or application would be appreciated.

Kind regards,
- Thomas
 
It would be FAR better to do this from a properly structured Word document - that is, WITH named styles. But only named styles.

Gerry
 
Initially, I thought it would be better to strip out named styles to reduce the amount of variance I would have to handle. Currently, when I receive a Word document, I first save it as RTF, open that RTF in WordPad and save it again (to ensure named styles are truly gone), and then re-open the RTF in Word. This ensures that instead of having to look for BobsHeadingLevel1, Word returns "bold, centered, 14 point font, Times New Roman."
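To illustrate the idea of matching on direct formatting rather than style names, here is a minimal Python sketch. Everything in it is an assumption for illustration: the ParaFormat structure, the thresholds in the rule table, and the {h2}/{h3}/{para} tags are hypothetical, and it does not read a Word file. It only shows how formatting properties that have already been extracted could be mapped to tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParaFormat:
    """Direct formatting of one paragraph (hypothetical structure)."""
    bold: bool
    alignment: str  # "left", "center" or "right"
    size_pt: float
    font: str


# Hypothetical rule table: (predicate, tag). The first matching rule wins,
# so more specific rules should be listed before more general ones.
RULES = [
    (lambda f: f.bold and f.alignment == "center" and f.size_pt >= 14, "{h2}"),
    (lambda f: f.bold and f.size_pt >= 12, "{h3}"),
]
DEFAULT_TAG = "{para}"


def tag_for(fmt):
    """Return the tag of the first rule that matches, else the default tag."""
    for predicate, tag in RULES:
        if predicate(fmt):
            return tag
    return DEFAULT_TAG


def translate(paragraphs):
    """Turn an iterable of (ParaFormat, text) pairs into tagged plain text."""
    return "\n".join(tag_for(fmt) + text for fmt, text in paragraphs)
```

A rule table like this keeps the one-to-one matches in a single place, so handling a new formatting combination means adding one entry rather than changing the translation logic.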

It would be great if I could count on receiving properly structured documents; however, in my environment that's probably never going to be the case.

What I'm trying now is this: after saving that RTF file back into a Word document, I re-save it as filtered HTML and then try to run a transform on that (Office HTML Filter 2.0 has been handy to have).
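As a rough sketch of what a transform on the filtered HTML might look like, the following uses only Python's standard html.parser module. It assumes the filtered HTML marks headings with h1-h3 elements and paragraphs with p elements, which may not hold for a document that has had its styles stripped. The TAG_MAP contents and the names TagTranslator and html_to_tagged are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML block elements to output tags.
TAG_MAP = {"h1": "{h1}", "h2": "{h2}", "h3": "{h3}", "p": "{para}"}


class TagTranslator(HTMLParser):
    """Collect the text of mapped block elements, prefixed with their tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self._current = None  # [prefix, text, text, ...] while inside a block

    def handle_starttag(self, tag, attrs):
        if tag in TAG_MAP:
            self._current = [TAG_MAP[tag]]

    def handle_data(self, data):
        # Text outside a mapped block is ignored; inline markup such as
        # <b> or <span> is dropped, but the text inside it is kept.
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in TAG_MAP and self._current is not None:
            prefix, *pieces = self._current
            text = " ".join("".join(pieces).split())  # collapse whitespace
            if text:
                self.lines.append(prefix + text)
            self._current = None


def html_to_tagged(html):
    """Convert an HTML string to tagged plain text, one block per line."""
    parser = TagTranslator()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)
```

If the filtered HTML carries formatting in inline style attributes instead of heading elements, the attrs argument of handle_starttag would be the place to inspect them.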

Ideally, there would be a way to do a translation directly on Word-specific tagging (a one-to-one match, while time consuming, would be fantastic).

- Thomas
 