I'm trying to turn Word documents and forms into XML. I would like to run regular expressions over the text output of a Word document to determine the structure of my XML document.
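To make the idea concrete, here is a minimal sketch of that kind of regex-driven conversion. It assumes the Word document has already been saved as plain text with one paragraph per line. The numbered-heading pattern, the function name `text_to_xml`, and the element names (`document`, `heading`, `para`) are all hypothetical choices for illustration, not a fixed schema.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical pattern: a numbered heading such as "1. Introduction" or "2.3 Scope".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\S.*)$")


def text_to_xml(text):
    """Convert plain text (one paragraph per line) into a simple XML string."""
    parts = ["<document>"]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between paragraphs
        match = HEADING.match(line)
        if match:
            number, title = match.groups()
            parts.append(f'  <heading number="{number}">{escape(title)}</heading>')
        else:
            # Anything that is not recognised as a heading is treated as a paragraph.
            parts.append(f"  <para>{escape(line)}</para>")
    parts.append("</document>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(text_to_xml("1. Introduction\n\nSome text & more\n2.3 Scope"))
```

This works acceptably for flat text, but it has no notion of rows or cells, which is exactly where tables become a problem.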
I've done a reasonable enough job with the paragraphs, headers, etc., but I get lost in any tables embedded in the document, which leads to my question:
Does anyone know of a good resource for which ANSI or Unicode values MS Word uses for formatting in various scenarios (e.g. which values precede an embedded table column)?
Any other input on how to do this would be great!
Thanks in advance!
B.J.