MS Word lists to text

Status
Not open for further replies.

JustinEzequiel

Programmer
Jul 30, 2001
1,192
PH
I have been asked to convert lists in MS Word docs (actually RTF files) to text, preserving text wrapping and hanging indents.

Code:
    1. First item goes all the way to end and wraps
       here for the second line followed by next number
    2. Second point goes all the way to end and wraps
       here for the second line

Save as Text does not preserve text wrapping.
WordPort can preserve text wrapping but not hanging indents.

Am considering, if I have to do this from MS Word:
Range.Information(wdFirstCharacterColumnNumber)
Range.Information(wdFirstCharacterLineNumber)
and ConvertNumbersToText(...)
but that approach seems to be a lot of work.

I set up a "Generic / Text Only" printer, but the second line of each paragraph went all over the place: the first few words landed near the end of line 2, and the rest followed over 3 extra lines with increasing indents (relative to the original).

How would you approach this? Has anybody else done something similar?

BTW, the text output is for an HL7 interface.

Thanks
 
Hi JustinEzequiel
Obvious question One: Why do you have to retain the same wrap? I speak as an ex-lab person who has written an HL7 interface including RTF files. Wrapping is a virtual concept that depends on a variety of things, including the page size and font in use at the time. If one user chooses one page width, that document will look different on screen to someone with another page width. You also need to consider proportional vs non-proportional fonts, which will cause problems. What about mixed fonts on a line? Converting that as requested will produce very odd-looking output.
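To illustrate the point about wrapping depending on width, here is a minimal Python sketch. It assumes monospaced output measured in character columns; the function name `wrap_lines` and the two widths are made up for the example and do not come from any system in this thread.

```python
import textwrap


def wrap_lines(text, width):
    """Break text into lines of at most `width` characters (monospaced assumption)."""
    return textwrap.wrap(text, width=width)


text = "Second point goes all the way to end and wraps here for the second line"

narrow = wrap_lines(text, 40)  # hypothetical narrow page
wide = wrap_lines(text, 60)    # hypothetical wider page

# The words are identical, but the line breaks fall in different places.
for label, lines in (("width 40", narrow), ("width 60", wide)):
    print(label)
    for line in lines:
        print("  " + line)
```

The same words come out either way; only the break positions move, which is why "the same wrap" is hard to pin down without fixing the width first.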

Obvious question Two: Have you confirmed that the receiving system cannot receive rtf files?

I would argue that this is an unreasonable and unnecessary requirement. As long as the meaning of the text is retained, why does it matter? What are you going to do if there are tables embedded too? Or images?! I actually have more of an issue where histopathologists use attributes such as bold, underlining, large font etc. to emphasise points in the WYSIWYG environment and then lose them in a plain text one. In that situation one could consider that there is a real loss of information.

I know I'm not answering your question, but often in this forum I have received the answer I need rather than the one I want!

Simon Rouse
 
Hi Justin,

1.) I second Simon's post.

2.) I just googled up this "HL7". Is my assumption correct that this converted textual information is going to be transformed into CDA format?

If so, then why not transform the Word document directly into CDA?
With a proper XML schema and the MSXML DOM model, you should be able to directly translate a - properly formatted - Word document into CDA-compliant xml.

Just a thought.

Regards,
MakeitSO

"We had to turn off that service to comply with the CDA Bill."
- The Bastard Operator From Hell
 
Dear Simon and MakeitSO,

I am too many layers away from the client receiving the HL7
to be sure but it appears that the only interface we have to
their system is via HL7.

Anyway, thinking about it more yesterday after I posted, I
guess that breaking exactly where the report breaks in MS
Word is not a requirement but I have to ask to be sure.

If this is so then my preferred option now would be to do my
own breaking during HL7 generation.
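A rough Python sketch of what "doing my own breaking" could look like, under several assumptions: the list items are already available as plain strings, the output is monospaced, the message uses the default HL7 v2 delimiters (`|^~\&`), and the receiver accepts formatted text (FT) with the `\.br\` line-break command. The function names, width, and indent are hypothetical, and whether the receiving system preserves leading spaces would need to be confirmed.

```python
import textwrap


def wrap_list_item(number, text, width=60, base_indent=4):
    """Wrap one numbered list item so continuation lines hang under the text."""
    first = " " * base_indent + f"{number}. "
    hanging = " " * len(first)  # align continuation lines with the item text
    return textwrap.wrap(
        text, width=width, initial_indent=first, subsequent_indent=hanging
    )


def to_hl7_ft(lines):
    """Escape default HL7 v2 delimiters and join lines with the FT break command."""
    escapes = {"\\": "\\E\\", "|": "\\F\\", "^": "\\S\\", "&": "\\T\\", "~": "\\R\\"}
    # Escape character by character so already-escaped output is never re-escaped.
    escaped = ["".join(escapes.get(ch, ch) for ch in line) for line in lines]
    return "\\.br\\".join(escaped)


items = [
    "First item goes all the way to end and wraps here for the second line "
    "followed by next number",
    "Second point goes all the way to end and wraps here for the second line",
]

all_lines = []
for n, item in enumerate(items, start=1):
    all_lines.extend(wrap_list_item(n, item))

print("\n".join(all_lines))   # plain-text preview of the wrapped list
print(to_hl7_ft(all_lines))   # single FT-style value for one field
```

Wrapping at generation time keeps the hanging indent consistent regardless of how Word laid out the original, at the cost of not matching Word's exact break positions.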

Thanks,
Justin
 