LF gets converted to CRLF when writing to XML file

Status
Not open for further replies.

MakeItSo

Programmer
Oct 21, 2003
3,316
DE
Hi friends,

Not sure whether this is an XML or a VB issue, but I hope this is the right place.

Background: I need to convert a plain text file into XML. Some text portions in the file consist of a single line, others of several lines separated by CRLF. Those transfer just fine.
Problem: A few portions contain a lone LF character as a line break. Unfortunately, these get converted to CRLF when written to the XML file, but I need them to stay exactly as they are!

Here's the crucial part: the variable holding the text portion is called Source, the XML object is called XLF, and I am using MSXML.DOMDocument60:
Code:
With XLF
    .validateOnParse = False
    .resolveExternals = False
    .async = False
    .preserveWhiteSpace = True   ' the setting that matters here
    .setProperty "ProhibitDTD", False
    If Not .Load(LoadResData("XLF", "CUSTOM")) Then
        MsgBox "Could not load XLIFF!" & vbNewLine & "Error: " & vbNewLine & _
        .parseError.reason & vbNewLine & "Line: " & .parseError.Line & " Pos: " & .parseError.linepos
        End
    End If
End With

So preserveWhiteSpace is set to True, as required. I've tried two variants of assigning the text:
Variant 1:
Code:
Set S = XLF.createElement("source")
S.Text = Source

Variant 2:
Code:
Set S = XLF.createElement("source")
Set ST = XLF.createTextNode(Source)
S.appendChild ST

In both cases, the resulting XML file contains the text with CRLF instead of LF.
Note: the regular line endings are supposed to be CRLF, no problem there; but if the text contains a lone LF character, it must be preserved, and it isn't.
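
To see which portions are affected, it can help to count both kinds of line break in the raw bytes of the text file and of the resulting XML file. Below is a minimal sketch in Python (not VB, purely for illustration); the function name `count_line_breaks` is made up for this example.

```python
import re

def count_line_breaks(data: bytes) -> dict:
    """Count CRLF pairs and lone LF bytes (an LF not preceded by CR)."""
    crlf = len(re.findall(rb"\r\n", data))
    lone_lf = len(re.findall(rb"(?<!\r)\n", data))
    return {"crlf": crlf, "lone_lf": lone_lf}

# Two CRLF line endings and one lone LF inside a portion.
sample = b"line one\r\nline two\nstill the same portion\r\n"
print(count_line_breaks(sample))  # {'crlf': 2, 'lone_lf': 1}
```

If the lone_lf count drops to zero in the XML output while the crlf count goes up by the same amount, the conversion described above has taken place.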

P.S: I read the text using FSO:
Code:
Set ts = fso.OpenTextFile([pathtomyfile], ForReading)
inh = ts.ReadAll

The content of Source is then a portion of inh determined by InStr positions...

Do you have any ideas on this? Any alterations to the text file contents are prohibited, else the contents won't make it back into the originating system after processing.

Thanks & regards,
MakeItSo

“Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family.” (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
Solved it!
I narrowed it down to MSXML. The conversion of LF into CRLF is indeed caused by MSXML.
I managed to work around it this way:
1.) Protect the lone linefeeds as soon as the text has been loaded into the VB String variable:
Code:
inh = ts.ReadAll
inh = ProtectLF(inh)   ' protect lone LFs right after reading

[rest of the code]

--------------------
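' Replaces every lone LF with the marker "!!LF!!" and leaves CRLF pairs untouched.
' Neither "*foobar*" nor "!!LF!!" may occur in the original text.
Function ProtectLF(ByVal What As String) As String
    What = Replace(What, vbCrLf, "*foobar*")   ' park the CRLF pairs
    What = Replace(What, vbLf, "!!LF!!")       ' mark the remaining lone LFs
    What = Replace(What, "*foobar*", vbCrLf)   ' bring the CRLF pairs back
    ProtectLF = What
End Function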

2.) Process the text as usual and write the XML.

3.) After saving the XML, read it back with FSO and restore the linefeeds:
Code:
XLF.save tmp
Set ts = fso.OpenTextFile(tmp, ForReading, False, TristateMixed)
inh = ts.ReadAll
ts.Close
Kill tmp
inh = Replace(inh, "!!LF!!", vbLf)   ' restore the lone LFs
Set ts = fso.OpenTextFile(tmp, ForWriting, True, TristateMixed)
ts.Write inh
ts.Close
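
The three steps above boil down to a protect / process / restore round trip. Here is a minimal sketch of that logic in Python (not VB, just to illustrate the idea). The names `protect_lf`, `restore_lf` and `PLACEHOLDER` are assumptions for this example, and a regular expression with a lookbehind stands in for the two-stage swap through `*foobar*`.

```python
import re

PLACEHOLDER = "!!LF!!"  # assumed never to occur in the real text

def protect_lf(text: str) -> str:
    """Replace every LF that is not part of a CRLF pair with the placeholder."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    return re.sub(r"(?<!\r)\n", PLACEHOLDER, text)

def restore_lf(text: str) -> str:
    """Turn the placeholders back into lone LF characters."""
    return text.replace(PLACEHOLDER, "\n")

original = "first line\r\nsecond line\nthird line"
protected = protect_lf(original)

# After protection, the only line breaks left are CRLF pairs,
# so a writer that rewrites lone LFs has nothing to touch.
assert "\n" not in protected.replace("\r\n", "")

# Restoring gives back the original text unchanged.
assert restore_lf(protected) == original
```

When trying this on real files in Python, open them with `newline=""` so that Python itself does not translate the line endings while reading or writing.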

I know, it's a bit of a crutch, but it does the job.
However, if any of you know how to make MSXML preserve line breaks as they are, whatever combination of CR and/or LF they consist of, please do respond. I'm sure others would find it useful.
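
One thing that may be relevant here, independent of MSXML: the XML 1.0 specification (section 2.11, end-of-line handling) requires parsers to normalize a literal CRLF or lone CR to a single LF on input, so a conforming reader of the file sees LF either way. Only a CR written as a character reference such as `&#13;` survives parsing. The small sketch below uses Python's standard `xml.etree.ElementTree` (not MSXML, so it says nothing about how MSXML serializes); the helper name `parsed_text` is made up for this example.

```python
import xml.etree.ElementTree as ET

def parsed_text(xml_bytes: bytes) -> str:
    """Parse a one-element document and return its text content."""
    return ET.fromstring(xml_bytes).text

# Literal CRLF and lone LF both come back as LF after parsing.
print(repr(parsed_text(b"<source>a\r\nb\nc</source>")))     # 'a\nb\nc'

# A CR written as a character reference is not normalized away.
print(repr(parsed_text(b"<source>a&#13;\nb\nc</source>")))  # 'a\r\nb\nc'
```

So the placeholder approach keeps the bytes in the file the way they are needed, but whether a program reading the XML back can still tell CRLF from LF depends on how it reads the file.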
Thanks & cheers,
MakeItSo

 