Need Help with XML transformation

Status
Not open for further replies.

vj18

Programmer
Jan 6, 2004
1
US
hi,

Please help me. My task is to replace "&" and "<" so that the input XML files become well formed. I have 32,000 XML files to work on.
The files look like this:

<book>
.
.
.
.<Abstract>
<AbstractText> this book explains formula a<0.01 and some basic science.</AbstractText>
</Abstract>
.
.
.<source>Texas A & M University</source>
.
.
</book>


I want to replace special characters like < and & with &lt; and &amp; in all files. Can anyone please suggest how this can be achieved? I tried to use XSLT, but it only accepts well-formed documents. Please help me out! :(
Thanks.

VJ
 
If you're on a Unix box, use sed. Use these REs for your script:

Code:
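s/&/\&amp;/g
s/</\&lt;/g

Run the ampersand rule first, or it will re-escape the & of every &lt; that the second rule writes. In the replacement, \& is a literal ampersand (a bare & stands for the matched text). The pattern uses a plain <, because GNU sed treats \< as a word boundary rather than a literal less-than sign.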
 
Of course, you'd have to do that before the XML tags are inserted: run on a finished file, these commands also escape the < that opens every tag. The search pattern gets more convoluted if you can't do the sed step first.
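
If the tags are already in place, as in the files posted above, one possible form of that more convoluted pattern is sketched below. It is only a heuristic, not a parser: it assumes a sed that supports -E (GNU and BSD both do), it treats a < followed by a letter, /, ! or ? as real markup, and it treats anything shaped like &name; or &#123; as an existing entity. The function name escape_stray is made up for this example. A stray < followed directly by a letter (say a<b) would be left alone, so check the results with an XML parser afterwards.

```shell
# Hypothetical helper: escape only the & and < that do not look like markup.
escape_stray() {
  # 1. Escape every ampersand.
  # 2. Undo step 1 where the ampersand already started an entity
  #    (named, decimal or hex), so &amp; does not become &amp;amp;.
  # 3. Escape a < that is not followed by a letter, /, ! or ?.
  #    The third rule uses , as its delimiter so / can appear unescaped.
  sed -E \
    -e 's/&/\&amp;/g' \
    -e 's/&amp;(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);/\&\1;/g' \
    -e 's,<([^A-Za-z/!?]|$),\&lt;\1,g'
}

# Example: one of the problem lines from the question.
printf '%s\n' '<source>Texas A & M University</source>' | escape_stray
```

To cover all 32,000 files, the function could be called in a loop that writes each result to a new file (for example escape_stray < "$f" > "$f.fixed"), leaving the originals untouched until the output has been checked.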
 