getChildNodes()..problem in XML parsing

Status
Not open for further replies.

patnim17

Programmer
Jun 19, 2005
US
Hi,
I am parsing a simple XML document using JAXP.
The XML structure is like this:
<employees>
<employee></employee>
<employee></employee>
<employee></employee>
</employees>

After loading the XML, I have the following code:
Element sEmpNode = doc.getDocumentElement();
NodeList sChildren = sEmpNode.getChildNodes();
int sLength = sChildren.getLength();
for (int i = 0; i < sLength; i++) {
    System.out.println(sChildren.item(i).getNodeName());
}

I was expecting 3 nodes to be displayed, all named "employee", but it displays 7, like this:
#text
employee
#text
employee
#text
employee
#text


I don't understand why it displays #text. What the hell is that?

nims
 
This is because the whitespace between elements is treated as text nodes. If you remove all the whitespace between the elements, the #text nodes will disappear.
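If editing the XML is not practical, another option is to leave the whitespace in place and skip anything that is not an element while looping. The following is only a minimal sketch: the class name ElementChildren and the helper elementChildNames are made up for illustration, and the XML is passed in as a string rather than loaded from a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ElementChildren {

    // Hypothetical helper: returns the names of the root's element children,
    // skipping #text (and any other non-element) nodes.
    static List<String> elementChildNames(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();

        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Whitespace between tags shows up as TEXT_NODE; keep only elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees>\n"
                + "<employee></employee>\n"
                + "<employee></employee>\n"
                + "<employee></employee>\n"
                + "</employees>";
        System.out.println(elementChildNames(xml)); // prints [employee, employee, employee]
    }
}
```

The check on getNodeType() also filters out comments and processing instructions, so the loop only ever sees the employee elements.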

Alternatively, you can turn on ignore-whitespace on the DocumentBuilderFactory.


Code:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setIgnoringElementContentWhitespace(true);
....
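One caveat worth knowing: the JAXP documentation says setIgnoringElementContentWhitespace relies on the content model, so the parser has to be in validating mode and the document needs a DTD. The sketch below assumes an internal DTD subset that is not part of the XML posted above, and the names IgnoreWhitespaceDemo, SAMPLE and countRootChildren are only illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class IgnoreWhitespaceDemo {

    // Assumption: the document carries a DTD (here an internal subset) so the
    // parser can tell that <employees> has element-only content.
    static final String SAMPLE =
            "<!DOCTYPE employees [\n"
            + "<!ELEMENT employees (employee*)>\n"
            + "<!ELEMENT employee (#PCDATA)>\n"
            + "]>\n"
            + "<employees>\n"
            + "<employee></employee>\n"
            + "<employee></employee>\n"
            + "<employee></employee>\n"
            + "</employees>";

    // Hypothetical helper: counts the root's child nodes with the flag on or off.
    static int countRootChildren(String xml, boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true); // the whitespace setting depends on validating mode
        dbf.setIgnoringElementContentWhitespace(ignoreWhitespace);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flag off: " + countRootChildren(SAMPLE, false));
        System.out.println("flag on:  " + countRootChildren(SAMPLE, true));
    }
}
```

With the parser bundled in the JDK, the count should drop from 7 to 3 once the flag is on. Without a DTD the parser has no way to know which whitespace is ignorable, so the flag has no visible effect.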
 
Well, I tried using dbf.setIgnoringElementContentWhitespace(true); but it doesn't work. All I want is to get the FirstName values from this XML:
<Employees>
<Employee>
<FirstName>John</FirstName>
</Employee>
<Employee>
<FirstName>Mike</FirstName>
</Employee>
</Employees>

Note that there is no CDATA section in this XML.

nims
 
I guess the setting just ignores the whitespace in the content but still creates the text nodes. I'm pretty sure that if you remove all spaces and line breaks between the elements (i.e. put the whole XML on one line), you can eliminate all the #text nodes.

Or, if you are using the Xerces parser, you can turn off the "include-ignorable-whitespace" feature with


dbf.setFeature(" false);

It will then not include whitespace-only text nodes in the DOM tree.
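For the FirstName question earlier in the thread, one approach that sidesteps the whitespace nodes entirely is to ask the document for the elements by tag name and read their text. This is a rough sketch rather than the only way to do it; the class name FirstNames and the helper firstNames are invented for the example, and the XML is supplied as a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FirstNames {

    // Hypothetical helper: collects the text of every <FirstName> element.
    static List<String> firstNames(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // getElementsByTagName returns only matching elements, in document
        // order, so the #text nodes between <Employee> tags never appear.
        NodeList nodes = doc.getElementsByTagName("FirstName");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent().trim());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees>\n"
                + "<Employee>\n<FirstName>John</FirstName>\n</Employee>\n"
                + "<Employee>\n<FirstName>Mike</FirstName>\n</Employee>\n"
                + "</Employees>";
        System.out.println(firstNames(xml)); // prints [John, Mike]
    }
}
```

getTextContent() is a DOM Level 3 method, so this assumes Java 5 or later. On an older parser, reading getFirstChild().getNodeValue() on each FirstName element would be the rough equivalent.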
 