Java Technology & XML - setIgnoringElementContentWhitespace
This topic has 3 replies on 1 page.
JoubertA
Posts:5
Registered: 12/1/98
setIgnoringElementContentWhitespace   
Oct 2, 2003 8:24 AM

 
Hi,

This has come up several times before, but I cannot persuade the DOM parser to ignore whitespace in an XML document. I've read through all the previous queries on the problem, and I

1. validate against a schema (this works, as the parser complains if I change the XML)
2. use Xerces 2.5, which ought to validate against schemas correctly
3. call setIgnoringElementContentWhitespace(true) on the factory
4. explicitly set mixed="false" in every complexType definition in the schema

I've modified the DomEcho02 example from the tutorial, and it still shows me text nodes around every child node. Has anybody got any ideas why this is not working the way it should?

The code setting up the factory is:

      if (argv.length != 2) {
        System.err.println("Usage: java DomEcho filename schema");
        System.exit(1);
      }
 
      DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();
      factory.setValidating(true);
      factory.setNamespaceAware(true);
      factory.setIgnoringComments(true);
      factory.setIgnoringElementContentWhitespace(true);
      factory.setAttribute(
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
        "http://www.w3.org/2001/XMLSchema");
      try {
        factory.setAttribute(
          "http://java.sun.com/xml/jaxp/properties/schemaSource",
          new InputSource(new FileInputStream(argv[1])));
        DocumentBuilder builder = factory.newDocumentBuilder();
        document = builder.parse(new File(argv[0]));
        makeFrame();
 
      } ....


I'd appreciate any help on this one!

Adriaan
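
For comparison, here is a minimal sketch of the case where setIgnoringElementContentWhitespace(true) is documented to take effect: validation against a DTD that gives the parent an element-only content model. The class name, the inline document and its DTD are assumptions made up for illustration, not anything from the post above. It only shows what the flag does in the DTD case and says nothing about the schema case being asked about.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DtdWhitespaceCheck {
    // Hypothetical test document: the internal DTD subset declares that
    // <root> may contain only <item> elements (element-only content).
    static final String XML =
          "<!DOCTYPE root [\n"
        + "  <!ELEMENT root (item*)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<root>\n  <item>a</item>\n  <item>b</item>\n</root>";

    /** Parses XML with DTD validation and returns the number of child nodes of the root. */
    public static int countRootChildren(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation, so the content model is known
        factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }
}
```

Calling countRootChildren(false) and countRootChildren(true) side by side shows whether the whitespace text nodes around the item elements are being dropped.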
 
dvohra09
Posts:3,591
Registered: 4/4/01
Re: setIgnoringElementContentWhitespace   
Oct 6, 2003 11:44 AM (reply 1 of 3)  (In reply to original post )

 
http://forum.java.sun.com/thread.jsp?forum=34&thread=430252
 
JoubertA
Posts:5
Registered: 12/1/98
Re: setIgnoringElementContentWhitespace   
Oct 6, 2003 10:20 PM (reply 2 of 3)  (In reply to #1 )

 
Just an update on what I did in the end: I've written a small recursive function that traverses the tree and throws out all whitespace-only text nodes. Not a nice solution at all, especially considering the inefficiency of building a DOM tree and then throwing half of it away again.

From the documentation I still conclude that the whitespace text nodes should not be generated; after all, the schema does not allow them. So my conclusion is that this simply does not work in Xerces yet. I'll have to try again with the next version...
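
The recursive cleanup described in this reply was not posted, so the following is only a sketch of what such a function might look like. The class and method names are hypothetical. Note that it knows nothing about the schema: it removes every whitespace-only text node, including ones inside mixed or text-only content, so it is only safe when such nodes carry no meaning in your documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceStripper {
    /** Recursively removes text nodes that consist only of whitespace. */
    public static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespace(child); // descend into child elements
            }
            child = next;
        }
    }

    /** Convenience helper for trying it out: parses an XML string with default settings. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

Typical use would be to call stripWhitespace(document) once, right after builder.parse(...) and before walking the tree.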
 
cayhorstmann
Posts:48
Registered: 9/11/97
Re: setIgnoringElementContentWhitespace   
Sep 3, 2004 7:22 PM (reply 3 of 3)  (In reply to #2 )

 
This has been a frustrating issue for me as well, and it has been no joy to wade through lots of comments of the kind "just skip past the blank text nodes".

Here is how I got it to work in JDK 5.0 RC.

1) Get a DOMImplementation the non-portable way:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
DOMImplementation domImpl = builder.getDOMImplementation();

(If you call

DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementation domImpl = registry.getDOMImplementation("XML 3.0");

then you lose in step 2--the standard domImpl can't get the "LS" feature--how lame is that?)

2) Make an LSParser:

DOMImplementationLS ls = (DOMImplementationLS) domImpl.getFeature("LS", "3.0");
LSInput in = ls.createLSInput();
in.setByteStream(new FileInputStream(filename));
LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, "http://www.w3.org/2001/XMLSchema");

3) That's it--parse, and it just works. No more white space!

Document doc = parser.parse(in);

It is completely incomprehensible to me why the JAXP parser can't do this correctly. Amusingly enough, the bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4867706, reported in May 2003, is marked as a "request for enhancement".
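
For anyone who wants to try this, here are steps 1 to 3 pulled together into one compilable sketch. The class name is hypothetical, and it reads from a string via setStringData instead of a FileInputStream so that it is self-contained. It only reproduces the calls shown above. Whether the whitespace text nodes actually disappear will depend on the parser version and on how the document is tied to its schema, so treat that part as something to verify on your own setup.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

public class LsParserSketch {
    /** Parses an XML string through a DOM Level 3 Load and Save parser. */
    public static Document parse(String xml) throws Exception {
        // Step 1: obtain the DOMImplementation via a JAXP DocumentBuilder.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        DOMImplementation domImpl = builder.getDOMImplementation();

        // Step 2: ask for the "LS" feature and create a synchronous LSParser
        // with XML Schema as the schema type.
        DOMImplementationLS ls = (DOMImplementationLS) domImpl.getFeature("LS", "3.0");
        if (ls == null) {
            throw new IllegalStateException("This DOM implementation does not offer LS 3.0");
        }
        LSInput in = ls.createLSInput();
        in.setStringData(xml);
        LSParser parser = ls.createLSParser(
                DOMImplementationLS.MODE_SYNCHRONOUS,
                "http://www.w3.org/2001/XMLSchema");

        // Step 3: parse.
        return parser.parse(in);
    }
}
```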

Cheers,

Cay
 