I think I have found a solution: setting the "element-content-whitespace"
parameter to false on the DOMConfiguration instance. This builds on my
existing code, which is derived from O'Reilly's Java and XML, 3rd edition.

The code fragment below comes from a method that parses several kinds of XML
documents, i.e. documents with different DTDs. For some of them, setting
"element-content-whitespace" to false may be unnecessary and could even
change their data undesirably. In most cases I don't want the text enclosed
by elements to be cropped, so I provide a flag to turn the option on or off.

I only need to ignore whitespace for one particular XML document type, where
I need to compare just the elements, attributes and child elements of these
documents. For more detail, see my response to you after the code below.


    // existing code
    DOMImplementationRegistry registry =
        DOMImplementationRegistry.newInstance();
    DOMImplementationLS lsImpl =
        (DOMImplementationLS) registry.getDOMImplementation("LS");
    LSParser parser =
        lsImpl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

    // Set options on the parser
    DOMConfiguration config = parser.getDomConfig();
    config.setParameter("validate", Boolean.TRUE);
    config.setParameter("error-handler", aDomErrorHandler);
    // end existing code

    // additional code
    if (ignoreWhitespace) {
        config.setParameter("element-content-whitespace", Boolean.FALSE);
    }
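
To see the effect in isolation, here is a minimal self-contained sketch. The
class name WhitespaceDemo, the sample document and the countChildren helper
are made up for illustration and are not part of my real code. It assumes the
JDK's default DOM Level 3 LS implementation (or Xerces-J on the classpath)
and uses an internal DTD subset so the parser knows that <list> has
element-only content.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class WhitespaceDemo {
    // Hypothetical sample: the DTD declares <list> as element-only content,
    // so the line breaks and indentation inside it are "ignorable".
    static final String SAMPLE =
          "<!DOCTYPE list [\n"
        + "  <!ELEMENT list (item*)>\n"
        + "  <!ELEMENT item EMPTY>\n"
        + "  <!ATTLIST item id CDATA #REQUIRED>\n"
        + "]>\n"
        + "<list>\n"
        + "  <item id=\"a\"/>\n"
        + "  <item id=\"b\"/>\n"
        + "</list>\n";

    // Parses xml and returns the number of child nodes of the root element.
    static int countChildren(String xml, boolean ignoreWhitespace) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS lsImpl =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser =
            lsImpl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        DOMConfiguration config = parser.getDomConfig();
        config.setParameter("validate", Boolean.TRUE);
        if (ignoreWhitespace) {
            // Drop whitespace-only text nodes in element-only content.
            config.setParameter("element-content-whitespace", Boolean.FALSE);
        }

        LSInput input = lsImpl.createLSInput();
        input.setStringData(xml);
        Document doc = parser.parse(input);
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("whitespace kept:    " + countChildren(SAMPLE, false));
        System.out.println("whitespace ignored: " + countChildren(SAMPLE, true));
    }
}
```

With the flag on, only the two <item> elements should remain as children of
<list>; with it off, the whitespace text nodes between them are counted too.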



Michael Glavassevich-3 wrote:
> 
> 
> Hi Rob,
> 
> Whitespace outside an element is inside of another one (except for
> whitespace outside of the root element). Whether this whitespace is
> "ignorable" depends on your application and/or whether you have a grammar
> which declares that the content of an element is only other elements.
> 

My particular XML document doesn't care about whitespace at all; it has no
elements like <text>....</text> whose content could contain significant
whitespace. All I'm interested in is the elements themselves, their
attributes (enclosed within the < and />) and their child elements.


Michael Glavassevich-3 wrote:
> 
> The "include-ignorable-whitespace" and "element-content-whitespace"
> features have the same behaviour, however they only apply to DTDs. If you
> have no DTD then I suggest that you use an LSParserFilter. 
> 

I have a DTD defined, so I can use these.
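
For documents without a DTD, the LSParserFilter route you suggest might look
roughly like the sketch below. I have not put this into my own code; the
class names WhitespaceFilterDemo and WhitespaceOnlyTextFilter and the
countChildren helper are hypothetical. Note that, unlike the DTD-based
parameter, this rejects every whitespace-only text node, including any that
sit in mixed content.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

class WhitespaceFilterDemo {
    // Hypothetical filter: rejects text nodes that contain only whitespace.
    static class WhitespaceOnlyTextFilter implements LSParserFilter {
        public short startElement(Element element) {
            return LSParserFilter.FILTER_ACCEPT; // keep every element
        }

        public short acceptNode(Node node) {
            if (node.getNodeType() == Node.TEXT_NODE
                    && node.getNodeValue().trim().isEmpty()) {
                return LSParserFilter.FILTER_REJECT;
            }
            return LSParserFilter.FILTER_ACCEPT;
        }

        public int getWhatToShow() {
            return NodeFilter.SHOW_TEXT; // only text nodes are passed to acceptNode
        }
    }

    // Parses xml (no DTD required) and returns the root's child node count.
    static int countChildren(String xml) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS lsImpl =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser =
            lsImpl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new WhitespaceOnlyTextFilter());

        LSInput input = lsImpl.createLSInput();
        input.setStringData(xml);
        Document doc = parser.parse(input);
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list>\n  <item id=\"a\"/>\n  <item id=\"b\"/>\n</list>\n";
        System.out.println("children after filtering: " + countChildren(xml));
    }
}
```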

The XML document format is a bespoke one that I designed for a particular
purpose, and I used the utilities from net.sourceforge.saxon to generate the
DTD from the XML document.




-- 
View this message in context: 
http://www.nabble.com/Filtering-whitespace-outside-of-xml-elements-using-LSParserFilter-tp20918689p20933774.html
Sent from the Xerces - J - Users mailing list archive at Nabble.com.

