ignorableWhitespace() was only defined for use with DTDs. Sun's 
implementation may be doing something for XSD but there's nothing in the 
specification which requires that. Xerces is behaving correctly.
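
To illustrate the DTD case, here is a minimal sketch. It is not taken from the attached test files: it assumes a hypothetical document, modelled on the log below, with an internal DTD subset that declares howto and topic as element-only content, and it turns on DTD validation through JAXP. The class name, the inline document and the parse() helper are made up for the example. Under those assumptions, whitespace between the child elements should arrive in ignorableWhitespace() and only the title text in characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class IgnorableWhitespaceDemo {

    // Hypothetical document: the internal DTD subset declares howto and topic
    // as element-only content, and title as character data.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE howto [\n"
        + "  <!ELEMENT howto (topic+)>\n"
        + "  <!ELEMENT topic (title)>\n"
        + "  <!ELEMENT title (#PCDATA)>\n"
        + "]>\n"
        + "<howto>\n"
        + "  <topic>\n"
        + "      <title>Java</title>\n"
        + "  </topic>\n"
        + "</howto>\n";

    // Returns { text seen in characters(), text seen in ignorableWhitespace() }.
    static String[] parse(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        final StringBuilder ignorable = new StringBuilder();

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // DTD validation, so the content models are applied
        SAXParser parser = factory.newSAXParser();

        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                ignorable.append(ch, start, length);
            }

            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // surface validity errors instead of silently ignoring them
            }
        });
        return new String[] { text.toString(), ignorable.toString() };
    }

    public static void main(String[] args) throws Exception {
        String[] result = parse(XML);
        System.out.println("characters          = \"" + result[0] + "\"");
        System.out.println("ignorableWhitespace = " + result[1].length() + " chars");
    }
}
```

Without the DOCTYPE (and with validation off) the parser has no content model to consult, so the same whitespace would be delivered through characters().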

Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

"Zhu, Joe" <joe....@lmco.com> wrote on 09/15/2014 09:41:33 AM:

> Michael,
> Thanks for your reply. The XSD does not allow mixed content.
> Attached are my test Java code, test XML and test XSD for your reference.
>
> Also included below are the run logs for the Xerces parser and for a Sun
> parser.
> When it runs with the Xerces parser, the whitespace is reported in
> the characters() method and nothing is reported in
> ignorableWhitespace().
> But when it runs with the Sun parser, the text content is reported
> in characters() and the whitespace is reported in the
> ignorableWhitespace() method, as expected.
> 
> Joe
> 
> ------------------------ Log for Xerces parser ---------------------------
> factory = org.apache.xerces.jaxp.SAXParserFactoryImpl@110c424
> parser = org.apache.xerces.jaxp.SAXParserImpl@1bd2664
> startElement howto
> characters = "
>   "
> startElement topic
> characters = "
>       "
> startElement title
> characters = "Java"
> endElement title
> characters = "
>       "
> ...
> 
> ---------------------- Log for Sun parser ---------------------------------
> factory = com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl@1e8a1f6
> parser = com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl@1e152c5
> startElement howto
> ignorableWhitespace = "
>   "
> startElement topic
> ignorableWhitespace = "
>       "
> startElement title
> characters = "Java"
> endElement title
> ignorableWhitespace = "
>       "
> ...
> 
> 
> -----Original Message-----
> From: Michael Glavassevich [mailto:mrgla...@ca.ibm.com] 
> Sent: Friday, September 12, 2014 9:54 AM
> To: j-users@xerces.apache.org
> Subject: EXTERNAL: Re: SAX Parser includes ignorable whitespaces in 
> the character() method
> 
> Your XML document requires a DTD with element declarations which
> specify that they contain element-only content. Without that, a SAX
> parser cannot determine which whitespace is 'ignorable'.
> 
> Thanks.
> 
> Michael Glavassevich
> XML Technologies and WAS Development
> IBM Toronto Lab
> E-mail: mrgla...@ca.ibm.com
> E-mail: mrgla...@apache.org
> 
> "Zhu, Joe" <joe....@lmco.com> wrote on 09/11/2014 07:00:11 PM:
> 
> > I am writing an app which needs to access all text content in XML.
> > According to the ContentHandler API, this could be accomplished by
> > using a validating parser and the characters() method.
> >
> > But with the Xerces parser, the characters() method can also report
> > ignorable whitespace (XML formatting whitespace). I have no way to
> > tell whether the whitespace is ignorable or is part of the XML
> > content.
> >
> > Has anybody else run into this problem? I tested with both Xerces
> > 2.9.1 and Xerces 2.11. They behave the same way.
> > 
> > Joe Zhu
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
> For additional commands, e-mail: j-users-h...@xerces.apache.org

