Hi Rob,

Whitespace outside an element is inside of another one (except for
whitespace outside of the root element). Whether this whitespace is
"ignorable" depends on your application and/or whether you have a grammar
which declares that the content of an element is only other elements.
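
To make the first point concrete, here is a minimal sketch (not taken from
the earlier thread) showing that whitespace between child elements ends up
as Text nodes inside the parent element. The class and method names are
made up for illustration, and it uses the plain JAXP DocumentBuilder only to
keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is illustrative only.
class WhitespaceNodesDemo {

    // Parses the given XML string and returns the children of the root element.
    static NodeList rootChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes();
    }

    public static void main(String[] args) throws Exception {
        // The newline after </root> is outside the root element and is not
        // represented in the DOM; the indentation around <a/> is inside <root>.
        NodeList kids = rootChildren("<root>\n  <a/>\n</root>\n");
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            System.out.println(n.getNodeType() + " " + n.getNodeName());
        }
    }
}
```

With no DTD the parser cannot tell that this whitespace is ignorable, so the
root element here has three children: a Text node, the a element and another
Text node.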

The "include-ignorable-whitespace" and "element-content-whitespace"
features behave the same way, but they only apply to DTDs. If you have no
DTD then I suggest that you use an LSParserFilter. This has come up before
on this list; you may want to take a look at the previous discussion [1]
in the archives.
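
As a rough illustration of the LSParserFilter approach, here is a sketch
that rejects every Text node consisting only of whitespace. It is not the
code from the earlier discussion: the class names and the parse helper are
invented for this example, and treating every whitespace-only Text node as
ignorable is an assumption that may not suit your documents.

```java
import java.io.StringReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

// Hypothetical demo class; the names are illustrative only.
class WhitespaceFilterDemo {

    // Rejects Text nodes that contain nothing but whitespace.
    static class WhitespaceOnlyTextFilter implements LSParserFilter {
        public short startElement(Element element) {
            return FILTER_ACCEPT; // keep every element
        }

        public short acceptNode(Node node) {
            // Assumption: any whitespace-only Text node is unwanted.
            if (node.getNodeType() == Node.TEXT_NODE
                    && node.getNodeValue().trim().isEmpty()) {
                return FILTER_REJECT;
            }
            return FILTER_ACCEPT;
        }

        public int getWhatToShow() {
            return NodeFilter.SHOW_TEXT; // only Text nodes are passed to acceptNode
        }
    }

    // Parses an XML string with an LSParser that has the filter installed.
    static Document parse(String xml) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
                (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new WhitespaceOnlyTextFilter());
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(xml));
        return parser.parse(input);
    }

    public static void main(String[] args) throws Exception {
        Document compact = parse("<r><a x=\"1\"/><b>text</b></r>");
        Document indented = parse("<r>\r\n\t<a x=\"1\"/>\n    <b>text</b>\n</r>");
        System.out.println(compact.isEqualNode(indented));
    }
}
```

Note that a filter like this also drops whitespace-only Text nodes in mixed
content (for example a single space between two inline elements), so it is
only appropriate if that whitespace is not significant to your application.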

Thanks.

[1] http://marc.info/?t=115874050200003&r=1&w=2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Rob Davis-5 <[EMAIL PROTECTED]> wrote on 12/09/2008 11:55:06 AM:

>
> I want to be able to filter any whitespace or carriage return types outside
> of xml elements.
>
> I need this to be able to successfully use W3C DOM method Node.isEqualNode()
> to compare the elements and attributes of Documents with identical elements
> and attributes but which have differing amounts of white space - e.g.
> indentation is different, tabs instead of spaces, or documents produced on
> different platforms Unix / Windows where carriage-return, line feeds vary.
>
> This relates to the thread
> http://www.nabble.com/How-to-compare-Documents--Existing-library-method-available--or-use-DOMTreeWalker--td20856968.html
>
>
> If I use isEqualNode on 2 Documents that have identical Elements and
> Attributes but which have whitespace that varies (as described above), the
> Documents are still regarded by isEqualNode as different.
>
> I have done some searching and found three (3) options:
> 1) http://apache.org/xml/features/dom/include-ignorable-whitespace feature
> setting - but this is applicable to
> javax.xml.parsers.DocumentBuilderFactory - I am using and wish to remain
> using LSParser which is produced by DOMImplementationLS which does not have
> the setFeature method.
>
> I think the problem here is that LSParser is written to comply with W3C DOM
> interfaces and DocumentBuilderFactory is JAXP interfaces: how can I connect
> the two so that I configure the LSParser via the DocumentBuilderFactory
> setFeature method?
>
>
> 2) The LSParser is configured with a DOMConfiguration instance, and there is
> an option:
>  "element-content-whitespace"
>
>     true
>         [required] (default)Keep all whitespaces in the document.
>     false
>         [optional] Discard all Text nodes that contain whitespaces in
> element content, as described in [element content whitespace]. The
> implementation is expected to use the attribute
> Text.isElementContentWhitespace to determine if a Text node should be
> discarded or not.
>
> BUT I'm not concerned with whitespace *within* the elements. I'm only
> interested in whitespace outside of elements, as explained above.
>
>
> 3) LSParserFilter interface - this seems like the most suitable solution but
> I have seen *NO* implementations of this interface searching the web. I also
> have bought O'Reilly Java and XML book edition 3 and there is no mention
> here either.
>
>
> Thoughts please on the above. Thanks.
