Hi Adrian,

You haven't said much about what work you're actually measuring (e.g. what
your SAX ContentHandler does) or how you're measuring it (e.g. whether you
run warm-up iterations so the JIT can do its optimizations before you start
timing). That said, I suspect a good chunk of what you're seeing is the cost
of processing the 68 KB schema rather than the actual validation time with
it. Take a look at the grammar caching capabilities in Xerces (i.e. load the
schema once and use it many times), in particular the JAXP 1.3 Validation
API. See the FAQ here [1] on how to use the JAXP Validation API, as well as
this one [2] on general performance.
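
As a rough illustration of "load the schema once; use it many times", here
is a minimal sketch using only standard JAXP calls (SchemaFactory, Schema,
SAXParserFactory.setSchema). The class name, the inline schema and the
"item"/"color" names are made up for the example; your real schema would be
loaded from its file, and the iteration counts are arbitrary. It also shows
a default attribute from the schema arriving in startElement().

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CachedSchemaSaxSketch {
    // Hypothetical inline schema; a real one would be read from the XSD file.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='item'><xs:complexType>"
      + "<xs:attribute name='color' type='xs:string' default='red'/>"
      + "</xs:complexType></xs:element></xs:schema>";

    // Compile the schema once and attach it to a factory that is then reused.
    static SAXParserFactory newFactory(String xsd) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setSchema(schema); // parsers from this factory share the compiled schema
        return spf;
    }

    // Parse one document and return the "color" attribute seen on <item>.
    // DefaultHandler ignores validation errors unless error() is overridden.
    static String parseColor(SAXParserFactory spf, String xml) throws Exception {
        final String[] color = new String[1];
        SAXParser parser = spf.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("item".equals(local)) {
                    color[0] = atts.getValue("color");
                }
            }
        });
        return color[0];
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory spf = newFactory(XSD); // schema cost is paid here, once
        String xml = "<item/>";

        // Warm-up iterations so the JIT has a chance to optimize before timing.
        for (int i = 0; i < 2000; i++) {
            parseColor(spf, xml);
        }

        int runs = 2000;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            parseColor(spf, xml);
        }
        long perParseMicros = (System.nanoTime() - start) / runs / 1000;

        System.out.println("color = " + parseColor(spf, xml));
        System.out.println("approx. microseconds per parse: " + perParseMicros);
    }
}
```

Note that setValidating() is left at its default of false here; when a
Schema is set on the factory, that flag is only about DTD validation.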

Thanks.

[1] http://xerces.apache.org/xerces2-j/faq-pcfp.html#faq-4
[2] http://xerces.apache.org/xerces2-j/faq-performance.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Adrian Crum <adrian.c...@yahoo.com> wrote on 04/28/2009 10:56:32 PM:

> Hello all.
>
> I'm trying to convert from DOM parsing to SAX parsing. The basic
> code I'm using is:
>
> DocumentBuilderFactory factory =
>     new org.apache.xerces.jaxp.DocumentBuilderFactoryImpl();
> factory.setValidating(validate);
> factory.setNamespaceAware(true);
> factory.setAttribute("http://xml.org/sax/features/validation", validate);
> factory.setAttribute("http://apache.org/xml/features/validation/schema", validate);
> DocumentBuilder builder = factory.newDocumentBuilder();
> Document document = builder.parse(inputStream);
>
> for DOM parsing, and
>
> SAXParserFactory spf = new org.apache.xerces.jaxp.SAXParserFactoryImpl();
> spf.setValidating(validate);
> spf.setNamespaceAware(true);
> spf.setFeature("http://apache.org/xml/features/validation/schema", validate);
> SAXParser parser = spf.newSAXParser();
> parser.parse(inputStream, handler);
>
> for SAX parsing.
>
> Using the same source XML file (that references an XSD, and is about
> 68 KB), the SAX parsing runs 10 times slower than the DOM parsing.
> If I disable the schema validation, the SAX parsing runs faster than
> the DOM parsing - but the objects that are created are missing
> default attributes specified in the XSD. (By the way, the same Java
> objects are created in both scenarios - they have two constructors:
> one for DOM and one for SAX.)
>
> Ideally, I would like to configure the SAX parser to just use the
> XSD to supply default attribute values, and not use it for validation.
>
> I've Googled and searched the Xerces website for an answer, but I
> didn't find one. I need the SAX parsing to run as fast or faster
> than the DOM parsing with validation turned on.
>
> Can anyone help me?
>
> -Adrian
