Thanks for the suggestion. I'm not sure that it will work in my case,
though: how would I know which parser to use for which incoming document?

All documents come in through one entry point in a Servlet (so it is
effectively multithreaded).


Thanks Adam.



-----Original Message-----
From: Bill Michell [mailto:[EMAIL PROTECTED] 
Sent: 10 November 2008 12:03
To: j-users@xerces.apache.org
Subject: RE: Problem validating using XMLGrammarPool with different
Grammars with no namespace

I hit this crazy bit of schema design myself.

If you create one parser per schema document, you can give each one its
own caching grammar pool, and re-use them from document to document.

Of course, things are even more fun if you are parsing from multiple
threads.

It is possible to share the cached grammar pool (which is thread safe)
amongst multiple parsers (which are not thread safe).

My code is ugly, and can certainly be adjusted for your use case:

        /**
         * @param schemaLocation
         * @param preloadGrammars
         * @return a DOMParser for the specified schema
         * @throws IOException
         * @throws XNIException
         */
        public static DOMParser domParserWithPrivateGrammarPool(
                        final URL schemaLocation, final List<URL> preloadGrammars)
                        throws XNIException, IOException {
                final XMLGrammarPool grammarPool = new XMLGrammarPoolImpl();
                if (preloadGrammars != null) {
                        preloadGrammarsIntoPool(grammarPool, preloadGrammars);
                }
                final XMLGrammarCachingConfiguration config =
                                new XMLGrammarCachingConfiguration(new SymbolTable(), grammarPool);
                config.setErrorHandler(new GenericXMLErrorHandler());
                if (schemaLocation != null) {
                        config.setFeature(JAXP_VALIDATION_FEATURE, true);
                        config.setFeature(JAXP_SCHEMA_VALIDATION_FEATURE, true);
                        config.parseGrammar(XMLConstants.W3C_XML_SCHEMA_NS_URI,
                                        schemaLocation.toString());
                } else {
                        config.setFeature(JAXP_VALIDATION_FEATURE, false);
                }
                return new org.apache.xerces.parsers.DOMParser(config);
        }

        /**
         * @param grammarPool
         * @param preloadGrammars
         */
        private static void preloadGrammarsIntoPool(
                        final XMLGrammarPool grammarPool, final List<URL> preloadGrammars) {
                final XMLGrammarPreparser preparser = new XMLGrammarPreparser();
                preparser.registerPreparser(XMLGrammarDescription.XML_DTD, null);
                preparser.registerPreparser(XMLGrammarDescription.XML_SCHEMA, null);
                preparser.setGrammarPool(grammarPool);
                for (final URL grammar : preloadGrammars) {
                        if (logger.isInfoEnabled()) {
                                logger.info("Preloading grammar for " + grammar.toExternalForm());
                        }
                        try {
                                final InputStream inputStream =
                                                new BufferedInputStream(grammar.openStream());
                                final XMLInputSource source = new XMLInputSource(
                                                null, grammar.toURI().getPath(), null, inputStream, null);
                                if (grammar.getPath().endsWith(".dtd")) {
                                        preparser.preparseGrammar(XMLGrammarDescription.XML_DTD, source);
                                } else {
                                        preparser.preparseGrammar(XMLGrammarDescription.XML_SCHEMA, source);
                                }
                        } catch (final IOException e) {
                                // Preloading is just for convenience
                                logger.warn("Error trying to preload grammar file "
                                                + grammar.toExternalForm(), e);
                        } catch (final URISyntaxException e) {
                                // Preloading is just for convenience
                                logger.warn("Error trying to preload grammar file "
                                                + grammar.toExternalForm(), e);
                        }
                }
        }
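
For comparison, the same "one cache entry per schema document, shared
between threads" idea can be sketched with the standard JAXP validation
API (javax.xml.validation) instead of the Xerces-native grammar pool.
This is a swapped-in technique, not the code above: the class name
SchemaCache and its methods are made up for illustration. It relies on
the documented contract that a compiled Schema is thread safe while a
Validator is not, so a new Validator is created for every document.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: caches compiled schemas by location, not by namespace.
final class SchemaCache {

    // Key is the schema URL as a string, so two no-namespace schemas never collide.
    private final Map<String, Schema> schemas = new ConcurrentHashMap<>();

    // Returns the compiled schema for this location, compiling it on first use.
    Schema schemaFor(final URL schemaLocation) {
        return schemas.computeIfAbsent(schemaLocation.toExternalForm(), key -> {
            try {
                // SchemaFactory is not thread safe, so use a fresh one per compilation.
                final SchemaFactory factory =
                        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                return factory.newSchema(schemaLocation);
            } catch (final SAXException e) {
                throw new IllegalArgumentException("Cannot compile schema " + key, e);
            }
        });
    }

    // Validates one document; throws SAXException if the document is invalid.
    void validate(final URL schemaLocation, final Source document)
            throws SAXException, IOException {
        // Validator is not thread safe, so create one per call.
        final Validator validator = schemaFor(schemaLocation).newValidator();
        validator.validate(document);
    }
}
```

A single SchemaCache instance could be held by the servlet and called
from any request thread. Whether it performs as well as a shared
XMLGrammarPool has not been measured here.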


-- 
Bill Michell 
Development Team Leader, Broadcast Platforms, BBC FM&T (Journalism). 

-----Original Message-----
From: Adam Retter [mailto:[EMAIL PROTECTED] 
Sent: 10 November 2008 11:23
To: j-users@xerces.apache.org
Subject: Problem validating using XMLGrammarPool with different Grammars
with no namespace

Hi Chaps,

I am having some problems with XMLGrammarPool and I would like to
confirm the cause of this.

I think that it is because I have a number of different XML documents
that use different XML Schemas, and these all use the same namespace
i.e. no namespace.

When I create a new XMLGrammarPool and validate document A with Schema 1
it works. However, when I then validate document B with Schema 2 using
the same XMLGrammarPool, it fails because it can't find the Schema. Both
document A and document B have no namespace defined.

e.g.

<a xsi:noNamespaceSchemaLocation="1.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...
</a>

<b xsi:noNamespaceSchemaLocation="2.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...
</b>


Our application needs to validate a number of different incoming XML
documents using their XML Schemas, a number of these documents all have
no namespace and use xsi:noNamespaceSchemaLocation="blah.xsd" to refer
to their Schemas.
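
To illustrate how such a hint could be read before choosing a schema or
parser, here is a minimal sketch using the standard StAX API. The class
and method names (SchemaHintReader, noNamespaceSchemaLocation) are
invented for this example and are not part of Xerces. It assumes the
attribute sits on the root element, as in the documents above.

```java
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: peeks at the root element of a document.
final class SchemaHintReader {

    // Returns the xsi:noNamespaceSchemaLocation value of the root element,
    // or null if the root element does not carry that attribute.
    static String noNamespaceSchemaLocation(final InputStream in)
            throws XMLStreamException {
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        // Incoming documents are untrusted, so do not process DTDs.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        final XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                // Stop at the first start tag, which is the root element.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getAttributeValue(
                            XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI,
                            "noNamespaceSchemaLocation");
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

Note that this consumes the start of the stream, so the caller would
need to buffer the document (for example in a byte array) in order to
parse it again for validation.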

I would like to use an XMLGrammarPool for caching Schemas to improve
performance of the system. I thought that by implementing my own
XMLGrammarPool I would be able to determine the uniqueness of an
XMLSchema by something other than its namespace, e.g. its location.
However this doesn't seem to be the case.

After some investigation with the debugger: at line 1373 of
XMLSchemaValidator in com.sun.org.apache.xerces.internal.impl.xs (Sun
JDK 1.6.0_06) I see this comment in reset(XMLComponentManager
componentManager) - 

// store the external schema locations. they are set when reset is called,
// so any other schemaLocation declaration for the same namespace will be
// effectively ignored. because we choose to take first location hint
// available for a particular namespace.



My question then is: what do I need to do to get this working? Or, if it
is currently possible, what am I doing wrong?



Thanks


Adam Retter

Landmark Information Group Ltd
5-7 Abbey Court
Eagle Way
Sowton Industrial Estate
Exeter
Devon
EX2 7HY

t: +44(0)1392 685403
w: http://www.landmarkinfo.co.uk



Registered Office: 7 Abbey Court, Eagle Way, Sowton, Exeter, Devon, EX2
7HY
Registered Number 2892803 Registered in England and Wales 

This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 

The information contained in this e-mail is confidential and may be
subject to 
legal privilege. If you are not the intended recipient, you must not
use, copy, 
distribute or disclose the e-mail or any part of its contents or take
any 
action in reliance on it. If you have received this e-mail in error,
please 
e-mail the sender by replying to this message. All reasonable
precautions have 
been taken to ensure no viruses are present in this e-mail. Landmark
Information
Group Limited cannot accept responsibility for loss or damage arising
from the 
use of this e-mail or attachments and recommend that you subject these
to 
your virus checking procedures prior to use.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


http://www.bbc.co.uk/
This e-mail (and any attachments) is confidential and may contain
personal views which are not the views of the BBC unless specifically
stated.
If you have received it in error, please delete it from your system.
Do not use, copy or disclose the information in any way nor act in
reliance on it and notify the sender immediately.
Please note that the BBC monitors e-mails sent or received.
Further communication will signify your consent to this.
                                        
