Chris,

If you're trying to avoid writing code to make this work, you may want to
consider using a more schema-centric command-line program like xjparse [1]
or jaxp.SourceValidator [2] instead of dom.Counter. With either of those
you can specify a list of schema documents to use for validation.
Additionally, xjparse provides an option for specifying an XML Catalog [3]
for resolving the schema locations.
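
If a small program does turn out to be acceptable, the "list of schema
documents" idea can be sketched with the JAXP 1.3 validation API
(javax.xml.validation, JDK 5+). This is only a minimal sketch, not how
either tool above is necessarily implemented; the class name
SchemaListValidator and its validate() helper are made up for illustration.
The schemas are handed to the validator explicitly, so validation should not
need to depend on the schemaLocation hints in the instance document.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: validates one instance against an explicit schema list.
public class SchemaListValidator {

    /** Returns null if the instance is valid, otherwise the first error message. */
    public static String validate(Source instance, Source... schemas) throws IOException {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Compile all supplied schema documents into a single Schema object.
            Schema schema = factory.newSchema(schemas);
            // With no ErrorHandler set, the first validation error is thrown.
            schema.newValidator().validate(instance);
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    // Usage: java SchemaListValidator instance.xml schema1.xsd [schema2.xsd ...]
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: SchemaListValidator instance.xml schema.xsd...");
            return;
        }
        Source[] schemas = new Source[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            // StreamSource(File) records the file's URI as system id, so relative
            // <xsd:import>/<xsd:include> locations resolve against that schema file.
            schemas[i - 1] = new StreamSource(new File(args[i]));
        }
        String error = validate(new StreamSource(new File(args[0])), schemas);
        System.out.println(error == null ? "valid" : "invalid: " + error);
    }
}
```

Run from the directory containing the instance, this could be invoked with
the instance file first and the schema files after it, using whatever paths
are correct relative to the current directory.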

Thanks.

[1] http://nwalsh.com/java/xjparse/
[2] http://xerces.apache.org/xerces2-j/samples-jaxp.html#SourceValidator
[3] http://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 10/08/2007 08:20:18 PM:

> I think there's a better way which I'll sketch (because my project
> uses a version of Xerces that is from before the DOM Level 3
> interfaces were included, so does something similar using older
> stuff).
>
> A standard XML parser may be associated with an EntityResolver, which
> supports a method taking a URI and returning an InputSource from which
> the content may be read.  Similarly, when a reference to a schema
> namespace is found in a document (instance or schema) being read by a
> validating parser, some kind of resolver will be called, if one has
> been attached to the parser, to find the definition of the schema for
> that namespace.  The namespace URI is the argument to the relevant
> method.  This resolver thing (LSResourceResolver in DOM Level 3
> Load and Save) is an interface, and your implementation may do whatever
> it wants.  Thus, you could create the resolver with some root location
> in the file system as argument, or you could use
> ClassLoader.getSystemResourceAsStream() or you  could put the schemas
> in a database and retrieve their text from there.  Your resolver could
> consult any schema locations it accumulated during its lifetime if you
> had a way to capture these, and wouldn't have to use them literally,
> but could interpret them as it wished.
>
> I suggest you consult the Xerces docs about how to install a resolver
> for schemas.
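
The resolver approach described above can be sketched against the JAXP
validation API, where the hook is org.w3c.dom.ls.LSResourceResolver. This is
only an illustration under assumptions: the class name
NamespaceSchemaResolver and the compile() helper are hypothetical, and the
in-memory map from namespace URI to schema text merely stands in for a file
system root, the classpath or a database.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver: finds schemas by namespace URI and ignores the
// (possibly wrong) relative location hint it is handed.
public class NamespaceSchemaResolver implements LSResourceResolver {

    private final Map<String, String> schemaTextByNamespace;
    private final DOMImplementationLS ls;

    public NamespaceSchemaResolver(Map<String, String> schemaTextByNamespace)
            throws Exception {
        this.schemaTextByNamespace = schemaTextByNamespace;
        // Standard DOM Level 3 bootstrap, used here only to create LSInput objects.
        this.ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
                                   String publicId, String systemId, String baseURI) {
        String text = namespaceURI == null ? null : schemaTextByNamespace.get(namespaceURI);
        if (text == null) {
            return null; // unknown namespace: let the parser use its default lookup
        }
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(text));
        input.setSystemId(systemId);
        input.setBaseURI(baseURI);
        return input;
    }

    /** Compiles mainSchema, resolving its imports through the resolver above. */
    public static Schema compile(String mainSchema, Map<String, String> byNamespace)
            throws Exception {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(new NamespaceSchemaResolver(byNamespace));
        return factory.newSchema(new StreamSource(new StringReader(mainSchema)));
    }
}
```

The design choice here is that the resolver keys on the namespace alone, so
it does not matter whether the location hint was written relative to the
instance document or relative to the importing schema.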
>
> Jeff
>
> On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
> > Michael, I'm using Xerces-J 2.9.1, I even upgraded from 2.9.0 today to
> > test any changes!
> >
> > Jeff, can you bear with me here? I think I understand you...
> >
> > Jeff Greif wrote:
> > > Maybe an example will be clearer.
> > >
> > > The instance document is, relative to some subtree of the file
> > > system, in
> > >
> > > instances/articles/doc1.xml
> > >
> > > There is a set of schemas that apply in
> > >
> > > schemas/{a,b,c,d}.xsd
> > >
> > > Suppose a.xsd imports b.xsd, and in addition, doc1.xml refers to
> > > components from nsa, the namespace of a, and nsb, the namespace of b.
> > >
> > > So there are schema locations of the form {nsa, ../../schemas/a.xsd
> > > nsb ../../schemas/b.xsd, ... }
> > >
> > > Now when the reference from doc1 -> nsb is found, the schema
> > > locations can be used to find b.xsd.
> >
> > I'm with you up to here, because the schema locations were defined in
> > doc1.xml they are relative to doc1.xml and therefore point to the
> > correct xsd files.
> >
> > > If the reference from a.xsd -> nsb is
> > > found, the schema locations will not work, because the location is
> > > incorrect relative to the location of a.xsd.
> >
> > My reference from a.xsd -> nsb is in the form
> >         <xsd:import namespace="nsb" schemaLocation="./b.xsd" />
> > This path to b.xsd is correct with respect to the a.xsd it is defined
> > in (although incorrect with respect to doc1.xml).
> >
> > However, this schema location hint is second in the queue behind the
> > one specified in doc1.xml. When Xerces tries to use the one specified
> > in doc1.xml here, it fails with File Not Found (because when relative
> > to a.xsd the doc1.xml's schema location is not valid), reports the
> > error and stops parsing, so the schema location specified here is
> > never used.
> >
> > Other parsers continue looking at the hints in schema location and find
> > the correct one specified on the <xsd:import> line. Is there any way of
> > telling Xerces to try all hints matching that namespace (in the same
> > way XMLSpy, Microsoft .NET's System.Xml and Saxonica seem to do) rather
> > than stop on the first "not found"?
> >
> > > You couldn't solve the
> > > problem by changing the schema locations to look like {nsa,
> > > ../../schemas/a.xsd nsb ./b.xsd, ... } because the doc1 -> nsb
> > > reference would fail.  However, in the first case, if the parser is
> > > caching grammars, and the reference from doc1 -> nsb has already been
> > > processed, the a.xsd -> nsb reference might not be a validation error
> > > -- the schema locations are only a hint to the parser, and if it has
> > > located and parsed the right grammar already, it can use it.
> >
> > So changing the schemaLocation works in my case because in processing
> > a.xsd the parser finds b.xsd (via the schemaLocation relative to a.xsd)
> > and caches it, which means it can use the cached copy in doc1.xml.
> >
> > > These are the problems with using relative URLs for the schema
> > > locations, except in certain special cases.  For example, if the
> > > instance doc is
> > >
> > > instances/doc1.xml
> > >
> > > and the schemas are in
> > >
> > > schemas/{a,b,c,...}.xsd
> > >
> > > Then these schema locations:  {nsa ../schemas/a.xsd nsb
> > > ../schemas/b.xsd ...} will work successfully, but only because the
> > > paths work whether the reference is from the instance doc or a schema
> > > doc.
> >
> > Ideally I'd like to specify a "try all schema locations before error"
> > or "do not stop on file not found error" property, since there will
> > *always* be one that works when used relative to the current location.
> > Is there a way of doing this?
> >
> > I'm guessing there is no "schema locations per file" property to turn
> > off the global cache of schema location and switch to a per-file cache?
> > Thus forcing Xerces to use the hint found at the current location.
> >
> > Maybe the easiest way to solve my problem is to re-jig my document
> > locations so that the same relative path can be used to locate each of
> > the schemas? Not ideal, mind, since I've spent a long time developing
> > the inter-schema links to ensure they can always be linked together and
> > I'd like to use that investment in some way. I can't help but think
> > that moving the files so the relative paths fit for both scenarios is
> > more of a by-product than something implemented by design.
> >
> > I'm under some commercial pressure here to switch to the method that
> > works with the system that the customers use (XMLSpy et al) but I'd
> > really like the same examples to work in Xerces-J, we've been extolling
> > the virtues of XML and XMLSchema as the "common language" to unify our
> > industry's data exchange and it'd look bad to have to change the
> > examples we are producing to make them work in different parsers!
> >
> > Once again, that ended up a lot longer than I expected and I hope it
> > makes sense, thanks for your time and patience.
> > Chris
> >
> > > Jeff
> > >
> > >
> > >
> > > On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
> > >> Jeff.
> > >>
> > >> My comments inline.
> > >>
> > >> Chris
> > >>
> > >> Jeff Greif wrote:
> > >>> When a relative URL is used for the location of an imported schema,
> > >>> it is supposed to be relative to the URL of the importing document.
> > >>> So if your instance document directly references the namespaces of
> > >>> one or more schemas for validation, those URLs are interpreted
> > >>> relative to the location of the instance document.  Probably some of
> > >>> the schemas
> > >>>
> > >> So my instance document _should_ have relative paths to the
> > >> individual schemas in its schemaLocation?
> > >> Does the fact that Xerces is "changing" the base path to that of the
> > >> first specified schema for each subsequent schema constitute a bug?
> > >> Should I log this somewhere more formal?
> > >>> contain <xsd:import> elements; those would require URLs relative to
> > >>> the schema importing them.
> > >>>
> > >> Each of those schemas then further includes others using
> > >> <xsd:import> and <xsd:include> (for example core.xsd actually
> > >> includes about 30 or 40 smaller schemas from ./Core/schemaname.xsd)
> > >> and this works as I'd expected it to.
> > >>> Some of the schemas might be referenced both in the instance
> > >>> document and in imports from other schemas referenced in the
> > >>> instance document.  I'm not sure there's a specification of where
> > >>> they must be found if relative URLs are used.  This may depend on
> > >>> the ordering of processing of those references by the
> > >>> parser/validator.
> > >>>
> > >> When that is the case I am 100% sure that both the instance document
> > >> and the "sub schemas" refer to the exact same document, so it
> > >> shouldn't matter which of the references Xerces is using; it will
> > >> resolve to the same schema anyway.
> > >>> There is a section in the XML Schema 1.0 spec addressing this
> > >>> issue.
> > >>>
> > >>> Jeff
> > >>>
> > >>>
> > >>>
> > >>> On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> Prashant,
> > >>>>
> > >>>> Changing the working dir of the JVM doesn't seem to make any
> > >>>> difference. Using dom.Counter from the Xerces-J samples, the parser
> > >>>> still seems to change the working dir first to wherever the xml
> > >>>> file is located, then to wherever the first xsd file specified is
> > >>>> located, and needs all subsequent locations to be relative to that.
> > >>>>
> > >>>> Absolute paths work fine but I'm trying to include these files
> > >>>> bundled in with a set of schemas as examples of how to use the
> > >>>> format, hence I don't know where my users will unzip the archives
> > >>>> to (C:\Users\username, c:\projects\projectname\,
> > >>>> /usr/local/projects, /home etc) so I can't set absolute paths in my
> > >>>> distributed files.
> > >>>>
> > >>>> I was hoping to not need to actually write my own parsing program,
> > >>>> just use the output from dom.Counter and a schemaLocation hint
> > >>>> (which fits my needs perfectly) since I'm not really a Java
> > >>>> developer.
> > >>>>
> > >>>> I saw that jEdit page but I'd rather make my schemas validate
> > >>>> against a standard Xerces installation than modify my jEdit
> > >>>> installation to make them work; I feel this would be more useful
> > >>>> for my users.
> > >>>>
> > >>>> Chris
> > >>>>
> > >>>>
> > >>>> Prashant Reddy wrote:
> > >>>>
> > >>>>> I think the relative paths you have specified in the
> > >>>>> schemaLocation will be resolved against the "working dir". The
> > >>>>> working dir is usually the directory at the cmd prompt when you
> > >>>>> launched the JVM.
> > >>>>>
> > >>>>> Have you tried giving absolute paths to the XSD files?
> > >>>>>
> > >>>>> A more portable solution to finding schema files locally is to
> > >>>>> use EntityResolver[1].
> > >>>>>
> > >>>>> If you are using JAXP 1.3/ JDK 1.5+ see :
> > >>>>> https://jaxp.dev.java.net/article/jaxp-1_3-article.html
> > >>>>>
> > >>>>>
> > >>>>> [1]:http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/EntityResolver.html
> > >>>>>
> > >>>>> Hope this helps.
> > >>>>> -Prashant
> > >>>>>
> > >>>>>
> > >>>>> On Mon, 2007-10-08 at 13:17 +0100, Chris Bray wrote:
> > >>>>>
> > >>>>>
> > >>>>>> All.
> > >>>>>>
> > >>>>>> Please go easy on me as I'm a newbie here; if this is a really
> > >>>>>> obvious problem I'm really sorry!
> > >>>>>> I've been using Xerces to validate XML for a while now, and I've
> > >>>>>> found a troublesome scenario.
> > >>>>>>
> > >>>>>> In the top of my xml files I have a line specifying the location
> > >>>>>> of the external schemas required for this xml file like so:
> > >>>>>>
> > >>>>>>     xsi:schemaLocation="http://www.diggsml.org/0.9.2
> > >>>>>> ../Schemas/diggs/core.xsd http://www.diggsml.org/0.9.2/geotechnical
> > >>>>>> ../Schemas/diggs/geotechnical.xsd "
> > >>>>>>
> > >>>>>> In this case specifying two namespaces and their associated
> > >>>>>> schema files (files exist and paths are correct).
> > >>>>>>
> > >>>>>> However this doesn't work using Xerces. I am required to change
> > >>>>>> my schemaLocation attribute so that the first path points to its
> > >>>>>> xsd, then subsequent entries are relative to that first xsd, not
> > >>>>>> to the current file, like so:
> > >>>>>>
> > >>>>>>     xsi:schemaLocation=" http://www.diggsml.org/0.9.2
> > >>>>>> ../Schemas/diggs/core.xsd http://www.diggsml.org/0.9.2/geotechnical
> > >>>>>> ../geotechnical.xsd "
> > >>>>>>
> > >>>>>> Is there any way I can change this to work like the first
> > >>>>>> example? Other parsers (XMLSpy and Stylus Studio in particular)
> > >>>>>> require the first syntax, all paths relative to current doc, which
> > >>>>>> I believe to be correct behaviour. I don't know how to build
> > >>>>>> Xerces-J from source to fix(?) this myself but I'd be willing to
> > >>>>>> try if anyone can help me get it building.
> > >>>>>>
> > >>>>>> Since my customers are all using XMLSpy etc I'm having to produce
> > >>>>>> my example files in the earlier syntax, stopping me from using
> > >>>>>> Xerces to validate them.
> > >>>>>>
> > >>>>>> As the biggest advocate of Free/OpenSource software in our group
> > >>>>>> (jEdit with Xerces plugin in particular) I really don't want to
> > >>>>>> have to change to use XMLSpy or Stylus Studio but this is quite
> > >>>>>> awkward for me!
> > >>>>>>
> > >>>>>> That ended up being a longer mail than I'd expected! I hope you
> > >>>>>> can help; if there's any more information you need (or a small set
> > >>>>>> of sample files) let me know.
> > >>>>>>
> > >>>>>>
> > >>>>>> Chris Bray
> > >>>>>> Software Engineer (DIGGS Project)
> > >>>>>> Keynetix Ltd.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
> >
>

