Re: xsi:schemaLocation in XML files

Chris Bray Wed, 10 Oct 2007 01:48:13 -0700

All.

That's exactly what I decided to do. My URIs all point to canonical schema locations that will eventually be URLs referring to file locations on the server; however, they don't exist yet since the project has not been released to the public.

Therefore my testers can use a catalog (I'll provide them one they can just copy and paste different file locations into) to map those canonical URIs to real files on their computers, thus allowing them to test.

Once the project is live the schemas will be available in these locations and the catalog will only be required to avoid unnecessary server trips, and anonymous users will be able to use the internet-based schemas.

The only downside is that I now have to write a bit of documentation on how to modify the catalog to point to their local copies, and how to get it working in their editor/parser (everything from jEdit and Xerces to XMLSpy, Oxygen and Stylus Studio), but there are some good sites already that I can refer to.


Thanks everyone for your help :)

Chris Bray


Jacob Kjome wrote:

Normally, you would specify a canonical schema location. This might just be the name of the file or some official qualified name. Usually, a URL is used, which sometimes actually points to a location where the file can be downloaded. By hosting the file at a live URL, users can use this canonical schema location and obtain the schema as long as they have an Internet connection. On the other hand, you can also provide a catalog to avoid unnecessary trips to the server and avoid the requirement to be connected to the Internet. So, you might do...


<uri name="http://www.myhost.com/schemas/a.xsd"
     uri="file:///c:/svn/project/trunk/source/schemas/a.xsd"/>

If you always plan on using the catalog, http://www.myhost.com/schemas/a.xsd doesn't even have to be a real live host. It can be treated merely as a canonical schema location identifier for the catalog to key upon.
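
For anyone wiring this up in code rather than through a catalog file, the same canonical-to-local mapping can be sketched with a SAX EntityResolver. This is only a minimal illustration, not what xjparse or the jEdit plugin do internally: the class name CanonicalSchemaResolver and its map() method are made up for the example, and whether a given parser hands schema locations to the resolver in exactly this form is an assumption worth checking against the Xerces docs.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: maps canonical schema locations to local copies,
// playing the role of the <uri name="..." uri="..."/> catalog entry above.
class CanonicalSchemaResolver implements EntityResolver {
    private final Map<String, String> localCopies = new HashMap<>();

    // Register one canonical location and the local URI that should replace it.
    public void map(String canonicalUri, String localUri) {
        localCopies.put(canonicalUri, localUri);
    }

    // Returning null tells the parser to fall back to its default behaviour,
    // i.e. to open the system identifier itself.
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = localCopies.get(systemId);
        return local == null ? null : new InputSource(local);
    }
}
```

An instance of this would be attached with XMLReader.setEntityResolver() or DocumentBuilder.setEntityResolver() before parsing. Unlike a catalog file, the mappings here live in code, so testers could not simply edit a text file to point at their own copies.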


Jake

At 06:08 AM 10/9/2007, you wrote:
>
>It looks like xjParse does what I need by specifying -S a.xsd -S b.xsd
>etc on the command line, so I do still have a "no code" solution :)
>
>The Xerces plugin for jEdit does also have the facility to import
>catalog files, would I be right in assuming I can write a catalog file,
>use that in jEdit, and with xjParse?
>
>Now I'm really confused with catalogs though!
>
>If my catalog has a
>    <uri name="a.xsd"
>uri="file:///c:/svn/project/trunk/source/schemas/a.xsd"/>
>
>does it also need a
>    <uri name="../a.xsd"
>uri="file:///c:/svn/project/trunk/source/schemas/a.xsd"/>
>
>for when the files included by b.xsd also include a? The relative path
>is correct but I don't know if I need an entry in the catalog?
>
>Michael Glavassevich wrote:
>> Chris,
>>

>> If you're trying to avoid writing code to make this work you may want to
>> consider using a more schema centric command-line program like xjparse [1]
>> or jaxp.SourceValidator [2] instead of dom.Counter. With either of those
>> you can specify a list of schema documents to use for validation.
>> Additionally xjparse provides an option for specifying an XML Catalog [3]
>> for resolving the schema locations.
>>
>> Thanks.
>>
>> [1] http://nwalsh.com/java/xjparse/
>> [2] http://xerces.apache.org/xerces2-j/samples-jaxp.html#SourceValidator
>> [3] http://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html

>>
>> Michael Glavassevich
>> XML Parser Development
>> IBM Toronto Lab
>> E-mail: [EMAIL PROTECTED]
>> E-mail: [EMAIL PROTECTED]
>>
>> [EMAIL PROTECTED] wrote on 10/08/2007 08:20:18 PM:
>>
>>
>>> I think there's a better way which I'll sketch (because my project
>>> uses a version of Xerces that is from before the DOM Level 3
>>> interfaces were included, so does something similar using older
>>> stuff).
>>>
>>> A standard XML parser may be associated with an EntityResolver, which
>>> supports a method taking a URI and returning an InputSource from which
>>> the content may be read.  Similarly, when a reference to a schema
>>> namespace is found in a document (instance or schema) being read by a
>>> validating parser, some kind of resolver will be called, if one has
>>> been attached to the parser, to find the definition of the schema for
>>> that namespace.  The namespace URI is the argument to the relevant
>>> method.  This resolver thing (might be called LSResolver in the DOM
>>> Level 3 L&S) is an interface, and your implementation may do whatever
>>> it wants. Thus, you could create the resolver with some root location
>>> in the file system as argument, or you could use
>>> ClassLoader.getSystemResourceAsStream() or you could put the schemas
>>> in a database and retrieve their text from there. Your resolver could
>>> consult any schema locations it accumulated during its lifetime if you
>>> had a way to capture these, and wouldn't have to use them literally,
>>> but could interpret them as it wished.
>>>
>>> I suggest you consult the Xerces docs about how to install a resolver
>>> for schemas.
>>>
>>> Jeff
>>>
>>> On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
>>>

>>>> Michael, I'm using Xerces-J 2.9.1, I even upgraded from 2.9.0 today to
>>>> test any changes!
>>>>
>>>> Jeff, can you bear with me here I think I understand you...
>>>>
>>>> Jeff Greif wrote:
>>>>
>>>>> Maybe an example will be clearer.
>>>>>
>>>>> The instance document is, relative to some subtree of the file system, in
>>>>>
>>>>> instances/articles/doc1.xml
>>>>>
>>>>> There is a set of schemas that apply in
>>>>>
>>>>> schemas/{a,b,c,d}.xsd
>>>>>
>>>>> Suppose a.xsd imports b.xsd, and in addition, doc1.xml refers to
>>>>> components from nsa, the namespace of a, and nsb, the namespace of b.
>>>>>
>>>>> So there are schema locations of the form {nsa, ../../schemas/a.xsd
>>>>> nsb ../../schemas/b.xsd, ... }
>>>>>
>>>>> Now when the reference from doc1 -> nsb is found, the schema locations
>>>>> can be used to find b.xsd.
>>>>>

>>>> I'm with you up to here, because the schema locations were defined in
>>>> doc1.xml they are relative to doc1.xml and therefore point to the
>>>> correct xsd files.
>>>>
>>>>> If the reference from a.xsd -> nsb is
>>>>> found, the schema locations will not work, because the location is
>>>>> incorrect relative to the location of a.xsd.
>>>>>
>>>> My reference from a.xsd -> nsb is in the form
>>>>         <xsd:import namespace="nsb" schemaLocation="./b.xsd" />
>>>> This path to b.xsd is correct with respect to the a.xsd it is defined in
>>>> (although incorrect with respect to doc1.xml).
>>>>
>>>> However this schema location hint is second in the queue behind the one
>>>> specified in doc1.xml. When Xerces tries to use the one specified in
>>>> doc1.xml here it fails with File Not Found (because when relative to
>>>> a.xsd the doc1.xml's schema location is not valid), reports the error
>>>> and stops parsing, so the schema location specified here is never used.
>>>>
>>>> Other parsers continue looking at the hints in schema location and find
>>>> the correct one specified on the <xsd:import> line. Is there any way of
>>>> telling Xerces to try all hints matching that namespace (in the same way
>>>> XMLSpy, Microsoft .NET's System.Xml and Saxonica seem to do) rather than
>>>> stop on the first "not found"?
>>>>
>>>>> You couldn't solve the
>>>>> problem by changing the schema locations to look like {nsa,
>>>>> ../../schemas/a.xsd nsb ./b.xsd, ... } because the doc1 -> nsb
>>>>> reference would fail.  However, in the first case, if the parser is
>>>>> caching grammars, and the reference from doc1 -> nsb has already been
>>>>> processed, the a.xsd -> nsb reference might not be a validation error
>>>>> -- the schema locations are only a hint to the parser, and if it has
>>>>> located and parsed the right grammar already, it can use it.
>>>>>

>>>> So changing the schemaLocation works in my case because in processing
>>>> a.xsd the parser finds b.xsd (via the schemaLocation relative to a.xsd)
>>>> and caches it, therefore meaning it can use the cached copy in doc1.xml.
>>
>>>>> These are the problems with using relative URLs for the schema
>>>>> locations, except in certain special cases.  For example, if the
>>>>> instance doc is
>>>>>
>>>>> instances/doc1.xml
>>>>>
>>>>> and the schemas are in
>>>>>
>>>>> schemas/{a,b,c,...}.xsd
>>>>>
>>>>> Then these schema locations:  {nsa ../schemas/a.xsd nsb
>>>>> ../schemas/b.xsd ...} will work successfully, but only because the
>>>>> paths work whether the reference is from the instance doc or a schema
>>>>> doc.
>>>>>

>>>> Ideally I'd like to specify a "try all schema locations before error" or
>>>> "do not stop on file not found error" property since there will *always*
>>>> be one that works when used relative to the current location, is there a
>>>> way of doing this?
>>>>
>>>> I'm guessing there is no "schema locations per file" property to turn
>>>> off the global cache of schema location and switch to a per-file cache?
>>>> Thus forcing Xerces to use the hint found at the current location.
>>>>
>>>> Maybe the easiest way to solve my problem is to re-jig my document
>>>> locations so that the same relative path can be used to locate each of
>>>> the schemas? Not ideal mind since I've spent a long time developing the
>>>> inter-schema links to ensure they can always be linked together and I'd
>>>> like to use that investment in some way and I can't help but think that
>>>> moving the files so the relative paths fit for both scenarios is more of
>>>> a by-product than something implemented by design.
>>>>
>>>> I'm under some commercial pressure here to switch to the method that
>>>> works with the system that the customers use (XMLSpy et al) but I'd
>>>> really like the same examples to work in Xerces-J, we've been extolling
>>>> the virtues of XML and XMLSchema as the "common language" to unify our
>>>> industry's data exchange and it'd look bad to have to change the
>>>> examples we are producing to make them work in different parsers!
>>>>
>>>> Once again, that ended up a lot longer than I expected and I hope it
>>>> makes sense, thanks for your time and patience.
>>>> Chris
>>>>
>>>>
>>>>> Jeff
>>>>>
>>>>>
>>>>>
>>>>> On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Jeff.
>>>>>>
>>>>>> My comments inline.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> Jeff Greif wrote:
>>>>>>

>>>>>>> When a relative URL is used for the location of an imported schema, it
>>>>>>> is supposed to be relative to the URL of the importing document.  So
>>>>>>> if your instance document directly references the namespaces of one or
>>>>>>> more schemas for validation, those URLs are interpreted relative to
>>>>>>> the location of the instance document.  Probably some of the schemas
>>
>>>>>> So my instance document _should_ have relative paths to the individual
>>>>>> schemas in its schemaLocation?
>>>>>> Does the fact that Xerces is "changing" the base path to that of the
>>>>>> first specified schema for each subsequent schema constitute a bug?
>>>>>> Should I log this somewhere more formal?
>>>>>>

>>>>>>> contain <xsd:import> elements; those would require URLs relative to
>>>>>>> the schema importing them.
>>>>>>>
>>>>>>>
>>>>>> Each of those schemas then further includes others using <xsd:import>
>>>>>> and <xsd:include> (for example core.xsd actually includes about 30 or 40
>>>>>> smaller schemas from ./Core/schemaname.xsd) and this works as I'd
>>>>>> expected it to.
>>>>>>
>>>>>>> Some of the schemas might be referenced both in the instance document
>>>>>>> and in imports from other schemas referenced in the instance document.
>>>>>>> I'm not sure there's a specification of where they must be found if
>>>>>>> relative URLs are used.  This may depend on the ordering of processing
>>>>>>> of those references by the parser/validator.
>>>>>>>
>>>>>>>

>>>>>> When that is the case I am 100% sure that both the instance document and
>>>>>> the "sub schemas" refer to the exact same document, so it shouldn't
>>>>>> matter which of the references Xerces is using, it will resolve to the
>>>>>> same schema anyway.
>>>>>>
>>>>>>> There is a section in the XML Schema 1.0 spec addressing this issue.
>>
>>>>>>> Jeff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/8/07, Chris Bray <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Prashant,
>>>>>>>>
>>>>>>>> Changing the working dir of the JVM doesn't seem to make any difference,
>>>>>>>> using dom.Counter from the Xerces-J samples the parser still seems to
>>>>>>>> change the working dir first to wherever the xml file is located, then
>>>>>>>> to wherever the first xsd file specified is located and needs all
>>>>>>>> subsequent locations to be relative to that.
>>>>>>>>
>>>>>>>> Absolute paths work fine but I'm trying to include these files bundled
>>>>>>>> in with a set of schema as examples of how to use the format, hence I
>>>>>>>> don't know where my users will unzip the archives to (C:\Users\username,
>>>>>>>> c:\projects\projectname\, /usr/local/projects, /home etc) so I can't set
>>>>>>>> absolute paths in my distributed files.
>>>>>>>>

>>>>>>>> I was hoping to not need to actually write my own parsing program, just
>>>>>>>> use the output from dom.Counter and a schemaLocation hint (which fits my
>>>>>>>> needs perfectly) since I'm not really a Java developer.
>>>>>>>>
>>>>>>>> I saw that jEdit page but I'd rather make my schemas validate against a
>>>>>>>> standard Xerces installation than modify my jEdit installation to make
>>>>>>>> them work, I feel this would be more useful for my users.
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>> Prashant Reddy wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> I think the relative paths you have specified in the schemaLocation will
>>>>>>>>> be resolved against the "working dir". The working dir is usually the
>>>>>>>>> directory at the cmd prompt when you launched the JVM.
>>>>>>>>>
>>>>>>>>> Have you tried giving absolute path to the XSD files ?
>>>>>>>>>
>>>>>>>>> A more portable solution to finding schema files locally is to use
>>>>>>>>> EntityResolver[1].
>>>>>>>>>
>>>>>>>>> If you are using JAXP 1.3/ JDK 1.5+ see :
>>>>>>>>> https://jaxp.dev.java.net/article/jaxp-1_3-article.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1]: http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/EntityResolver.html
>>>>>>>>>
>>>
>>>>>>>>> Hope this helps.
>>>>>>>>> -Prashant
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, 2007-10-08 at 13:17 +0100, Chris Bray wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> All.
>>>>>>>>>>
>>>>>>>>>> Please go easy on me as I'm a newbie here, if this is a really obvious
>>>>>>>>>> problem I'm really sorry!
>>>>>>>>>> I've been using Xerces to validate XML for a while now, and I've found a
>>>>>>>>>> troublesome scenario.
>>>>>>>>>>
>>>>>>>>>> In the top of my xml files I have a line specifying the location of the
>>>>>>>>>> external schemas required for this xml file like so:
>>>>>>>>>>
>>>>>>>>>>     xsi:schemaLocation="http://www.diggsml.org/0.9.2 ../Schemas/diggs/core.xsd
>>>>>>>>>>         http://www.diggsml.org/0.9.2/geotechnical ../Schemas/diggs/geotechnical.xsd"
>>>>>>>>>>
>>>>>>>>>> In this case specifying two namespaces and their associated schema files
>>>>>>>>>> (files exist and paths are correct).
>>>>>>>>>>

>>>>>>>>>> However this doesn't work using Xerces. I am required to change my
>>>>>>>>>> schemaLocation attribute so that the first path points to its xsd, then
>>>>>>>>>> subsequent entries are relative to that first xsd, not to the current
>>>>>>>>>> file, like so:
>>>>>>>>>>
>>>>>>>>>>     xsi:schemaLocation="http://www.diggsml.org/0.9.2 ../Schemas/diggs/core.xsd
>>>>>>>>>>         http://www.diggsml.org/0.9.2/geotechnical ../geotechnical.xsd"
>>>>>>>>>>
>>>>>>>>>> Is there any way I can change this to work like the first example, as
>>>>>>>>>> other parsers (XMLSpy and Stylus Studio in particular) require the first
>>>>>>>>>> syntax, all paths relative to current doc, what I believe to be correct
>>>>>>>>>> behaviour. I don't know how to build Xerces-J from source to fix(?) this
>>>>>>>>>> myself but I'd be willing to try if anyone can help me get it building.
>>>>>>>>>> Since my customers are all using XMLSpy etc I'm having to produce my
>>>>>>>>>> example files in the earlier syntax, stopping me from using Xerces to
>>>>>>>>>> validate them.
>>>>>>>>>>
>>>>>>>>>> As the biggest advocate of Free/OpenSource software in our group (jEdit
>>>>>>>>>> with Xerces plugin in particular) I really don't want to have to change
>>>>>>>>>> to use XMLSpy or Stylus Studio but this is quite awkward for me!
>>>>>>>>>>
>>>>>>>>>> That ended up being a longer mail than I'd expected! I hope you can
>>>>>>>>>> help, if there's any more information you need (or a small set of sample
>>>>>>>>>> files) let me know.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Chris Bray
>>>>>>>>>> Software Engineer (DIGGS Project)
>>>>>>>>>> Keynetix Lt.



