Hi Mauro,

Mauro Molinari <[EMAIL PROTECTED]> wrote on 06/04/2008 04:36:42 AM:

> Hi Michael,
> thank you for your reply. In the end, I found a "solution" to my problem.
> First of all, I had to call DocumentBuilder.parse(InputStream, String),
> rather than DocumentBuilder.parse(InputStream) in order to make the
> parser find schemas, DTDs etc. referenced by the XML file, otherwise it
> searched for them using my IDE working directory as the base for
> resolving relative paths...

You should always provide a base URI, since the parser may need it to
resolve relative URIs. If none is specified, Xerces will fall back to using
the current working directory (the value of the system property user.dir)
as the base URI for resolution. Even better is to let the parser open the
InputStream itself (e.g. using DocumentBuilder.parse(String)), which gives
it an opportunity to refresh the base URI if it was redirected while
opening the URLConnection. This is particularly relevant for HTTP URLs.

> Once I understood this, I found that the following code can do the job:
>
> DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> dbf.setNamespaceAware(true);
> dbf.setValidating(true);
> dbf.setFeature("http://apache.org/xml/features/validation/schema", true);
> DocumentBuilder db = dbf.newDocumentBuilder();
> Document document = db.parse(xmlURL.openStream(),
>     xmlURL.toURI().toString());
>
> In this way I don't have to specify a schema (it is automatically taken
> and parsed by Xerces thanks to the schemaLocation attribute in the XML),
> but I lose the abstraction from the underlying parser implementation by
> setting the Xerces feature needed to make schema validation (and
> getElementById()) work. Please remember that I'm using standard JSE 5
> APIs to do the XML parsing.

Right. That's a Xerces-specific feature, but you could have used JAXP
(e.g. SchemaFactory) to accomplish the same thing without tying yourself
to the Xerces implementation.

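As a rough sketch of the JAXP route mentioned above (under the assumptions
stated here, not a drop-in for your code): SchemaFactory.newSchema() with no
arguments returns a Schema that locates the actual schemas from the
schemaLocation hints in the document, and passing it to
DocumentBuilderFactory.setSchema() turns on schema validation without any
Xerces-specific feature. The URI is handed to the parser as a String, so the
parser opens the stream itself and knows the base URI. The class name
JaxpSchemaParse, the helper writeDemoFiles() and the items.xsd / items.xml
files are invented purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class JaxpSchemaParse {

    /** Parses the document at systemId, validating against the schemas it references. */
    public static Document parse(String systemId) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        // No-argument newSchema(): schemas are found through the
        // schemaLocation / noNamespaceSchemaLocation hints in the document.
        dbf.setSchema(sf.newSchema());

        DocumentBuilder db = dbf.newDocumentBuilder();
        // Without an ErrorHandler, validation errors are only reported, not thrown.
        db.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        // parse(String) lets the parser open the resource and track the base URI.
        return db.parse(systemId);
    }

    /** Hypothetical helper: writes a tiny schema and document, returns the document's URI. */
    public static String writeDemoFiles() throws Exception {
        Path dir = Files.createTempDirectory("jaxp-demo");
        String xsd =
              "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='items'><xs:complexType><xs:sequence>"
            + "<xs:element name='item' maxOccurs='unbounded'><xs:complexType>"
            + "<xs:attribute name='id' type='xs:ID' use='required'/>"
            + "</xs:complexType></xs:element>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";
        // The schema is referenced by a relative location, resolved against the base URI.
        String xml =
              "<items xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
            + " xsi:noNamespaceSchemaLocation='items.xsd'><item id='a1'/></items>";
        Files.write(dir.resolve("items.xsd"), xsd.getBytes(StandardCharsets.UTF_8));
        Path doc = dir.resolve("items.xml");
        Files.write(doc, xml.getBytes(StandardCharsets.UTF_8));
        return doc.toUri().toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(writeDemoFiles());
        System.out.println("found by id: " + (document.getElementById("a1") != null));
    }
}
```

Whether getElementById() then sees the xs:ID attributes depends on the
parser passing the schema type information on to the DOM. I would expect
Xerces to do that, but treat it as something to verify on your side. The
demo helper uses java.nio.file, so it needs a newer Java than 5; the parse()
method itself only relies on the JAXP 1.3 APIs.
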
> So, I then decided to write a DTD for the XML and make the parser use it
> to enable getElementById(), although I don't like the solution so much
> (actually, having the schema, in this case the DTD is redundant, I use
> it only to make the parser work as expected, without the need of setting
> any Xerces-specific feature on the document builder factory).

There's no reason you couldn't have done this with schema too. As I said
above, you don't need to set any Xerces-specific features.

> The resulting code is now:
>
> DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> dbf.setNamespaceAware(true);
> DocumentBuilder db = dbf.newDocumentBuilder();
> Document document = db.parse(xmlURL.openStream(),
>     xmlURL.toURI().toString());
>
> that actually seems more implementation-independent to me.
>
> Thanks again for your help!
>
> --
> Mauro Molinari
> Software Developer
> [EMAIL PROTECTED]

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]