Hi all,

I'm parsing an XHTML document using Xerces.
This is the code that I'm using to parse the document:

import java.io.StringReader;
import org.apache.html.dom.HTMLDocumentImpl;
import org.apache.xerces.parsers.DOMParser;
import org.xml.sax.InputSource;

String xhtmlSource = "<the xhtml source>";
DOMParser parser = new DOMParser();
parser.setProperty("http://apache.org/xml/properties/dom/document-class-name", "org.apache.html.dom.HTMLDocumentImpl");
InputSource iSource = new InputSource(new StringReader(xhtmlSource));
parser.parse(iSource);
HTMLDocumentImpl document = (HTMLDocumentImpl) parser.getDocument();

The parsing seems to work, except that when I query the HTMLDocumentImpl, most nodes are of type ElementNSImpl rather than the actual Apache HTML DOM implementation classes. (For example, I can't even call document.getBody(): it returns null. Instead I have to walk the XML DOM looking for the 'body' node.)

This behaviour is described in NekoHTML's 'Requirements and Limitations' section at http://people.apache.org/~andyc/neko/doc/html/index.html

I'm not using NekoHTML, and I'm currently using Xerces 2.8.0. I did try various other versions of Xerces, but to no avail.

I'm having to carry on working with plain nodes, but I'd much rather work with the HTML DOM.
Can anyone give any hints?
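
For reference, the plain-node workaround looks roughly like the sketch below. It is only an illustration of looking up the 'body' element through the generic W3C DOM API, and it uses the JDK's standard JAXP parser rather than the Xerces DOMParser setup above so that it runs on its own. The class name BodyFinder and the helpers parse and findBody are made up for this example, and the sample markup is a placeholder.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class BodyFinder {
    // The standard XHTML namespace URI.
    public static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Parse an XHTML string into a namespace-aware generic DOM document.
    public static Document parse(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    // Return the first 'body' element, or null if there is none.
    public static Element findBody(Document doc) {
        // Try the XHTML namespace first.
        NodeList bodies = doc.getElementsByTagNameNS(XHTML_NS, "body");
        if (bodies.getLength() == 0) {
            // Fall back to a lookup by tag name for documents without a namespace.
            bodies = doc.getElementsByTagName("body");
        }
        return bodies.getLength() > 0 ? (Element) bodies.item(0) : null;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder markup; no DOCTYPE, so no DTD needs to be fetched.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>t</title></head>"
                + "<body><p>Hello</p></body></html>";
        Element body = findBody(parse(xhtml));
        System.out.println(body.getTextContent());
    }
}
```

This gives me an org.w3c.dom.Element, not an HTMLBodyElement, which is exactly the limitation I'd like to get around.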

Thanks in advance.

Daniel
