Hi Daniel,

The HTML DOM implementation in Xerces is ancient. It implements DOM Level
1 HTML [1][2], which was intended for use with HTML 4.0 documents only. It
does not recognize XHTML [3].

[1] http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-html.html
[2] http://www.w3.org/TR/2003/REC-DOM-Level-2-HTML-20030109/html.html#ID-5353782642
[3] http://issues.apache.org/jira/browse/XERCESJ-890
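
Because the HTML DOM classes are not created for XHTML input, one option is to stay with the namespace-aware core DOM and look up elements by the XHTML namespace instead of calling getBody(). The following is only a minimal sketch of that idea, assuming the JAXP API that ships with the JDK; the class name XhtmlBodyLookup and the method findBody are hypothetical names chosen for illustration, not part of Xerces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces: finds the XHTML <body> element
// through the core DOM API rather than through HTMLDocument.getBody().
class XhtmlBodyLookup {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    public static Element findBody(String xhtmlSource) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness is required for getElementsByTagNameNS to match.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document =
            builder.parse(new InputSource(new StringReader(xhtmlSource)));

        // Match <body> only when it is in the XHTML namespace.
        NodeList bodies = document.getElementsByTagNameNS(XHTML_NS, "body");
        return bodies.getLength() > 0 ? (Element) bodies.item(0) : null;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input; it has no DOCTYPE, so no DTD is fetched.
        String sample =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>t</title></head>"
            + "<body><p>Hello</p></body></html>";
        Element body = findBody(sample);
        System.out.println(body.getLocalName());
    }
}
```

The lookup returns null when the document has no body element in the XHTML namespace, so callers should check for that case.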

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Daniel Farinha <[EMAIL PROTECTED]> wrote on 03/27/2006 09:56:03 AM:

> Hi all,
> 
> I'm parsing an XHTML document using Xerces.
> This is the code that I'm using to parse the document:
> 
> String xhtmlSource = "<the xhtml source>";
> DOMParser parser = new DOMParser();
> parser.setProperty(
>     "http://apache.org/xml/properties/dom/document-class-name",
>     "org.apache.html.dom.HTMLDocumentImpl");
> InputSource iSource = new InputSource(new StringReader(xhtmlSource));
> parser.parse(iSource);
> HTMLDocumentImpl document = (HTMLDocumentImpl)parser.getDocument();
> 
> The parsing seems to work, except when I query the HTMLDocumentImpl most
> nodes are of type ElementNSImpl rather than the actual apache HTML DOM
> implementation classes. (For example, I can't even do a
> document.getBody() - it returns null. Instead I have to walk the XML DOM
> looking for the 'body' node).
> 
> This behaviour is described in NekoHTML's 'Requirements and Limitations'
> section at http://people.apache.org/~andyc/neko/doc/html/index.html
> 
> I'm not using NekoHTML, and I'm currently using Xerces 2.8.0. I did try 
> various versions of Xerces but to no avail.
> 
> I'm having to carry on working with plain nodes, but I'd much rather 
> work with the HTML DOM.
> Can anyone give any hints?
> 
> Thanks in advance.
> 
> Daniel
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 


