Re: Problem with parsing HTML

2012-05-13 Thread Michael Glavassevich
As you may already know, NekoHTML is maintained by a separate community on SourceForge [1]. Thanks. [1] http://sourceforge.net/tracker/?group_id=195122&atid=952178 Michael Glavassevich XML Technologies and WAS Development IBM Toronto Lab E-mail: mrgla...@ca.ibm.com E-mail: mrgla...@apache.o

Re: Problem with parsing HTML

2012-05-13 Thread Yizhou Z.
I just tried parsing some other HTML files and found that Xerces handled the "input" tags in those files correctly. The earlier problem seems to be related to NekoHTML's parser. On Sun, May 13, 2012 at 1:22 PM, Yizhou Z. wrote: > NekoHTML parser uses Xerces' HTML DOM implementation

Re: Problem with parsing HTML

2012-05-12 Thread Yizhou Z.
The NekoHTML parser uses Xerces' HTML DOM implementation, and it seems to always return the appropriate HTML DOM element objects for other types of element nodes. But for <input>, I found that it returns an object of type "org.apache.xerces.dom.ElementNSImpl". I wonder if this is a bug in the version of
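
One way to reproduce this kind of observation is to list the runtime class behind every element in a parsed document. The sketch below is only an illustration: the class ElementTypeProbe and its methods are hypothetical names, and it uses the JDK's built-in JAXP parser on a well-formed snippet rather than NekoHTML, so the class names it prints will differ from the ones reported above. The elementClasses method itself works on any org.w3c.dom.Document, including one produced by NekoHTML.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces or NekoHTML.
class ElementTypeProbe {

    // Maps each distinct tag name to the runtime class of its first occurrence,
    // in document order.
    static Map<String, String> elementClasses(Document doc) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            result.putIfAbsent(element.getTagName(), element.getClass().getName());
        }
        return result;
    }

    // Parses well-formed markup with the default JAXP parser (an assumption made
    // for this sketch; swap in the parser under investigation as needed).
    static Document parse(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><body><form><input type=\"text\"/></form></body></html>";
        elementClasses(parse(markup))
            .forEach((tag, cls) -> System.out.println(tag + " -> " + cls));
    }
}
```

An element whose class is a generic implementation rather than an HTML-specific one shows up immediately in the printed list.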

Re: Problem with parsing HTML

2012-05-12 Thread Michael Glavassevich
Have you tried setting the 'document-class-name' property [1] so that it points to Xerces' HTML DOM implementation? Thanks. [1] http://xerces.apache.org/xerces2-j/properties.html#dom.document-class-name Michael Glavassevich XML Technologies and WAS Development IBM Toronto Lab E-mail: mrgla...@
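
A minimal sketch of that suggestion follows, under stated assumptions. The property identifier and the class name org.apache.html.dom.HTMLDocumentImpl come from Xerces; the helper class HtmlDomConfig and its method are hypothetical. The sketch goes through JAXP's DocumentBuilderFactory.setAttribute, which a Xerces-backed factory forwards to the underlying parser, instead of calling setProperty on the parser object directly. That routing is a choice made here to keep the example free of extra dependencies, not something stated in the thread.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper, not part of Xerces or NekoHTML.
class HtmlDomConfig {

    // Xerces property that selects the Document implementation class.
    static final String DOCUMENT_CLASS_NAME =
        "http://apache.org/xml/properties/dom/document-class-name";

    // Xerces' HTML DOM document class; it must be on the classpath at parse time.
    static final String HTML_DOCUMENT_CLASS = "org.apache.html.dom.HTMLDocumentImpl";

    // Asks the factory to build documents with the HTML DOM implementation.
    // Returns false when the factory rejects the attribute.
    static boolean requestHtmlDom(DocumentBuilderFactory factory) {
        try {
            factory.setAttribute(DOCUMENT_CLASS_NAME, HTML_DOCUMENT_CLASS);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("attribute accepted: " + requestHtmlDom(factory));
    }
}
```

When working with a parser object directly, the equivalent step would be a setProperty call with the same identifier and value. Acceptance of the attribute does not guarantee that the named class can be loaded later, so a parse may still fail if Xerces' HTML DOM is not available.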