Re: Is there a HTML parser who can reconstruct the original html EXACTLY?

Paul Boddie Wed, 23 Jan 2008 06:11:21 -0800

On 23 Jan, 14:20, kliu <[EMAIL PROTECTED]> wrote:
>
> Thank u for your reply. but what I really need is the mapping between
> each DOM nodes and the corresponding original source segment.


At the risk of promoting unfashionable DOM technologies, you can at
least serialise fragments of the DOM in libxml2dom [1]:

  import libxml2dom
  d = libxml2dom.parseURI("http://www.diveintopython.org/", html=1)
  print d.xpath("//p")[7].toString()

Storage and retrieval of the original line and offset information may
be supported by libxml2, but such information isn't exposed by
libxml2dom.
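
If line and column positions are enough, a rough alternative is to skip
the DOM entirely and ask a parser for positions as it goes. The sketch
below is only an illustration, not something libxml2dom provides: it
assumes Python 3 and the standard library's html.parser module, and the
names PositionRecorder and locate_start_tags are made up for the example.
It records, for each start tag, the line, the column and the exact
source text of the tag as it was written.

```python
from html.parser import HTMLParser


class PositionRecorder(HTMLParser):
    """Hypothetical helper: note where each start tag sits in the source."""

    def __init__(self):
        super().__init__()
        # Each record is (tag name, line, column, raw start-tag text).
        self.records = []

    def handle_starttag(self, tag, attrs):
        # getpos() reports the parser's current position: a 1-based line
        # number and a 0-based column within that line.
        line, column = self.getpos()
        # get_starttag_text() returns the tag exactly as written in the
        # input, including original case and attribute quoting.
        self.records.append((tag, line, column, self.get_starttag_text()))


def locate_start_tags(source):
    """Return position records for every start tag found in source."""
    parser = PositionRecorder()
    parser.feed(source)
    parser.close()
    return parser.records


html = '<html>\n<body>\n  <P Class="x">Hello</p>\n</body>\n</html>\n'
for tag, line, column, raw in locate_start_tags(html):
    print(tag, line, column, raw)
```

This only covers start tags; mapping whole elements back to source
segments would also need the end tag positions (handle_endtag can record
those the same way) and some care with malformed markup.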

Paul

[1] http://www.python.org/pypi/libxml2dom
-- 
http://mail.python.org/mailman/listinfo/python-list
