Larry Bates wrote:
> [EMAIL PROTECTED] wrote:
> >
> > I am looking for any Python library which can help to get DOM
> > tree from HTML. Is there any way to access HTML DOM, just like
> > accessing it using javascript.

[...]

> Since the browser can't execute anything except Javascript, you

Who said anything about the browser? Accessing a DOM "just like [...]
javascript" can mean a number of things: using an API like the one
JavaScript uses, for example, as well as actually accessing a DOM
associated with a page in a browser.

> can't get to/manipulate the DOM with anything but Javascript code.
> There have been attempts at getting a browser that can execute
> Python code, but I don't think they ever really got anywhere.

Actually, this isn't strictly true either. Disregarding, perhaps
unfairly, recent work on PyXPCOM to integrate Python more tightly with
Mozilla, there are various packages which do access browser DOMs: if
the questioner uses a KDE desktop and isn't averse to installing some
packages, there's qtxmldom [1] which can access the DOM in Konqueror in
association with the kpartplugins distribution [2]; otherwise, I
believe there's a Python package for accessing Internet Explorer's DOM.

And outside browsers, one can still use various packages already
mentioned, in addition to libxml2dom [3] which provides support via
libxml2 for reading HTML and XML, producing a DOM which resembles the
standardised DOM typically available to JavaScript. It shouldn't be
forgotten that PyXML also supports HTML parsing [4], either.

Paul

[1] http://www.boddie.org.uk/python/qtxmldom.html
[2] http://www.boddie.org.uk/python/kpartplugins.html
[3] http://www.boddie.org.uk/python/libxml2dom.html
[4] http://www.boddie.org.uk/python/HTML.html
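
To illustrate the "outside browsers" case, here is a minimal sketch that
uses only the standard library rather than any of the packages named
above. It feeds html.parser events into an xml.dom.minidom document, so
the result can be queried with the same calls JavaScript code would use
(getElementsByTagName, getAttribute and so on). DOMBuilder, parse_html
and the sample page are hypothetical names made up for this example, not
part of libxml2dom, PyXML or qtxmldom, and the sketch assumes a single
root element and reasonably well-nested markup.

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# HTML elements that never take a closing tag.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class DOMBuilder(HTMLParser):
    """Hypothetical helper: turns HTML parser events into a minidom tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.document = Document()
        self.current = self.document  # node that receives new children

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            # Valueless attributes (e.g. "disabled") arrive as None.
            element.setAttribute(name, value if value is not None else "")
        self.current.appendChild(element)
        if tag not in VOID_ELEMENTS:
            self.current = element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element of this name;
        # stray end tags are ignored.
        node = self.current
        while node is not self.document and node.nodeName != tag:
            node = node.parentNode
        if node is not self.document:
            self.current = node.parentNode

    def handle_data(self, data):
        # Skip whitespace-only text and anything outside the root element.
        if self.current is not self.document and data.strip():
            self.current.appendChild(self.document.createTextNode(data))


def parse_html(text):
    """Return a minidom Document built from an HTML string."""
    builder = DOMBuilder()
    builder.feed(text)
    builder.close()
    return builder.document


page = """<html><head><title>Example</title></head>
<body><p id="intro">Hello <a href="http://www.python.org/">Python</a></p>
<p>Second<br>line</p></body></html>"""

doc = parse_html(page)
title = doc.getElementsByTagName("title")[0].firstChild.data
links = [a.getAttribute("href") for a in doc.getElementsByTagName("a")]
paragraph_count = len(doc.getElementsByTagName("p"))
print(title, links, paragraph_count)
```

For real-world, badly formed HTML a libxml2-based parser such as the one
behind libxml2dom [3] is likely to be far more forgiving than this
sketch.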