Olivier Favre-Simon <[EMAIL PROTECTED]> writes:
[...]
> > I'd like to have a reimplementation of ClientForm on top of something
> > like BeautifulSoup...
> >
> >
> > John
>
> When taken separately, either ClientForm, HTMLParser or SGMLParser work
> well.
>
> But it would be cool that competent people in the HTML parsing domain join
> up, and define a base parser interface, the same way smart guys did with
> WSGI for webservers.

Perhaps.  Given a mythical fixed quantity of volunteer coding effort I
could assign to any HTML parsing project, I'd really prefer that
somebody separated out the HTML parsing, tree building and DOM code
from Mozilla and/or Konqueror.

> So libs like ClientForm would not raise say an AttributeError if some
> custom parser class does not implement a given attribute.
>
> Adding an otherwise unused attribute to a parser just in case one day it
> will interop with ClientForm sounds silly. And what if ClientForm changes
> its attributes, etc.
[...]

I'm sorry, I didn't really follow that at all.

What I hoped to get from implementing the ClientForm interface on top
of something like BeautifulSoup was actually two things:

 1. Better parsing

 2. Access to a nice, and comprehensive, object model that lets you do
    things with non-form elements, and the ability to move back and
    forth between ClientForm and BeautifulSoup objects.

I already did this for the HTML DOM with DOMForm (unsupported), but
for various reasons the implementation is horrid, and since I no
longer intend to put in the effort to support JavaScript, I'd prefer a
nicer tree API than the DOM.


John
--
http://mail.python.org/mailman/listinfo/python-list