Re: scripting browsers from Python

John J. Lee Fri, 03 Jun 2005 11:45:32 -0700

Olivier Favre-Simon <[EMAIL PROTECTED]> writes:
[...]
> > I'd like to have a reimplementation of ClientForm on top of something
> > like BeautifulSoup...
> > 
> > 
> > John
> 
> Taken separately, ClientForm, HTMLParser and SGMLParser each work
> well.
> 
> But it would be cool if competent people in the HTML parsing domain
> joined up and defined a base parser interface, the same way smart guys
> did with WSGI for web servers.


Perhaps.  Given a mythical fixed quantity of volunteer coding effort I
could assign to any HTML parsing project, I'd really prefer that
somebody separated out the HTML parsing, tree building and DOM code
from Mozilla and/or Konqueror.


> So libs like ClientForm would not raise, say, an AttributeError if some
> custom parser class does not implement a given attribute.
> 
> Adding an otherwise unused attribute to a parser just in case one day it
> will interop with ClientForm sounds silly. And what if ClientForm changes
> its attributes, etc.
[...]

I'm sorry, I didn't really follow that at all.

What I hoped to get from implementing the ClientForm interface on top
of something like BeautifulSoup was actually two things:

1. Better parsing

2. Access to a nice, comprehensive object model that lets you do
   things with non-form elements, and the ability to move back and
   forth between ClientForm and BeautifulSoup objects.  I already did
   this for the HTML DOM with DOMForm (unsupported), but for various
   reasons the implementation is horrid, and since I no longer intend
   to put in the effort to support JavaScript, I'd prefer a nicer tree
   API than the DOM.
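
To make the second point concrete, here is a rough sketch of form
objects layered on a general-purpose tree. It is not ClientForm's or
BeautifulSoup's actual API: Node, TreeBuilder, Form and parse_forms are
hypothetical names invented for illustration, and the tree is built
with the standard library's html.parser so the example stays
self-contained. The point is only that each form keeps a reference to
its tree node, so you can step from the form back into the surrounding
document.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical minimal tree node, standing in for a real tree API."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []

    def walk(self):
        """Yield this node and all its descendants in document order."""
        yield self
        for child in self.children:
            yield from child.walk()


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML; deliberately simplistic."""

    # Elements that never have children or an end tag (partial list).
    VOID = {"input", "br", "img", "hr", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.root = Node("[document]")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self._current)
        self._current.children.append(node)
        if tag not in self.VOID:
            self._current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self._current
        while node is not self.root and node.name != tag:
            node = node.parent
        if node is not self.root:
            self._current = node.parent


class Form:
    """Hypothetical form view that keeps a link back into the tree."""

    def __init__(self, node):
        self.node = node  # the underlying tree node
        self.action = node.attrs.get("action")
        # Map each named control to its initial value.
        self.controls = {
            n.attrs["name"]: n.attrs.get("value", "")
            for n in node.walk()
            if n.name in ("input", "select", "textarea") and "name" in n.attrs
        }


def parse_forms(html):
    """Return (list of Form objects, tree root) for the given HTML."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    forms = [Form(n) for n in builder.root.walk() if n.name == "form"]
    return forms, builder.root
```

With something like this, forms[0].node.parent takes you from a form to
its enclosing element, and walking the root lets you get at non-form
elements from the same parse.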


John
-- 
http://mail.python.org/mailman/listinfo/python-list
