On 9/17/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> Is your input guaranteed to be well-formed XHTML? If so, ElementTree
> (http://effbot.org/zone/element-index.htm ) will be faster, particularly
> cElementTree. It always feels very Pythonic when you program with it, so
> it gets ease-of-use points.
>
> BeautifulSoup is a lifesaver when you need to process HTML that might
> not be particularly well constructed and I like its functionality in that
> area. I haven't used it in very heavy multi-process environments, so I
> must admit that the memory usage isn't something I've worried about too
> much. Not sure how comfortable it is to write out something that
> BeautifulSoup has parsed -- you'll need to write your own serialiser
> (ElementTree has SimpleXMLWriter) -- but that shouldn't be a
> showstopper.
>
> For BeautifulSoup you are going to have to write a tree walker to
> process the nodes. ElementTree-based code could be handled in the same
> fashion, but the iterparse() method for processing as you parse is my
> favourite way of working where I have to act on potentially all the
> input.
>


Or if you want the best of both worlds, you can try Fredrik's
ElementSoup (http://effbot.org/zone/element-soup.htm). It produces an
ElementTree wrapper around BeautifulSoup's output.
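
For what it's worth, here is a rough sketch of the iterparse() style
Malcolm describes, using the xml.etree.ElementTree module from the
standard library. The collect_links() function and the sample markup are
made up for illustration, and it assumes the input really is well-formed
XHTML:

```python
import io
import xml.etree.ElementTree as ET


def collect_links(source):
    """Return the href of every <a> element, acting on nodes as they are parsed.

    collect_links is a hypothetical helper, not part of ElementTree.
    """
    links = []
    # iterparse yields (event, element) pairs; an "end" event fires once
    # an element and all of its children have been read.
    for event, elem in ET.iterparse(source, events=("end",)):
        # Drop any namespace prefix: "{http://www.w3.org/1999/xhtml}a" -> "a".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "a" and elem.get("href"):
            links.append(elem.get("href"))
        # Discard the element's contents once handled to keep memory use low.
        elem.clear()
    return links


# Made-up sample document for illustration only.
SAMPLE = b"""<html xmlns="http://www.w3.org/1999/xhtml">
<body><p>See <a href="http://effbot.org/zone/element-index.htm">ElementTree</a>
and <a href="http://effbot.org/zone/element-soup.htm">ElementSoup</a>.</p></body>
</html>"""

print(collect_links(io.BytesIO(SAMPLE)))
```

The same loop works on an open file or a filename instead of the
BytesIO object. It will raise a ParseError on markup that is not
well-formed, which is exactly the case where BeautifulSoup earns its keep.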

Jay P.

