On 9/17/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> Is your input guaranteed to be well-formed XHTML? If so, ElementTree
> (http://effbot.org/zone/element-index.htm) will be faster, particularly
> cElementTree. It always feels very Pythonic when you program with it, so
> it gets ease-of-use points.
>
> BeautifulSoup is a lifesaver when you need to process HTML that might
> not be particularly well constructed and I like its functionality in that
> area. I haven't used it in very heavy multi-process environments, so I
> must admit that the memory usage isn't something I've worried about too
> much. Not sure how comfortable it is to write out something that
> BeautifulSoup has parsed -- you'll need to write your own serialiser
> (ElementTree has SimpleXMLWriter) -- but that shouldn't be a
> showstopper.
>
> For BeautifulSoup you are going to have to write a tree walker to
> process the nodes. ElementTree-based code could be handled in the same
> fashion, but the iterparse() method for processing as you parse is my
> favourite way of working where I have to act on potentially all the
> input.
>
Or if you want the best of both worlds, you can try Fredrik's ElementSoup (http://effbot.org/zone/element-soup.htm). It produces an ElementTree wrapper around BeautifulSoup's output.

Jay P.
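
For anyone who hasn't seen the iterparse() style mentioned in the quoted message, here is a rough sketch of what "processing as you parse" can look like. It assumes the ElementTree that ships in the standard library as xml.etree.ElementTree (the standalone elementtree and cElementTree packages expose the same iterparse() call). The sample document and the collect_hrefs() helper are made up for illustration, and the input has to be well-formed for this to work at all.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input; any well-formed XHTML would do.
XHTML = b"""<html><body>
<p>First <a href="http://example.com/a">link</a></p>
<p>Second <a href="http://example.com/b">link</a></p>
</body></html>"""


def collect_hrefs(source):
    """Yield the href of each <a> element as soon as it has been parsed."""
    # "end" events fire once an element and all of its children are complete.
    for event, elem in ET.iterparse(source, events=("end",)):
        # Note: if the document declares the XHTML namespace, the tag would be
        # "{http://www.w3.org/1999/xhtml}a" rather than plain "a".
        if elem.tag == "a":
            href = elem.get("href")
            if href is not None:
                yield href
        # Drop the element's content once we are done with it, so memory use
        # stays modest on large inputs.
        elem.clear()


hrefs = list(collect_hrefs(io.BytesIO(XHTML)))
print(hrefs)
```

The same loop should work on a file name or an open file instead of the BytesIO object. Whether calling clear() on every element is appropriate depends on what you need to keep around afterwards, so treat that part as one option rather than a rule.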