Thanks Malcolm, Jay. Your suggestions are very helpful.

Patrick
On 17.09.2006 at 19:15, Jay Parlar wrote:

> On 9/17/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
>> Is your input guaranteed to be well-formed XHTML? If so, ElementTree
>> (http://effbot.org/zone/element-index.htm) will be faster, particularly
>> cElementTree. It always feels very Pythonic when you program with it, so
>> it gets ease-of-use points.
>>
>> BeautifulSoup is a lifesaver when you need to process HTML that might be
>> not particularly well constructed and I like its functionality in that
>> area. I haven't used it in very heavy multi-process environments, so I
>> must admit that the memory usage isn't something I've worried about too
>> much. Not sure how comfortable it is to write out something that
>> BeautifulSoup has parsed -- you'll need to write your own serialiser
>> (ElementTree has SimpleXMLWriter) -- but that shouldn't be a
>> showstopper.
>>
>> For BeautifulSoup you are going to have to write a tree walker to
>> process the nodes. ElementTree-based code could be handled in the same
>> fashion, but the iterparse() method for processing as you parse is my
>> favourite way of working where I have to act on potentially all the
>> input.
>
> Or if you want the best of both worlds, you can try Fredrik's
> ElementSoup (http://effbot.org/zone/element-soup.htm). It produces an
> ElementTree wrapper around BeautifulSoup's output.
>
> Jay P.
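
For anyone following the thread, here is a minimal sketch of the iterparse() approach Malcolm mentions, where elements are handled as they are parsed instead of walking a finished tree. It assumes well-formed XHTML input and uses the standard-library `xml.etree.ElementTree` module in place of the standalone ElementTree package. The sample markup and the `collect_links` helper are made up for illustration and are not from the thread.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical well-formed XHTML input, made up for illustration.
SAMPLE = b"""<html><body>
<p>First <a href="http://example.com/a">link</a></p>
<p>Second <a href="http://example.com/b">link</a></p>
</body></html>"""


def collect_links(source):
    """Return the href of every <a> element, acting on elements as they are parsed."""
    links = []
    # iterparse yields (event, element) pairs; an "end" event fires once an
    # element and all of its children have been read.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "a":
            links.append(elem.get("href"))
        elif elem.tag == "p":
            # The paragraph's links were already recorded (child "end" events
            # come first), so its contents can be dropped to save memory.
            elem.clear()
    return links


links = collect_links(io.BytesIO(SAMPLE))
print(links)
```

Clearing each paragraph once it has been handled is optional, but it is what keeps memory use flat on large inputs. For markup that is not well formed, this parser raises an error, which is the case where BeautifulSoup or ElementSoup would be the better fit.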