Fredrik Lundh wrote:
> by using it to split your document into reasonably-sized chunks (one
> record, one expression, one text block, one paragraph, etc), and using
> Python code to process the chunks.
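
For reference, the chunk-at-a-time approach described above might look roughly like the sketch below. This is not the contents of py_etree.py. The function name `iter_chunks`, the tag name `expr`, and the sample document are hypothetical, and the sketch assumes that each chunk is a direct child of the root element. It imports `xml.etree.ElementTree`, which provides the same `iterparse` interface as cElementTree in current Python versions.

```python
import io
import xml.etree.ElementTree as ET  # same iterparse API as cElementTree


def iter_chunks(source, tag):
    """Yield each completed <tag> element, then release it.

    Assumes (for this sketch) that <tag> elements are direct children
    of the document root.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the "start" of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem    # the whole subtree for this chunk is built by now
            root.clear()  # drop processed children so memory stays bounded


# Hypothetical sample input; a real run would pass a filename or open file.
sample = io.BytesIO(b"<doc><expr>a</expr><expr>b</expr></doc>")
for chunk in iter_chunks(sample, "expr"):
    print(chunk.tag, chunk.text)
```

Clearing the root after each chunk is what keeps memory use flat on large inputs; the per-chunk processing itself is ordinary Python code working on a small, fully built element.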

I've updated the cElementTree/iterparse implementation to build one full
expression at a time.

http://sreeram.cc/files/xmlspeed/py_etree.py

Here are the updated timings:

Input file size 80 MB:
  C/Expat:              4.25 secs
  Python/cElementTree: 11.78 secs (down from 15.52 secs)
  Python/pyexpat:      16.10 secs

Input file size 800 MB:
  C/Expat:             105 secs
  Python/cElementTree: 157 secs (down from 184 secs)
  Python/pyexpat:      191 secs

Regards
Sreeram
-- http://mail.python.org/mailman/listinfo/python-list