Fredrik Lundh wrote:
> by using it to split your document into reasonably-sized chunks (one 
> record, one expression, one text block, one paragraph, etc), and using 
> Python code to process the chunks.

I've updated the cElementTree/iterparse implementation to build one full
expression at a time.
http://sreeram.cc/files/xmlspeed/py_etree.py
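
For anyone who hasn't used iterparse this way, here is a minimal sketch of
the chunk-at-a-time pattern. It is not the code at the URL above, only an
illustration of the general idea. It uses the standard library's
xml.etree.ElementTree (which absorbed cElementTree), and the helper name
iter_chunks, the "record" tag and the sample document are all made up for
the example.

```python
import io
import xml.etree.ElementTree as ET  # stdlib module that absorbed cElementTree


def iter_chunks(source, chunk_tag):
    """Yield each completed <chunk_tag> element, one at a time.

    Hypothetical helper: chunk_tag stands in for whatever element
    represents "one full expression" in the real input.
    """
    # Request "start" events as well, so the first event gives us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == chunk_tag:
            yield elem
            # Runs when the caller asks for the next chunk: detach everything
            # built so far from the root so memory use stays roughly flat.
            root.clear()


# Made-up sample input, just to show the calling convention.
sample = b"<doc><record id='1'><v>a</v></record><record id='2'><v>b</v></record></doc>"
ids = [rec.get("id") for rec in iter_chunks(io.BytesIO(sample), "record")]
print(ids)
```

Each chunk has to be processed before the next one is requested, because
the generator clears the tree as soon as it resumes.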

Here are the updated timings:

Input file size 80 MB:
C/Expat:              4.25 secs
Python/cElementTree:  11.78 secs (down from 15.52 secs)
Python/pyexpat:       16.10 secs

Input file size 800 MB:
C/Expat:              105 secs
Python/cElementTree:  157 secs (down from 184 secs)
Python/pyexpat:       191 secs
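
To put these numbers in relative terms, a small sketch like the one below
computes how much slower each Python variant is than C/Expat, and how much
the cElementTree run improved over the earlier figure. The timings are
copied from the lists above; the function and variable names are only for
illustration.

```python
# Timings in seconds, copied from the figures above.
timings = {
    "80 MB": {"C/Expat": 4.25, "Python/cElementTree": 11.78, "Python/pyexpat": 16.10},
    "800 MB": {"C/Expat": 105.0, "Python/cElementTree": 157.0, "Python/pyexpat": 191.0},
}
# Earlier cElementTree timings (the "down from" values).
previous_celementtree = {"80 MB": 15.52, "800 MB": 184.0}


def slowdown_vs_c(size):
    """Return each variant's run time as a multiple of the C/Expat time."""
    base = timings[size]["C/Expat"]
    return {name: secs / base for name, secs in timings[size].items()}


def celementtree_improvement(size):
    """Return the fractional reduction in cElementTree run time."""
    old = previous_celementtree[size]
    new = timings[size]["Python/cElementTree"]
    return (old - new) / old


for size in timings:
    ratios = slowdown_vs_c(size)
    print(size, {name: round(r, 2) for name, r in ratios.items()},
          "cElementTree improvement: %.1f%%" % (100 * celementtree_improvement(size)))
```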


Regards
Sreeram


