Albert Leibbrandt wrote:
> Hi
>
> Just want to check which xml parser you guys have found to be the
> quickest. I have xml documents with 250 000 records or more and the
> processing of these documents are taking way to long. The validation is
> the main problem. Any module names, non validating would be find to,
> would help a lot.

It would help us help you if you posted samples of the target docs. XML processing strategy often depends on the structure of the XML, just as relational query optimization strategy often depends on the schema.

In general, SAX or iterative tree-callback methods will give you the best speed. Fredrik already mentioned ElementTree's iterparse. Amara's pushbind and pushdom and 4Suite's Saxlette (which has some neat callback features) are other options.

http://uche.ogbuji.net/tech/4suite/amara/
http://4suite.org/docs/CoreManual.xml#saxlette

-- 
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net
http://fourthought.com
http://copia.ogbuji.net
http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

-- 
http://mail.python.org/mailman/listinfo/python-list
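
For illustration, here is a minimal sketch of the iterative tree-callback approach using ElementTree's iterparse. Since no sample document was posted, the element names (records, record, id, name) are purely hypothetical, and the import path assumes the ElementTree that ships in the Python standard library as xml.etree.ElementTree. The idea is to handle each record as soon as its end tag arrives and then discard it, so memory use should stay roughly flat however many records the file holds. No validation is performed here.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag="record"):
    """Yield each <record> as a dict mapping child tag to child text.

    The tag name "record" is a hypothetical placeholder; substitute
    whatever element wraps one record in the real documents.
    """
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a handle on it.
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            # Drop the records processed so far so the tree does not grow.
            root.clear()


# Hypothetical sample input standing in for the real documents.
sample = io.BytesIO(
    b"<records>"
    b"<record><id>1</id><name>a</name></record>"
    b"<record><id>2</id><name>b</name></record>"
    b"</records>"
)
rows = list(iter_records(sample))
```

With a real file, pass the filename or an open binary file object instead of the in-memory sample, and consume the generator in a loop rather than building a list, otherwise the memory savings are lost.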