Terry Reedy, 14.12.2011 06:01:
On 12/13/2011 6:21 PM, Ethan Furman wrote:
In the near future I will need to parse and rewrite parts of some XML files
created by a third-party program (PrintShopMail, for the curious).
They contain both binary and textual data.

There has been some strong debate about the merits of minidom vs
ElementTree.

Recommendations?

People's reactions to the DOM interface seem quite varied, with a majority,
perhaps, being negative. I personally would look at both enough to
understand the basic API model to see where *I* fit.
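
To get a feel for the two API models, here is a minimal sketch that pulls the same data out of one small document with each library. The sample XML and the function names are made up for illustration and have nothing to do with the actual PrintShopMail format.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, not the PrintShopMail format.
XML = (
    "<jobs>"
    "<job id='1'><name>flyer</name></job>"
    "<job id='2'><name>poster</name></job>"
    "</jobs>"
)

def names_minidom(text):
    """Collect (id, name) pairs using the DOM-style API."""
    doc = minidom.parseString(text)
    result = []
    for job in doc.getElementsByTagName("job"):
        name_node = job.getElementsByTagName("name")[0]
        # In the DOM model, text is a separate child node of the element.
        result.append((job.getAttribute("id"), name_node.firstChild.data))
    doc.unlink()  # break reference cycles so the tree can be freed early
    return result

def names_etree(text):
    """Collect (id, name) pairs using the ElementTree API."""
    root = ET.fromstring(text)
    # Elements expose attributes via get() and child text via findtext().
    return [(job.get("id"), job.findtext("name")) for job in root.iter("job")]

print(names_minidom(XML))
print(names_etree(XML))
```

Both functions return the same list of pairs; the difference is mostly in how much navigation code each model needs to get there.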

The API is one thing, yes, but there's also the fact that MiniDOM doesn't scale. If your XML files are of a notable size (a couple of MB), MiniDOM may simply not be able to handle them. I collected some numbers in a blog post. Note that this is using a recent CPython 3.3 build which has an optimised Unicode string implementation, thus yielding lower memory requirements on average than Py2.x.

http://blog.behnel.de/index.php?p=197

MiniDOM's memory consumption is higher than cElementTree's by a factor of 5-10, which makes it two orders of magnitude larger than the size of the serialised file. You may be able to fit one such file into memory, but you'll quickly get into trouble when you try to do parallel processing or otherwise use more than one document at a time.
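
If you want to check this against your own data, something along these lines gives a rough idea. It is only a sketch, not the benchmark from the blog post: the synthetic document and the helper names are assumptions, and it relies on tracemalloc, which only exists in Python 3.4 and later. Since Python 3.3, xml.etree.ElementTree picks up the C accelerator automatically, so it stands in for cElementTree here.

```python
import tracemalloc
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

def make_xml(n_items):
    """Build a synthetic document (hypothetical structure, for illustration)."""
    items = "".join(
        '<item id="%d"><name>item %d</name></item>' % (i, i)
        for i in range(n_items)
    )
    return ("<root>" + items + "</root>").encode("utf-8")

def peak_bytes(parse, data):
    """Return the peak traced allocation while parsing data into a tree."""
    tracemalloc.start()
    tree = parse(data)  # keep the tree alive until the peak has been read
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del tree
    return peak

data = make_xml(5000)
dom_peak = peak_bytes(minidom.parseString, data)
et_peak = peak_bytes(ET.fromstring, data)

print("serialised size: %d bytes" % len(data))
print("minidom peak:     %d bytes" % dom_peak)
print("ElementTree peak: %d bytes" % et_peak)
```

tracemalloc only sees allocations made through Python's memory allocators, so treat the numbers as a relative comparison rather than an exact process footprint. Replace make_xml() with the bytes of a real file to see how your own documents behave.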

Stefan

--
http://mail.python.org/mailman/listinfo/python-list