New submission from Giuseppe Attardi:
I confirm the presence of a serious memory leak in ElementTree when using the
iterparse() function.
Memory usage grows disproportionately, to dozens of GB, when parsing a large
XML file.
For further information, see the discussion at:
http://www.gossamer-threads.com/lists/python/bugs/912164?do=post_view_threaded#912164
but note that the comments attributing the problem to the OS are quite off
the mark.
To replicate the problem, try this on a Wikipedia dump:
from xml.etree import ElementTree

# `file` is the path (or open file object) of the Wikipedia XML dump
iterparse = ElementTree.iterparse(file)
id = None
for event, elem in iterparse:
    if elem.tag.endswith("title"):
        title = elem.text
    elif elem.tag.endswith("id") and not id:
        id = elem.text
    elif elem.tag.endswith("text"):
        print id, title, elem.text[:20]
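
For comparison, here is a minimal sketch of the commonly used idiom for keeping
iterparse() memory bounded: grab the root element from the first "start" event
and clear it each time a record has been processed. This is not offered as a
diagnosis of the growth reported above, only as a variant to test against. The
helper name iter_pages, the page/title/id/text element layout, and the
per-page reset of the id are assumptions made for illustration.

```python
import io
from xml.etree import ElementTree


def iter_pages(source):
    """Yield (id, title, text_prefix) per <page>, freeing elements as we go.

    Hypothetical helper: assumes each record is a <page> element that
    contains <title>, <id> and <text> descendants.
    """
    # Request "start" events too, so the first event hands us the root element.
    context = ElementTree.iterparse(source, events=("start", "end"))
    _, root = next(context)
    page_id = title = None
    for event, elem in context:
        if event != "end":
            continue
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "title":
            title = elem.text
        elif tag == "id" and page_id is None:
            page_id = elem.text  # first <id> inside the page
        elif tag == "text":
            yield page_id, title, (elem.text or "")[:20]
        elif tag == "page":
            page_id = title = None
            # Drop the finished subtree; otherwise the tree keeps growing
            # because iterparse() still builds the full document in memory.
            root.clear()


if __name__ == "__main__":
    sample = (b"<mediawiki><page><title>A</title><id>1</id>"
              b"<text>hello world</text></page></mediawiki>")
    for row in iter_pages(io.BytesIO(sample)):
        print(row)
```

If memory still grows without bound with the root cleared after every page,
that would point at the parser itself rather than at the retained tree.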
--
messages: 160266
nosy: Giuseppe.Attardi
priority: normal
severity: normal
status: open
title: ElementTree memory leak
type: resource usage
versions: Python 2.7
___
Python tracker
<http://bugs.python.org/issue14762>
___