New submission from Jess Johnson <j...@grokcode.com>:

When given XML that would raise a ParseError, but parsing is stopped before the ParseError is raised, xml.etree.ElementTree.iterparse leaks memory.

Example:

import gc
from io import StringIO
import xml.etree.ElementTree as etree
import objgraph

def parse_xml():
    xml = """
    <LEVEL1>
    </LEVEL1>
    </ROOT>
    """
    parser = etree.iterparse(StringIO(initial_value=xml))
    for _, elem in parser:
        if elem.tag == 'LEVEL1':
            break

def run():
    parse_xml()
    gc.collect()
    uncollected_elems = objgraph.by_type('Element')
    print(uncollected_elems)
    objgraph.show_backrefs(uncollected_elems, max_depth=15)

if __name__ == "__main__":
    run()

Output:

[<Element 'LEVEL1' at 0x10df712c8>]

Also see this gist, which has an image showing the objects that are retained in memory:
https://gist.github.com/grokcode/f89d5c5f1831c6bc373be6494f843de3

----------
components: XML
messages: 331861
nosy: jess.j
priority: normal
severity: normal
status: open
title: Memory leak in xml.etree.ElementTree.iterparse
type: resource usage
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35502>
_______________________________________
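
For anyone without objgraph installed, the same check can probably be approximated with the standard library alone. This is only a sketch: it assumes that scanning gc.get_objects() for Element instances is an acceptable stand-in for objgraph.by_type('Element'), and the function names parse_until_level1 and surviving_elements are made up for illustration. It lists whichever Element objects are still alive after a collection; it produces no back-reference graph, and what it prints will depend on the interpreter version.

```python
import gc
from io import StringIO
import xml.etree.ElementTree as etree


def parse_until_level1():
    # Same malformed document as in the report: the stray </ROOT>
    # would raise ParseError if iteration continued past LEVEL1.
    xml = """
    <LEVEL1>
    </LEVEL1>
    </ROOT>
    """
    parser = etree.iterparse(StringIO(xml))
    for _, elem in parser:
        if elem.tag == 'LEVEL1':
            break  # stop before the ParseError is raised


def surviving_elements():
    # Hypothetical helper: run the parse, force a collection, then
    # return every Element the garbage collector still tracks.
    parse_until_level1()
    gc.collect()
    return [obj for obj in gc.get_objects() if type(obj) is etree.Element]


if __name__ == "__main__":
    print(surviving_elements())
```

A non-empty list would indicate that elements outlived the abandoned iterator, matching the objgraph output above.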