New submission from Jess Johnson <j...@grokcode.com>:

When given XML that would raise a ParseError, but parsing is stopped 
before the ParseError is raised, xml.etree.ElementTree.iterparse leaks memory.

Example:


import gc
from io import StringIO
import xml.etree.ElementTree as etree

import objgraph


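def parse_xml():
    # Deliberately malformed: </ROOT> has no matching opening tag, so
    # parsing past </LEVEL1> would raise ParseError.
    xml = """
      <LEVEL1>
      </LEVEL1>
    </ROOT>
    """
    parser = etree.iterparse(StringIO(initial_value=xml))
    for _, elem in parser:
        if elem.tag == 'LEVEL1':
            # Stop before the parser reports the error.
            break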


def run():
    parse_xml()

    gc.collect()
    uncollected_elems = objgraph.by_type('Element')
    print(uncollected_elems)
    objgraph.show_backrefs(uncollected_elems, max_depth=15)


if __name__ == "__main__":
    run()


Output:
[<Element 'LEVEL1' at 0x10df712c8>]

Also see this gist which has an image showing the objects that are retained in 
memory: https://gist.github.com/grokcode/f89d5c5f1831c6bc373be6494f843de3
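
If objgraph is not installed, the same check can be approximated with the 
standard library alone. The following is only a sketch: the helper names 
count_live_elements and parse_partial are made up for illustration, and it 
assumes that counting Element instances via gc.get_objects() is close enough 
to what objgraph.by_type('Element') reports.

```python
import gc
from io import StringIO
import xml.etree.ElementTree as etree


def count_live_elements():
    """Return how many Element objects the garbage collector still tracks."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is etree.Element)


def parse_partial(xml):
    """Stop iterating before iterparse reaches the malformed tail."""
    for _, elem in etree.iterparse(StringIO(xml)):
        if elem.tag == 'LEVEL1':
            break


if __name__ == "__main__":
    before = count_live_elements()
    # Same shape as the report: a stray </ROOT> after a complete element.
    parse_partial("<LEVEL1></LEVEL1></ROOT>")
    after = count_live_elements()
    print("Elements retained:", after - before)
```

A non-zero difference on an affected interpreter would correspond to the 
retained LEVEL1 element shown in the output above.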

----------
components: XML
messages: 331861
nosy: jess.j
priority: normal
severity: normal
status: open
title: Memory leak in xml.etree.ElementTree.iterparse
type: resource usage
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35502>
_______________________________________