Ezio Melotti added the comment:

This is what I found out. I used an easily copy/pastable one-liner that creates 3 variables: e (no children), e2 (3 children), e3 (5 children).

Original leaky code (test_xml_etree_c leaked [56, 56] references, sum=112):

>>> from xml.etree import ElementTree as ET; e = ET.Element('foo'); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e2 = ET.XML(SAMPLE_XML); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><tag class='b' /><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e3 = ET.XML(SAMPLE_XML)
[76773 refs]
>>> ### e has no children and leaks 1 ref:
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76791 refs]
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76792 refs]
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76793 refs]
>>> ### e2 has 3 children and leaks 4 refs:
>>> e2.__getstate__()
{'tag': 'body', 'attrib': {}, 'text': None, 'tail': None, '_children': [<Element 'tag' at 0xb735cef4>, <Element 'tag' at 0xb7368034>, <Element 'section' at 0xb73688f4>]}
[76798 refs]
>>> e2.__getstate__()
{'tag': 'body', 'attrib': {}, 'text': None, 'tail': None, '_children': [<Element 'tag' at 0xb735cef4>, <Element 'tag' at 0xb7368034>, <Element 'section' at 0xb73688f4>]}
[76802 refs]

The leaked refs seem to be 1 for the children list + 1 for each child. The diff I pasted in the previous message *seems* to fix this (i.e. leaks in __getstate__ are gone, tests pass), but I had it crash once (couldn't reproduce after that, so it might be unrelated*), and I'm not sure it's correct. With that patch applied we go down to test_xml_etree_c leaked [6, 6] references, sum=12. The remaining leaks seem to be in __setstate__.
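
For anyone who wants to check the "1 ref for each child" part of this without a debug build (the "[NNNN refs]" totals in these sessions only appear on a --with-pydebug interpreter), here is a rough sketch that compares sys.getrefcount() of each child before and after repeated calls. The helper names child_ref_growth, call_getstate and roundtrip_state are made up for illustration and are not part of the patch or the test suite. It assumes the C accelerator is in use, and it cannot see the extra ref leaked on the children list itself, since that list is created inside __getstate__.

```python
import sys
from xml.etree import ElementTree as ET

SAMPLE_XML = (
    "<body><tag class='a'>text</tag><tag class='b' />"
    "<section><tag class='b' id='inner'>subtext</tag></section></body>"
)


def child_ref_growth(element, action, repeat=3):
    """Return, per direct child, how much its refcount grew after
    calling action(element) `repeat` times (hypothetical helper)."""
    children = list(element)  # keep the children alive and in a fixed order
    before = [sys.getrefcount(child) for child in children]
    for _ in range(repeat):
        action(element)  # any result is discarded right away
    after = [sys.getrefcount(child) for child in children]
    return [a - b for a, b in zip(after, before)]


def call_getstate(element):
    # The state dict is dropped immediately, so a leak-free
    # implementation should leave the children's refcounts unchanged.
    element.__getstate__()


def roundtrip_state(element):
    # Feed the element its own state back, as in the __setstate__ sessions.
    element.__setstate__(element.__getstate__())


if __name__ == "__main__":
    e2 = ET.XML(SAMPLE_XML)  # 3 direct children: tag, tag, section
    print("getstate growth:", child_ref_growth(e2, call_getstate))
    print("setstate growth:", child_ref_growth(e2, roundtrip_state))
```

On an interpreter with the leak described above, I would expect every entry to grow by `repeat`; once both methods are fixed, every entry should be 0.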

Patched code:

>>> from xml.etree import ElementTree as ET; e = ET.Element('foo'); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e2 = ET.XML(SAMPLE_XML); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><tag class='b' /><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e3 = ET.XML(SAMPLE_XML)
[76773 refs]
>>> ### no more leaks for getstate:
>>> p = e.__getstate__()
[76787 refs]
>>> p = e.__getstate__()
[76787 refs]
>>> ### also no leaks when there are no children:
>>> e.__setstate__(p)
[76788 refs]
>>> e.__setstate__(p)
[76788 refs]
>>> ### no more leaks for getstate with children:
>>> p2 = e2.__getstate__()
[76807 refs]
>>> p2 = e2.__getstate__()
[76807 refs]
>>> ### one ref leaked for every child in __setstate__:
>>> e2.__setstate__(p2)
[76810 refs]
>>> e2.__setstate__(p2)
[76813 refs]
>>> e2.__setstate__(p2)
[76816 refs]

I'm not working on this anymore now, so someone more familiar with the code can take a look, see if my patch is correct, and fix the remaining leaks.

* maybe I'm doing something wrong, but ISTM that ``make -j2`` doesn't always work right away, and sometimes I get different results if I do it again without touching the code.

----------
stage: committed/rejected -> needs patch
status: closed -> open

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16076>
_______________________________________
_______________________________________________
Python-bugs-list mailing list