Ezio Melotti added the comment:

This is what I found out.
I used an easily copy/pastable one-liner that creates 3 variables: e (no 
children), e2 (3 children), e3 (5 children).

Original leaky code (test_xml_etree_c leaked [56, 56] references, sum=112):
>>> from xml.etree import ElementTree as ET; e = ET.Element('foo'); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e2 = ET.XML(SAMPLE_XML); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><tag class='b' /><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e3 = ET.XML(SAMPLE_XML)
[76773 refs]
>>> ### e has no children and leaks 1 ref:
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76791 refs]
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76792 refs]
>>> e.__getstate__()
{'tag': 'foo', 'attrib': {}, 'text': None, 'tail': None, '_children': []}
[76793 refs]
>>> ### e2 has 3 children and leaks 4 refs:
>>> e2.__getstate__()
{'tag': 'body', 'attrib': {}, 'text': None, 'tail': None, '_children': 
[<Element 'tag' at 0xb735cef4>, <Element 'tag' at 0xb7368034>, <Element 
'section' at 0xb73688f4>]}
[76798 refs]
>>> e2.__getstate__()
{'tag': 'body', 'attrib': {}, 'text': None, 'tail': None, '_children': 
[<Element 'tag' at 0xb735cef4>, <Element 'tag' at 0xb7368034>, <Element 
'section' at 0xb73688f4>]}
[76802 refs]

The leaked refs seem to be 1 for the children list + 1 for each child.
The diff I pasted in the previous message *seems* to fix this (i.e. leaks in 
__getstate__ are gone, tests pass), but I had it crash once (couldn't reproduce 
after that, so it might be unrelated*), and I'm not sure it's correct.
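
The "1 ref per child" pattern can also be checked on a regular (non-debug)
build by watching each child's own refcount instead of the interpreter total.
What follows is only a rough sketch, not part of the patch: the helper name
child_refcount_growth is made up for illustration, and it assumes the C
accelerator for xml.etree is in use. On a build where __getstate__ does not
leak, every entry of the result should be 0.

```python
import sys
from xml.etree import ElementTree as ET

# Same 3-child document as e2 in the sessions above.
SAMPLE_XML = (
    "<body><tag class='a'>text</tag><tag class='b' />"
    "<section><tag class='b' id='inner'>subtext</tag></section></body>"
)


def child_refcount_growth(elem, call, repeat=10):
    """Return how much each child's refcount grew after `repeat` calls.

    Hypothetical helper: `call` receives the element and its result is
    dropped right away, so any growth left over is a leaked reference.
    """
    children = list(elem)
    before = [sys.getrefcount(c) for c in children]
    for _ in range(repeat):
        call(elem)
    after = [sys.getrefcount(c) for c in children]
    return [a - b for a, b in zip(after, before)]


e2 = ET.XML(SAMPLE_XML)
growth = child_refcount_growth(e2, lambda el: el.__getstate__())
print(growth)
```

With the leak described above, each entry would grow by `repeat` instead.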

With that patch applied we go down to test_xml_etree_c leaked [6, 6] 
references, sum=12.
The remaining leaks seem to be in __setstate__.
Patched code:
>>> from xml.etree import ElementTree as ET; e = ET.Element('foo'); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e2 = ET.XML(SAMPLE_XML); SAMPLE_XML = "<body><tag class='a'>text</tag><tag class='b' /><tag class='b' /><tag class='b' /><section><tag class='b' id='inner'>subtext</tag></section></body>"; e3 = ET.XML(SAMPLE_XML)
[76773 refs]
>>> ### no more leaks for getstate:
>>> p = e.__getstate__()
[76787 refs]
>>> p = e.__getstate__()
[76787 refs]
>>> ### also no leaks when there are no children:
>>> e.__setstate__(p)
[76788 refs]
>>> e.__setstate__(p)
[76788 refs]
>>> ### no more leaks for getstate with children:
>>> p2 = e2.__getstate__()
[76807 refs]
>>> p2 = e2.__getstate__()
[76807 refs]
>>> ### one ref leaked for every child in __setstate__:
>>> e2.__setstate__(p2)
[76810 refs]
>>> e2.__setstate__(p2)
[76813 refs]
>>> e2.__setstate__(p2)
[76816 refs]
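
For anyone picking this up, the manual "[N refs]" comparison above could be
scripted along these lines. This is a sketch under assumptions, not a tested
tool: total_refs_delta is a hypothetical helper, sys.gettotalrefcount() only
exists on debug builds (the helper returns None elsewhere), and the loop
itself may add a little noise to the totals.

```python
import sys
from xml.etree import ElementTree as ET

SAMPLE_XML = (
    "<body><tag class='a'>text</tag><tag class='b' />"
    "<section><tag class='b' id='inner'>subtext</tag></section></body>"
)


def total_refs_delta(func, warmup=3, repeat=5):
    """Average per-call change in the total refcount, or None.

    Hypothetical helper: returns None when sys.gettotalrefcount is
    missing, i.e. when not running on a debug build.
    """
    gettotal = getattr(sys, "gettotalrefcount", None)
    if gettotal is None:
        return None
    for _ in range(warmup):
        func()  # let caches and free lists settle first
    start = gettotal()
    for _ in range(repeat):
        func()
    return (gettotal() - start) / repeat


e2 = ET.XML(SAMPLE_XML)
state = e2.__getstate__()
delta = total_refs_delta(lambda: e2.__setstate__(state))
print(delta)
```

With the leak shown in the session above, the reported value for e2 would be
about 3 per call (one per child); without it, about 0.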

I'm not working on this anymore now, so someone more familiar with the code can 
take a look, see if my patch is correct, and fix the remaining leaks.

* maybe I'm doing something wrong, but ISTM that ``make -j2`` doesn't always 
work right away, and sometimes I get different results if I do it again without 
touching the code.

----------
stage: committed/rejected -> needs patch
status: closed -> open

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16076>
_______________________________________