Hi folks,

Quick Synopsis: A test script demonstrates a memory leak when I use the pure-Python subclasses of my builtin types, but if I use the builtin types themselves there is no memory leak. If you are interested in how builtin/pure-Python inheritance interacts with the gc, maybe you can help me fix my memory leak.

Background: I'm creating an extension module (see [1]) which implements the LADSPA host interface (see [2]). A similar project appears to do the same using pyrex (see [3]), but I have not investigated it yet.

I've discovered a memory leak and I'm fairly certain where the culprit reference cycle is. The problem is that I am not sure how to instruct the gc to handle it, even after reading about supporting builtin containers (see [4]).

My question: How do I correctly implement a builtin type and its pure-Python subtype such that reference cycles are detected and collected?

Details: There are three relevant LADSPA types:

  - A "handle" is an audio processing module which operates on input and output streams.
  - A "descriptor" describes the interface to a handle, such as the types of its I/O streams and their names and descriptions. A handle is the instantiation of a particular descriptor.
  - A "plugin" is a container providing zero or more descriptors which can be dynamically loaded. (Typically it is a shared object library that the LADSPA host loads at runtime.)

The python-ladspa package is designed as follows: There is a low-level builtin extension module, called "_ladspa", and a high-level interface module, called "ladspa". For each of the three LADSPA types, there is an extension type in the "_ladspa" module, for example "_ladspa.Descriptor". For each extension type there is a high-level subtype in the "ladspa" module, so "ladspa.Handle" inherits from "_ladspa.Handle".

The reference structure of the "_ladspa" module is tree-like with no cycles. This is because a handle has a single reference to its descriptor, and a descriptor has a single reference to its plugin.
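
To make the acyclic case concrete, here is a rough pure-Python sketch of that reference structure. The classes are hypothetical stand-ins written only for illustration, not the real "_ladspa" types, and the attribute names, the "example.so" path and the helper function are assumptions. It relies on CPython's reference counting: with the cyclic collector switched off, dropping the only reference to a handle should still free the handle, its descriptor and its plugin.

```python
import gc
import weakref


class Plugin:
    """Hypothetical stand-in for a plugin; holds no references to descriptors."""

    def __init__(self, path):
        self.path = path


class Descriptor:
    """Hypothetical stand-in for a descriptor; holds one reference to its plugin."""

    def __init__(self, plugin, label):
        self.plugin = plugin
        self.label = label


class Handle:
    """Hypothetical stand-in for a handle; holds one reference to its descriptor."""

    def __init__(self, descriptor):
        self.descriptor = descriptor


def chain_freed_without_gc():
    """Return True if handle -> descriptor -> plugin is freed by refcounting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure the cyclic collector is not what frees the objects
    try:
        handle = Handle(Descriptor(Plugin("example.so"), "example"))
        # Weak references let us observe object lifetime without extending it.
        probes = [
            weakref.ref(handle),
            weakref.ref(handle.descriptor),
            weakref.ref(handle.descriptor.plugin),
        ]
        del handle  # drop the only strong reference to the top of the chain
        return all(probe() is None for probe in probes)
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(chain_freed_without_gc())
```

Because every reference points "up" the tree (handle to descriptor to plugin), nothing here needs the cyclic collector at all. That matches what I see with the low-level module.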
However, the high-level interface introduces a reference cycle, because this makes the interface more natural, IMO.

I've created a diagram which attempts to represent the inheritance relationships as well as the reference structure of this wrapper (see [5]). Let me know if you find it clarifies things or needs improvement.

There is a test script, "memleak.py", which tests either module by repeatedly instantiating and discarding references to handles (see [6]). When configured to use the "_ladspa" module, there appears to be no growth in memory usage, but when using the "ladspa" module, memory grows linearly with the number of iterations.

If I comment out the "Descriptors" list in the "ladspa.Plugin" class (which removes the reference cycle), then "memleak.py" runs with no apparent memory leak. The leak persists even if I implement traverse and clear methods for all three builtin types, unless I've done this incorrectly.

Of course, one solution is to do away with the reference cycle. After all, the "_ladspa" extension does not have the cycle and is usable. However, I care more about a user-friendly interface (which I believe the reference cycle provides), and also I'm just curious.

Thanks for any help,
Nejucomo

References:
[1] http://sourceforge.net/projects/python-ladspa/
[2] http://www.ladspa.org/
[3] http://sourceforge.net/projects/dsptools/
[4] http://www.python.org/doc/2.3.5/ext/node24.html
[5] http://python-ladspa.svn.sourceforge.net/viewvc/python-ladspa/doc/refgraph.png?revision=41&view=markup
[6] http://python-ladspa.svn.sourceforge.net/viewvc/python-ladspa/test/memleak.py?revision=37&view=markup