Thanks for all the replies! Following Chris's advice, I tried to reduce the code to the smallest reproducible example (I guess I should have done it sooner), and here's what I came up with:

```
#include <cstring>
#include <Python.h>
static int my_init(PyObject*, PyObject*, PyObject*) { return 0; }
static void my_dealloc(PyObject*) {}

static void init_mytype(PyObject* module) {
  PyTypeObject* type = new PyTypeObject();
  std::memset(type, 0, sizeof(PyTypeObject));
  Py_INCREF(type);
  type->tp_basicsize = static_cast<Py_ssize_t>(sizeof(PyObject));
  type->tp_itemsize = 0;
  type->tp_flags = Py_TPFLAGS_DEFAULT;
  type->tp_new = &PyType_GenericNew;
  type->tp_name = "mytype";
  type->tp_doc = "[temporary]";
  type->tp_init = my_init;
  type->tp_dealloc = my_dealloc;
  PyType_Ready(type);
  PyModule_AddObject(module, "mytype", reinterpret_cast<PyObject*>(type));
}
```

(My original `update` object had some fields in it, but it turns out they don't need to be present for the problem to manifest, so in this case I'm creating a custom type whose instances are just plain `PyObject`s.)

The `init_mytype()` function creates a custom type and attaches it to a module. After this, creating 100M instances of the object causes the process memory to swell to 1.5 GB:

```
for i in range(10**8):
    z = dt.mytype()
```

I know this is not normal, because if I instead use a builtin type such as `list`, or a Python-defined class such as `class A: pass`, the process stays at a steady RAM usage of about 6 MB. I've also tested this on a Linux platform (using the docker image quay.io/pypa/manylinux2010_x86_64), and the problem is present there too.

---

PS: The library I'm working on is open source, available at https://github.com/h2oai/datatable, but the code I posted above is completely independent of my library.
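PPS: for reference, the control experiment I mentioned above looks roughly like this (a minimal sketch of what I ran; the class name is arbitrary):

```
# Control: the same allocation loop, but with a pure-Python class instead of
# the C-defined type. With this (or with `z = []`), the process stays at a
# steady ~6 MB of RAM instead of growing.
class A:
    pass

for i in range(10**8):
    z = A()
```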
On Fri, Oct 23, 2020 at 10:44 AM Dieter Maurer <die...@handshake.de> wrote:

> Pasha Stetsenko wrote at 2020-10-22 17:51 -0700:
> > ...
> >I'm a maintainer of a python library "datatable" (can be installed from
> >PyPi), and i've been recently trying to debug a memory leak that occurs in
> >my library.
> >The program that exposes the leak is quite simple:
> >```
> >import datatable as dt
> >import gc  # just in case
> >
> >def leak(n=10**7):
> >    for i in range(n):
> >        z = dt.update()
> >
> >leak()
> >gc.collect()
> >input("Press enter")
> >```
> >Note that despite the name, the `dt.update` is actually a class, though it
> >is defined via Python C API. Thus, this script is expected to create and
> >then immediately destroy 10 million simple python objects.
> >The observed behavior, however, is that the script consumes more and more
> >memory, eventually ending up at about 500M. The amount of memory the
> >program ends up consuming is directly proportional to the parameter `n`.
> >
> >The `gc.get_objects()` does not show any extra objects however.
>
> For efficiency reasons, the garbage collector treats only
> objects from types which are known to be potentially involved in cycles.
> A type implemented in "C" must define `tp_traverse` (in its type
> structure) to indicate this possibility.
> `tp_traverse` also tells the garbage collector how to find referenced
> objects.
> You will never find an object in the result of `get_objects` the
> type of which does not define `tp_traverse`.
>
> > ...
> >Thus, the object didn't actually "leak" in the normal sense: its refcount
> >is 0 and it was reclaimed by the Python runtime (when i print a debug
> >message in tp_dealloc, i see that the destructor gets called every time).
> >Still, Python keeps requesting more and more memory from the system instead
> >of reusing the memory that was supposed to be freed.
>
> I would try to debug what happens further in `tp_dealloc` and its callers.
> You should eventually see a `PyMem_free` which gives the memory back
> to the Python memory management (built on top of the C memory management).
>
> Note that your `tp_dealloc` should not call the "C" library's "free".
> Python builds its own memory management (--> "PyMem_*") on top
> of the "C" library. It handles all "small" memory requests
> and, if necessary, requests big data chunks via `malloc` to split
> them into the smaller sizes.
> Should you "free" small memory blocks directly via "free", that memory
> becomes effectively unusable by Python (unless you have a special
> allocation as well).
> -- https://mail.python.org/mailman/listinfo/python-list
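A short follow-up on the `tp_dealloc` point above: the direction I plan to test (this is my assumption, not something I've verified against the real extension yet) is to have the deallocator hand the instance memory back through the type's `tp_free` slot, rather than doing nothing as in the reduced example, or calling the C library's `free()`:

```
// Sketch of a drop-in replacement for my_dealloc() in the repro above
// (untested): release the memory that tp_alloc / PyType_GenericNew allocated
// by going through the type's tp_free slot, which routes back into Python's
// own allocator.
static void my_dealloc(PyObject* self) {
  // No fields or owned references to clear in this reduced example.
  Py_TYPE(self)->tp_free(self);
}
```

As written, the reduced example's `my_dealloc` releases nothing at all, so I also want to rule out that this alone accounts for the growth.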