Thanks for all the replies!
Following Chris's advice, I reduced the code to the smallest reproducible
example (I guess I should have done it sooner). Here's what I came up with:

  #include <Python.h>   // included before any standard headers
  #include <cstring>

  // tp_init that accepts any arguments and does nothing.
  static int my_init(PyObject*, PyObject*, PyObject*) { return 0; }
  // tp_dealloc with an empty body.
  static void my_dealloc(PyObject*) {}

  static void init_mytype(PyObject* module) {
    PyTypeObject* type = new PyTypeObject();
    std::memset(type, 0, sizeof(PyTypeObject));

    // Instances carry no fields beyond the PyObject header.
    type->tp_basicsize = static_cast<Py_ssize_t>(sizeof(PyObject));
    type->tp_itemsize = 0;
    type->tp_flags = Py_TPFLAGS_DEFAULT;
    type->tp_new   = &PyType_GenericNew;
    type->tp_name  = "mytype";
    type->tp_doc   = "[temporary]";
    type->tp_init  = my_init;
    type->tp_dealloc = my_dealloc;
    PyType_Ready(type);  // finalize the type before exposing it
    PyModule_AddObject(module, "mytype", reinterpret_cast<PyObject*>(type));
  }

(My original `update` object had some fields in it, but it turns out they
don't need to be present for the problem to manifest. So in this case I'm
creating a custom object that is laid out the same as a basic PyObject.)

The `init_mytype()` function creates a custom type and attaches it to a
module. After this, creating 100M instances of the object causes the
process memory to swell to 1.5G:

  for i in range(10**8):
      z = dt.mytype()

I know this is not normal because if I instead use a builtin type such as
`list`, or a Python-defined class such as `class A: pass`, then the
process remains at a steady RAM usage of about 6MB.
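
For anyone who wants to reproduce the comparison, here is a minimal sketch of
how the measurement could be scripted. It only exercises the pure-Python
baseline `class A`; the helper names `peak_rss_mb` and `churn` are made up for
this illustration, and the idea is that `dt.mytype` would be passed to `churn`
in place of `A` to see the growth. It relies on the `resource` module, so it
assumes a Unix-like platform.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


class A:
    pass


def churn(factory, n):
    """Create n instances of factory, dropping each one immediately."""
    for _ in range(n):
        factory()


before = peak_rss_mb()
churn(A, 10**6)
after = peak_rss_mb()
print(f"peak RSS grew by {after - before:.1f} MB")
```

With a well-behaved type the reported growth should stay close to zero no
matter how large `n` is, whereas a type that never returns its memory should
show growth roughly proportional to `n`.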

I've tested this on a Linux platform as well (using a docker image), and the problem is present there as well.

PS: The library I'm working on is open source, but the code I posted above is
completely independent of my library.

On Fri, Oct 23, 2020 at 10:44 AM Dieter Maurer wrote:

> Pasha Stetsenko wrote at 2020-10-22 17:51 -0700:
> > ...
> >I'm a maintainer of a python library "datatable" (can be installed from
> >PyPi), and i've been recently trying to debug a memory leak that occurs in
> >my library.
> >The program that exposes the leak is quite simple:
> >```
> >import datatable as dt
> >import gc  # just in case
> >
> >def leak(n=10**7):
> >    for i in range(n):
> >        z = dt.update()
> >
> >leak()
> >gc.collect()
> >input("Press enter")
> >```
> >Note that despite the name, the `dt.update` is actually a class, though it
> >is defined via Python C API. Thus, this script is expected to create and
> >then immediately destroy 10 million simple python objects.
> >The observed behavior, however, is that the script consumes more and more
> >memory, eventually ending up at about 500M. The amount of memory the
> >program ends up consuming is directly proportional to the parameter `n`.
> >
> >The `gc.get_objects()` does not show any extra objects however.
> For efficiency reasons, the garbage collector treats only
> objects from types which are known to be potentially involved in cycles.
> A type implemented in "C" must define `tp_traverse` (in its type
> structure) to indicate this possibility.
> `tp_traverse` also tells the garbage collector how to find referenced
> objects.
> You will never find an object in the result of `get_objects` the
> type of which does not define `tp_traverse`.
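
The tracking rule described above can be observed from Python with
`gc.is_tracked()`. This is a small sketch using only built-in types and a
throwaway class `Node` (a name made up for this example); a C type that does
not define `tp_traverse` would be expected to behave like the `int` and `str`
cases here.

```python
import gc


class Node:
    """Plain Python class; its instances can hold references to other objects."""


# Atomic objects cannot take part in reference cycles, so the collector
# does not track them.
print(gc.is_tracked(42))       # False
print(gc.is_tracked("text"))   # False

# Containers and instances of Python-defined classes are tracked.
print(gc.is_tracked([]))       # True
print(gc.is_tracked(Node()))   # True
```

So an object missing from `gc.get_objects()` says nothing about whether its
memory was released; it may simply never have been tracked.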
> > ...
> >Thus, the object didn't actually "leak" in the normal sense: its refcount
> >is 0 and it was reclaimed by the Python runtime (when i print a debug
> >message in tp_dealloc, i see that the destructor gets called every time).
> >Still, Python keeps requesting more and more memory from the system instead
> >of reusing the memory that was supposed to be freed.
> I would try to debug what happens further in `tp_dealloc` and its callers.
> You should eventually see a `PyMem_free` which gives the memory back
> to the Python memory management (built on top of the C memory management).
> Note that your `tp_dealloc` should not call the "C" library's "free".
> Python builds its own memory management (--> "PyMem_*") on top
> of the "C" library. It handles all "small" memory requests
> and, if necessary, requests big data chunks via `malloc` to split
> them into the smaller sizes.
> Should you "free" small memory blocks directly via "free", that memory
> becomes effectively unusable by Python (unless you have a special
> allocation as well).
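
One way to check from the Python side whether blocks ever make it back to
Python's own allocator is `tracemalloc`, which traces memory blocks allocated
through Python's memory management. Below is a minimal sketch; the helper name
`net_allocated_bytes` is made up for this illustration, and a plain Python
class `A` stands in for the type under test. My assumption is that passing a C
type whose instances are never handed back to the allocator would make the
result grow with `n`.

```python
import tracemalloc


def net_allocated_bytes(factory, n):
    """Bytes still held via Python's allocators after creating and dropping n objects."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(n):
        factory()  # the instance is dropped immediately
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


class A:
    pass


print(net_allocated_bytes(A, 100_000))
```

For a type that releases its instances properly the number should stay small
and not depend on `n`.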

