New submission from Eric Snow <ericsnowcurren...@gmail.com>:

An object is created relative to the current thread (and therefore its interpreter) and the current allocator (part of global state). We do not track either of these details for the object. It may make sense to start doing so, for the reasons below.
Regarding tracking the interpreter, the originating interpreter can be thought of as the owner. Any lifecycle operations should happen relative to that interpreter. Furthermore, the object should be used in C-API calls only in that interpreter (i.e. when the current thread's PyThreadState belongs to that interpreter). This hasn't been an issue, both because all interpreters in the process currently share the GIL and because subinterpreters haven't been heavily used historically. However, the possibility of no longer sharing the GIL suggests that tracking the owning interpreter (and perhaps even other "sharing" interpreters) would be important. Furthermore, in the last few years subinterpreters have seen increasing usage (see OpenStack Ceph), and knowing the originating interpreter for an object can be useful there. Regardless, even in the single-interpreter case, knowing the owning interpreter is important during runtime finalization (which is currently slightly broken), which impacts CPython embedders.

Regarding the allocator, there used to be just a single global one that the runtime used from start to finish. Now the C-API offers a way to switch the allocator, so there's no guarantee that the right allocator is used in PyMem_Free(). This has already had a negative impact on efforts to clean up CPython's runtime initialization. It also results in problems during finalization. Additionally, we are looking into moving the allocator from the global runtime state to the per-interpreter (or even per-thread or per-context) state. In that world it would be essential to know which allocator was used when creating the object. There are other possible applications based on knowing an object's allocator, but I'll stop there.

To sort all this out we would need to track per-object:

* originating allocator (pointer or id)
* owning interpreter (pointer or id)
* (possibly) "sharing" interpreters (linked list?)
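To make the three per-object items concrete, here is a minimal Python model of the bookkeeping. It is only a sketch of the idea, not CPython code: the names Allocator, Ownership and Tracker are hypothetical, interpreters are represented by plain integer ids, and the records live in a dictionary keyed by id(obj) purely for illustration. Letting "sharing" interpreters use the object is also an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Allocator:
    """Stand-in for one memory allocator (hypothetical, not a CPython type)."""
    name: str


@dataclass
class Ownership:
    """The three per-object items from the list above."""
    allocator: Allocator                          # originating allocator
    owner: int                                    # owning interpreter id
    sharing: list = field(default_factory=list)   # "sharing" interpreter ids


class Tracker:
    """Keeps one Ownership record per tracked object, keyed by id(obj)."""

    def __init__(self):
        self._records = {}

    def on_create(self, obj, interp_id, allocator):
        # Capture the interpreter and allocator in effect at creation time.
        self._records[id(obj)] = Ownership(allocator, interp_id)

    def share(self, obj, interp_id):
        self._records[id(obj)].sharing.append(interp_id)

    def may_use(self, obj, current_interp_id):
        # Assumption: use is allowed in the owner or in a sharing interpreter.
        rec = self._records[id(obj)]
        return current_interp_id == rec.owner or current_interp_id in rec.sharing

    def on_free(self, obj, current_interp_id, current_allocator):
        # Lifecycle operations belong to the owner, and the memory must go
        # back to the allocator that produced it.
        rec = self._records[id(obj)]
        if current_interp_id != rec.owner:
            raise RuntimeError("object freed outside its owning interpreter")
        if current_allocator != rec.allocator:
            raise RuntimeError("object freed with a different allocator")
        del self._records[id(obj)]
```

In this model, switching the allocator between creation and destruction shows up as an explicit error in on_free() rather than as silent misuse, which is the kind of check that knowing the originating allocator would make possible.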
Either we'd add 2 pointer-size fields to PyObject or we would keep a separate hash table (or two) pointing from each object to the info (similar to how we've considered doing for refcounts). To alleviate impact on the common case (not embedded, single interpreter, same allocator), we could default to not tracking interpreter/allocator and take a lookup failure to mean "main interpreter, default allocator".

----------
messages: 317330
nosy: eric.snow, ncoghlan, vstinner
priority: normal
severity: normal
status: open
title: Explicitly track object ownership (and allocator).
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33607>
_______________________________________