New submission from STINNER Victor <vstin...@redhat.com>:

When the PYTHONMALLOC=debug environment variable or the -X dev command line option is used, Python installs debug hooks on memory allocators which add 2 size_t before and 2 size_t after each memory block: that adds 32 bytes to every memory allocation (with an 8-byte size_t).

I have been debugging crashes and memory leaks in CPython for 10 years, and I have never had to use "serialno". So I propose the attached pull request to remove it and reduce the memory footprint: I measured a reduction of around 5% (ex: 1.2 MiB out of 33.0 MiB when running test_asyncio). A smaller memory footprint makes it possible to use this feature on devices with little memory, like embedded devices.

The change also fixes a race condition in the debug memory allocators: bpo-31473, "Debug hooks on memory allocators are not thread safe (serialno variable)".

Using tracemalloc, it is already possible (since Python 3.6) to find where a memory block has been allocated, and so to decide where to put a breakpoint when debugging.

If someone cares about the "serialno" field, maybe we can keep the code behind a compilation flag, like a C #define.

"serialno" is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out." But again, I never used it...

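As a rough illustration of the tracemalloc approach mentioned above, here is a minimal sketch that asks tracemalloc where an object was allocated. It is not taken from the pull request: allocate_block() is a hypothetical helper that exists only for this example, and the limit of 10 frames simply mirrors the -X tracemalloc=10 option.

```python
import tracemalloc


def allocate_block():
    # Hypothetical helper: the bytearray object created here is the
    # allocation we want to locate afterwards.
    return bytearray(1024)


# Keep up to 10 frames per allocation, similar to -X tracemalloc=10.
tracemalloc.start(10)

block = allocate_block()

# Look up the traceback recorded when the object was allocated.
# Returns None if the allocation was not traced.
allocation_tb = tracemalloc.get_object_traceback(block)

tracemalloc.stop()

if allocation_tb is not None:
    print("Memory block allocated at:")
    for line in allocation_tb.format():
        print(line)
else:
    print("No traceback recorded for this object")
```

The reported file and line give a place to put a breakpoint on the next run, which is the use case "serialno" was documented for.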
--

Some examples of the *peak* memory usage without => with the change:

* -c pass: 2437.1 kB => 2321.8 kB (-115.3 kiB, -5%)
* -m test test_os test_sys: 14252.3 kB => 13598.6 kB (-653.7 kiB, -5%)
* -m test test_asyncio: 34194.2 kB => 32963.1 kB (-1231.1 kiB, -4%)

Command used to measure the memory consumption:

$ ./python -i -X tracemalloc -c pass
>>> import tracemalloc; print("%.1f kB" % (tracemalloc.get_traced_memory()[1] / 1024.))

With the patch:

diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c
index c5d5671032..e010c2ef84 100644
--- a/Modules/_tracemalloc.c
+++ b/Modules/_tracemalloc.c
@@ -582,6 +582,8 @@ tracemalloc_add_trace(unsigned int domain, uintptr_t ptr,
     _Py_hashtable_entry_t* entry;
     int res;
 
+    size += 4 * sizeof(size_t);
+
     assert(_Py_tracemalloc_config.tracing);
     traceback = traceback_new();

Replace 4 with 3 to measure memory used with the change.

--

Since Python 3.6, when the debug memory allocator detects a bug (ex: buffer overflow), it also displays the Python traceback where the memory block was allocated, if tracemalloc is tracing Python memory allocations.

Example with bug.py:

---
import _testcapi

def func():
    _testcapi.pymem_buffer_overflow()

def main():
    func()

if __name__ == "__main__":
    main()
---

Output:

---
$ ./python -X tracemalloc=10 -X dev bug.py
Debug memory block at address p=0x7f45e85c3270: API 'm'
    16 bytes originally requested
    The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected.
    The 8 pad bytes at tail=0x7f45e85c3280 are not all FORBIDDENBYTE (0xfd):
        at tail+0: 0x78 *** OUCH
        at tail+1: 0xfd
        at tail+2: 0xfd
        at tail+3: 0xfd
        at tail+4: 0xfd
        at tail+5: 0xfd
        at tail+6: 0xfd
        at tail+7: 0xfd
    Data at p: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd

Memory block allocated at (most recent call first):
  File "bug.py", line 4
  File "bug.py", line 7
  File "bug.py", line 10

Fatal Python error: bad trailing pad byte

Current thread 0x00007f45f5660740 (most recent call first):
  File "bug.py", line 4 in func
  File "bug.py", line 7 in main
  File "bug.py", line 10 in <module>
Aborted (core dumped)
---

The interesting part is "Memory block allocated at (most recent call first):". Traceback reconstructed manually:

---
Memory block allocated at (most recent call first):
  File "bug.py", line 4
    _testcapi.pymem_buffer_overflow()
  File "bug.py", line 7
    func()
  File "bug.py", line 10
    main()
---

You can see exactly where the memory block has been allocated.

Note: Internally, the _PyTraceMalloc_GetTraceback() function is used to get the traceback where a memory block has been allocated.

--

Extract of _PyMem_DebugRawAlloc() in Objects/obmalloc.c:

/* Let S = sizeof(size_t).  The debug malloc asks for 4*S extra bytes and
   fills them with useful stuff, here calling the underlying malloc's result p:

p[0: S]
    Number of bytes originally asked for.  This is a size_t, big-endian
    (easier to read in a memory dump).
p[S]
    API ID.  See PEP 445.  This is a character, but seems undocumented.
p[S+1: 2*S]
    Copies of FORBIDDENBYTE.  Used to catch under- writes and reads.
p[2*S: 2*S+n]
    The requested memory, filled with copies of CLEANBYTE.
    Used to catch reference to uninitialized memory.
    &p[2*S] is returned.  Note that this is 8-byte aligned if pymalloc
    handled the request itself.
p[2*S+n: 2*S+n+S]
    Copies of FORBIDDENBYTE.  Used to catch over- writes and reads.
p[2*S+n+S: 2*S+n+2*S]
    A serial number, incremented by 1 on each call to _PyMem_DebugMalloc
    and _PyMem_DebugRealloc.
    This is a big-endian size_t.
    If "bad memory" is detected later, the serial number gives an
    excellent way to set a breakpoint on the next run, to capture the
    instant at which this block was passed out.
*/

/* Layout: [SSSS IFFF CCCC...CCCC FFFF NNNN]
 *          ^--- p    ^--- data   ^--- tail
   S: nbytes stored as size_t
   I: API identifier (1 byte)
   F: Forbidden bytes (size_t - 1 bytes before, size_t bytes after)
   C: Clean bytes used later to store actual data
   N: Serial number stored as size_t */

The last size_t written at the end of each memory block is "serialno". It is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out."

----------
components: Interpreter Core
messages: 340019
nosy: vstinner
priority: normal
severity: normal
status: open
title: Debug memory allocators: remove useless "serialno" field to reduce memory footprint
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36611>
_______________________________________
_______________________________________________
Python-bugs-list mailing list