Some snippets (more line breaks for clarity) from https://sourceware.org/bugzilla/show_bug.cgi?id=30945
"""
<...>
The deserialization library we used works 10x slower than expected. Investigations show that this is due to the arena_get2 function uses __get_nprocs_sched instead of __get_nprocs.

Without changing core affinity settings, this call returns the real number of cores so the upper limit of total arenas is set correctly. However, if a thread is pinned to a core, further malloc calls only sees n = 1 because the function returns only schedulable cores. Therefore, the maximum number of arenas will be 8 on 64-bit platforms.
<...>
./a.out 32 false false
---
nr_cpu: 32 pin: no fix: no
thread average (ms): 16.233663

./a.out 32 true false
---
nr_cpu: 32 pin: yes fix: no
thread average (ms): 1360.919047
<...>
Fixed on 2.39.
<...>
I backported it to 2.34, 2.35, 2.36, 2.37, and 2.38.
"""

** Bug watch added: Sourceware.org Bugzilla #30945
   https://sourceware.org/bugzilla/show_bug.cgi?id=30945

-- 
https://bugs.launchpad.net/bugs/2089789

Title:
  malloc performance degradation with CPU affinity masks
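
To make the mechanism in the quoted report easier to check locally, here is a minimal sketch in C (Linux/glibc only). It is not the reproducer from the Bugzilla entry. The helper names (online_cpus, schedulable_cpus, pin_to_first_allowed_cpu, arena_limit_from_cores, report_arena_limit) are made up for illustration. The multiplier of 8 on 64-bit comes from the report; the value 2 for 32-bit is my assumption about glibc's default and may not match every version.

```c
#define _GNU_SOURCE            /* needed for sched_getaffinity and CPU_COUNT */
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* CPUs currently online as reported by sysconf, whatever the affinity mask. */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* CPUs the calling thread may be scheduled on, or -1 on error. */
int schedulable_cpus(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Pin the calling thread to the first CPU in its current affinity mask.
   Returns 0 on success, -1 on failure. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t set, one;
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set)) {
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof one, &one);
        }
    }
    return -1;
}

/* Hypothetical helper: arena cap derived from a core count.
   8 per core on 64-bit is from the report; 2 on 32-bit is an assumption. */
long arena_limit_from_cores(long ncores)
{
    assert(ncores > 0);
    return ncores * (sizeof(long) == 4 ? 2 : 8);
}

/* Print the cap each of the two core counts would produce. */
void report_arena_limit(const char *label)
{
    long online = online_cpus();
    int sched = schedulable_cpus();
    printf("%s: online=%ld schedulable=%d\n", label, online, sched);
    if (online > 0)
        printf("  cap from online cores:      %ld\n", arena_limit_from_cores(online));
    if (sched > 0)
        printf("  cap from schedulable cores: %ld\n", arena_limit_from_cores(sched));
}
```

Calling report_arena_limit("unpinned"), then pin_to_first_allowed_cpu(), then report_arena_limit("pinned") from a main function should show the schedulable count dropping to 1 after pinning. With 32 cores, that is the gap the report describes: a cap of 8 arenas instead of 256 when the limit is computed from schedulable cores.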