* Jonathon Anderson:

> On Sun, Oct 27, 2019 at 09:59, Florian Weimer <f...@deneb.enyo.de> wrote:
>> * Mark Wielaard:
>>
>>>> Current glibc versions have a thread-local fast path, which should
>>>> address some of these concerns. It's still not a bump-pointer
>>>> allocator, but at least there are no atomics on that path.
>>>
>>> Since which version of glibc is there a thread-local fast path?
>>
>> It was added in:
>>
>> commit d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc
>> Author: DJ Delorie <d...@delorie.com>
>> Date: Thu Jul 6 13:37:30 2017 -0400
>>
>> Add per-thread cache to malloc
>>
>> So glibc 2.26. But it is a build-time option, enabled by default, but
>> it can be switched off by distributions.
>
> I doubt any non-mobile distros would disable it, the cost seems fairly
> small.

It increases fragmentation. VMware's Photon distribution disables it.

> My main concern is that it seems like chunks will only enter the
> thread-local cache in the presence of free()s (since they have to enter
> the "smallbins" or "fastbins" first, and those two at a glance seem to
> be filled very lazily or on free()); since the free()s are all on
> dwarf_end this would pose an issue. I could also be entirely mistaken,
> glibc is by no means a simple piece of code.

No, there is a prefill step if the cache is empty, where the cache is
populated with one arena allocation which is then split up.

> According to the comments, there might also be a 16 byte overhead per
> allocation, which would explode the small allocations considerably.

Available allocatable sizes in bytes are congruent to 8 modulo 16, and
the smallest allocatable size is 24. In general, the overhead is 8
bytes. (All numbers are for 64-bit architectures.)
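
To make the size rule concrete, here is a small arithmetic sketch of it. This is only a model of the numbers quoted above for 64-bit architectures, not glibc code; the names `usable_size` and `chunk_size` are made up for illustration, and it assumes a request is simply rounded up to the next allowed size.

```python
# Model of the size rule stated above (64-bit only, not glibc code):
# usable sizes are congruent to 8 modulo 16, the smallest is 24,
# and the per-allocation overhead is 8 bytes.
MIN_USABLE = 24   # smallest allocatable size in bytes
OVERHEAD = 8      # per-allocation overhead in bytes
ALIGNMENT = 16    # usable size plus overhead is a multiple of this


def usable_size(request: int) -> int:
    """Smallest allowed usable size that can hold `request` bytes."""
    if request <= MIN_USABLE:
        return MIN_USABLE
    # Round (request + overhead) up to a multiple of 16, then subtract
    # the overhead again; the result is always congruent to 8 modulo 16.
    total = (request + OVERHEAD + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT
    return total - OVERHEAD


def chunk_size(request: int) -> int:
    """Memory consumed by one allocation under this model."""
    return usable_size(request) + OVERHEAD


if __name__ == "__main__":
    for request in (1, 24, 25, 40, 41, 100):
        print(request, usable_size(request), chunk_size(request))
```

Under this model a 1-byte request still consumes 32 bytes, while a 40-byte request consumes 48, so the relative cost is highest for the very small allocations.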