On Sunday, 28 January 2024, 22:57:02 CET, Tomas Vondra wrote:

Hi Tomas!

I'll comment on the glibc malloc part, as I studied it last year and 
proposed some things here:
https://www.postgresql.org/message-id/3424675.QJadu78ljV%40aivenlaptop


> FWIW where does the malloc overhead come from? For one, while we do have
> some caching of malloc-ed memory in memory contexts, that doesn't quite
> work cross-query, because we destroy the contexts at the end of the
> query. We attempt to cache the memory contexts too, but in this case
> that can't help because the allocations come from btbeginscan() where we
> do this:
> 
>     so = (BTScanOpaque) palloc(sizeof(BTScanOpaqueData));
> 
> and BTScanOpaqueData is ~27kB, which means it's an oversized chunk and
> thus always allocated using a separate malloc() call. Maybe we could
> break it into smaller/cacheable parts, but I haven't tried, and I doubt
> it's the only such allocation.

Did you try running strace on the process? That may give you some 
insight into what malloc is doing. A more sophisticated approach would be 
to use stap (SystemTap) and hook into the malloc probes, for example 
memory_sbrk_more and memory_sbrk_less.

An important part of glibc's malloc behaviour in that regard comes from the 
adjustment of the mmap and trim thresholds. By default, glibc adjusts them 
dynamically, and you can observe that using the 
memory_mallopt_free_dyn_thresholds probe.

> 
> FWIW I was wondering if this is a glibc-specific malloc bottleneck, so I
> tried running the benchmarks with LD_PRELOAD=jemalloc, and that improves
> the behavior a lot - it gets us maybe ~80% of the mempool benefits.
> Which is nice, it confirms it's glibc-specific (I wonder if there's a
> way to tweak glibc to address this), and it also means systems using
> jemalloc (e.g. FreeBSD, right?) don't have this problem. But it also
> says the mempool has ~20% benefit on top of jemalloc.

glibc's malloc offers some tuning for this. In particular, setting either 
M_MMAP_THRESHOLD or M_TRIM_THRESHOLD will disable the unpredictable "auto 
adjustment" behaviour and allow you to control what it's doing.
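
To make that concrete, here is a minimal sketch of pinning both thresholds 
from C with mallopt(). The helper name pin_malloc_thresholds and the values 
in the usage note are my own illustration, not something taken from the patch 
or from PostgreSQL, and this only applies to glibc.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD, M_TRIM_THRESHOLD */
#include <stdlib.h>

/*
 * Hypothetical helper (not part of PostgreSQL or glibc): set fixed mmap and
 * trim thresholds. Setting either one explicitly turns off glibc's dynamic
 * adjustment of both.
 *
 * Returns 1 if both mallopt() calls succeeded, 0 otherwise.
 */
static int
pin_malloc_thresholds(int mmap_bytes, int trim_bytes)
{
    assert(mmap_bytes > 0 && trim_bytes > 0);

    /* Requests of at least mmap_bytes are served by mmap() instead of the heap. */
    if (mallopt(M_MMAP_THRESHOLD, mmap_bytes) != 1)
        return 0;

    /* The heap top is only given back to the OS once it exceeds trim_bytes. */
    if (mallopt(M_TRIM_THRESHOLD, trim_bytes) != 1)
        return 0;

    return 1;
}
```

Calling something like pin_malloc_thresholds(1024 * 1024, 64 * 1024 * 1024) 
early in the process would keep a ~27kB allocation on the heap and make 
trimming much less eager; the right values depend on the workload. For a 
quick experiment without recompiling, the same knobs should also be reachable 
through the MALLOC_MMAP_THRESHOLD_ and MALLOC_TRIM_THRESHOLD_ environment 
variables described in mallopt(3).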

By setting a bigger M_TRIM_THRESHOLD, one can make sure memory allocated using 
sbrk isn't returned to the OS as easily, so you don't run into a pattern of 
moving the sbrk pointer up and down repeatedly. The automatic trade-off 
between the mmap and trim thresholds is supposed to prevent that, but the way 
it is incremented means you can end up in a bad place depending on your 
particular allocation pattern.
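
If you want to see that up-and-down pattern without strace, a rough sketch is 
to watch the program break around malloc/free cycles. The function 
count_break_moves below is a hypothetical helper of mine; the number it 
returns depends on the glibc version, the current thresholds and the state of 
the heap, so treat it as a diagnostic toy rather than a benchmark.

```c
#define _DEFAULT_SOURCE   /* for sbrk() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical diagnostic: allocate and free one chunk of chunk_bytes,
 * `cycles` times, and count how often the program break (as reported by
 * sbrk(0)) changed. Returns -1 if malloc() fails.
 */
static int
count_break_moves(size_t chunk_bytes, int cycles)
{
    int   moves = 0;
    void *last = sbrk(0);

    assert(cycles >= 0);

    for (int i = 0; i < cycles; i++)
    {
        void *p = malloc(chunk_bytes);
        void *after_alloc;
        void *after_free;

        if (p == NULL)
            return -1;

        /* Touch the memory so the compiler cannot drop the allocation. */
        ((volatile char *) p)[0] = 1;

        after_alloc = sbrk(0);
        if (after_alloc != last)
            moves++;            /* the heap grew to satisfy the request */

        free(p);

        after_free = sbrk(0);
        if (after_free != after_alloc)
            moves++;            /* the heap was trimmed on free */

        last = after_free;
    }

    return moves;
}
```

Running count_break_moves(27 * 1024, 1000) once with default settings and 
once after raising M_TRIM_THRESHOLD should show whether the break keeps 
moving for a chunk the size of BTScanOpaqueData.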

Best regards,

--
Ronan Dunklau




