On Tue, Dec 3, 2024 at 9:25 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> On 20/11/2024 15:33, John Naylor wrote:
> I did find one weird thing that makes a big difference: I originally
> used AllocSetContextCreate(..., ALLOCSET_DEFAULT_SIZES) for the radix
> tree's memory context. With that, XidInMVCCSnapshot() takes about 19% of
> the CPU time in that test. When I changed that to ALLOCSET_SMALL_SIZES,
> it falls down to the 4% figure. And weird enough, in both cases the time
> seems to be spent in the malloc() call from SlabContextCreate(), not
> AllocSetContextCreate(). I think doing this particular mix of large and
> small allocations with malloc() somehow poisons its free list or
> something. So this is probably heavily dependent on the malloc()
> implementation. In any case, ALLOCSET_SMALL_SIZES is clearly a better
> choice here, even without that effect.

Hmm, interesting. That passed context is needed for 4 things:

1. allocated values (not used here for 64-bit, and 32-bit could be made
to work the same way)
2. iteration state (not used here)
3. a convenient place to put slab child contexts so we can free them easily
4. a place to put the "control object" -- this is really only needed
for shared memory and I have a personal todo to embed it rather than
allocate it for the local memory case.

Removing the need for a passed context for callers that don't need it
is additional possible future work.

Anyway, 0005 looks good to me.

--
John Naylor
Amazon Web Services