Re: PG15 beta1 sort performance regression due to Generation context change

Andres Freund Tue, 24 May 2022 17:39:34 -0700

Hi,

On 2022-05-24 12:01:58 -0400, Tom Lane wrote:
> David Rowley <[email protected]> writes:
> > On Tue, 24 May 2022 at 09:36, Tom Lane <[email protected]> wrote:
> >> I think probably that could be made to work, since a Generation
> >> context should not contain all that many live blocks at any one time.
> 
> > I've done a rough cut implementation of this and attached it here.
> > I've not done that much testing yet, but it does seem to fix the
> > performance regression that I mentioned in the blog post that I linked
> > in the initial post on this thread.
> 
> Here's a draft patch for the other way of doing it.  I'd first tried
> to make the side-effects completely local to generation.c, but that
> fails in view of code like
> 
>       MemoryContextAlloc(GetMemoryChunkContext(x), ...)
> 
> Thus we pretty much have to have some explicit awareness of this scheme
> in GetMemoryChunkContext().  There's more than one way it could be
> done, but I thought a clean way is to invent a separate NodeTag type
> to identify the indirection case.


That's interesting - I actually needed something vaguely similar recently. For
direct IO support we need to allocate memory with pagesize alignment
(otherwise DMA doesn't work). Several places allocating such buffers also
pfree them.

The easiest way I could see to deal with that was to invent a different memory
context node type that handles allocation / freeing by over-allocating
sufficiently to ensure alignment, backed by an underlying memory context.



A variation on your patch would be to only store the offset to the block
header - that should always fit into 32bit (huge allocations being their own
block, which is why this wouldn't work for storing an offset to the
context). With a bit of care that'd allow aset.c to half it's overhead, by
using 4 bytes of space for all non-huge allocations.  Of course, it'd increase
the cost of pfree() of small allocations, because AllocSetFree() currently
doesn't need to access the block for those. But I'd guess that'd be outweighed
by the reduced memory usage.

Sorry for the somewhat off-topic musing - I was trying to see if the
MemoryContextLink approach could be generalized or has disadvantages aside
from the branch in GetMemoryChunkContext().



> For back-patching into v14, we could put the new NodeTag type at the
> end of that enum list.  The change in the inline GetMemoryChunkContext
> is probably an acceptable hazard.

Why would we backpatch this to 14? I don't think we have any regressions
there?

Greetings,

Andres Freund

Re: PG15 beta1 sort performance regression due to Generation context change

Reply via email to