On Tue, 24 May 2022 at 09:36, Tom Lane <t...@sss.pgh.pa.us> wrote: > > David Rowley <dgrowle...@gmail.com> writes: > > Handy wavy idea: It's probably too complex for now, and it also might > > be too much overhead, but having GenerationPointerGetChunk() do a > > binary search on a sorted-by-memory-address array of block pointers > > might be a fast enough way to find the block that the pointer belongs > > to. There should be far fewer blocks now since generation.c now grows > > the block sizes. N in O(log2 N) the search complexity should never be > > excessively high. > > > However, it would mean a binary search for every pfree, which feels > > pretty horrible. My feeling is that it seems unlikely that saving 8 > > bytes by not storing the GenerationBlock would be a net win here. > > I think probably that could be made to work, since a Generation > context should not contain all that many live blocks at any one time.
I've done a rough cut implementation of this and attached it here. I've not done that much testing yet, but it does seem to fix the performance regression that I mentioned in the blog post that I linked in the initial post on this thread. There are a couple of things to note about the patch: 1. I quickly realised that there's no good place to store the sorted-by-memory-address array of GenerationBlocks. In the patch, I've had to malloc() this array and also had to use a special case so that I didn't try to do another malloc() inside GenerationContextCreate(). It's possible that the malloc() / repalloc of this array fails. In which case, I think I've coded things in such a way that there will be no memory leaks of the newly added block. 2. I did see GenerationFree() pop up in perf top a little more than it used to. I considered that we might want to have GenerationGetBlockFromChunk() cache the last found block for the set and then check that one first. We expect generation contexts to pfree() in an order that would likely make us hit this case most of the time. I added a few lines to the attached v2 patch to add a last_pfree_block field to the context struct. However, this seems to hinder performance more than it helps. It can easily be removed from the v2 patch. In the results you can see the PG14 + PG15 results the same as I reported on the blog post I linked earlier. It seems that for PG15_patched virtually all work_mem sizes produce results that are faster than PG14. The exception here is the 16GB test where PG15_patched is 0.8% slower, which seems within the noise threshold. David
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c index e530e272e0..b11c0d77a5 100644 --- a/src/backend/utils/mmgr/generation.c +++ b/src/backend/utils/mmgr/generation.c @@ -71,7 +71,11 @@ typedef struct GenerationContext GenerationBlock *freeblock; /* pointer to a block that's being recycled, * or NULL if there's no such block. */ GenerationBlock *keeper; /* keep this block over resets */ - dlist_head blocks; /* list of blocks */ + uint32 nblocks; /* Number of items in 'blocks' array */ + uint32 blocks_size; /* Allocation size of 'blocks' array */ + GenerationBlock **blocks; /* Array of blocks sorted by memory address */ + GenerationBlock *last_pfree_block; /* Pointer to last block pfree was + * called on */ } GenerationContext; /* @@ -88,7 +92,6 @@ typedef struct GenerationContext */ struct GenerationBlock { - dlist_node node; /* doubly-linked list of blocks */ Size blksize; /* allocated size of this block */ int nchunks; /* number of chunks in the block */ int nfree; /* number of free chunks */ @@ -127,7 +130,6 @@ struct GenerationChunk char padding[MAXIMUM_ALIGNOF - GENERATIONCHUNK_RAWSIZE % MAXIMUM_ALIGNOF]; #endif - GenerationBlock *block; /* block owning this chunk */ GenerationContext *context; /* owning context, or NULL if freed chunk */ /* there must not be any padding to reach a MAXALIGN boundary here! */ }; @@ -151,11 +153,17 @@ struct GenerationChunk #define GenerationChunkGetPointer(chk) \ ((GenerationPointer *)(((char *)(chk)) + Generation_CHUNKHDRSZ)) -/* Inlined helper functions */ +/* Helper functions */ static inline void GenerationBlockInit(GenerationBlock *block, Size blksize); static inline bool GenerationBlockIsEmpty(GenerationBlock *block); static inline void GenerationBlockMarkEmpty(GenerationBlock *block); static inline Size GenerationBlockFreeBytes(GenerationBlock *block); +static GenerationBlock *GenerationGetBlockFromChunk(GenerationContext *set, + GenerationChunk *chunk); +static void GenerationTrackBlock(GenerationContext *set, + GenerationBlock *block); +static void GenerationUnTrackBlock(GenerationContext *set, + GenerationBlock *block); static inline void GenerationBlockFree(GenerationContext *set, GenerationBlock *block); @@ -272,7 +280,10 @@ GenerationContextCreate(MemoryContext parent, * Avoid writing code that can fail between here and MemoryContextCreate; * we'd leak the header if we ereport in this stretch. */ - dlist_init(&set->blocks); + set->blocks = NULL; + set->nblocks = 0; + set->blocks_size = 0; + set->last_pfree_block = NULL; /* Fill in the initial block's block header */ block = (GenerationBlock *) (((char *) set) + MAXALIGN(sizeof(GenerationContext))); @@ -280,8 +291,8 @@ GenerationContextCreate(MemoryContext parent, firstBlockSize = allocSize - MAXALIGN(sizeof(GenerationContext)); GenerationBlockInit(block, firstBlockSize); - /* add it to the doubly-linked list of blocks */ - dlist_push_head(&set->blocks, &block->node); + /* add it to the array of blocks */ + GenerationTrackBlock(set, block); /* use it as the current allocation block */ set->block = block; @@ -330,7 +341,8 @@ static void GenerationReset(MemoryContext context) { GenerationContext *set = (GenerationContext *) context; - dlist_mutable_iter miter; + uint32 i; + uint32 nblocks; AssertArg(GenerationIsValid(set)); @@ -346,14 +358,20 @@ GenerationReset(MemoryContext context) */ set->freeblock = NULL; - dlist_foreach_modify(miter, &set->blocks) + for (i = 0, nblocks = set->nblocks; i < nblocks;) { - GenerationBlock *block = dlist_container(GenerationBlock, node, miter.cur); + GenerationBlock *block = set->blocks[i]; if (block == set->keeper) + { GenerationBlockMarkEmpty(block); + i++; + } else + { GenerationBlockFree(set, block); + nblocks--; + } } /* set it so new allocations to make use of the keeper block */ @@ -362,9 +380,8 @@ GenerationReset(MemoryContext context) /* Reset block size allocation sequence, too */ set->nextBlockSize = set->initBlockSize; - /* Ensure there is only 1 item in the dlist */ - Assert(!dlist_is_empty(&set->blocks)); - Assert(!dlist_has_next(&set->blocks, dlist_head_node(&set->blocks))); + /* Ensure there is only 1 item in the blocks array */ + Assert(set->nblocks == 1); } /* @@ -374,8 +391,14 @@ GenerationReset(MemoryContext context) static void GenerationDelete(MemoryContext context) { + GenerationContext *set = (GenerationContext *)context; + /* Reset to release all releasable GenerationBlocks */ GenerationReset(context); + + if (set->blocks != &set->keeper && set->blocks != NULL) + free(set->blocks); + /* And free the context header and keeper block */ free(context); } @@ -422,7 +445,6 @@ GenerationAlloc(MemoryContext context, Size size) block->freeptr = block->endptr = ((char *) block) + blksize; chunk = (GenerationChunk *) (((char *) block) + Generation_BLOCKHDRSZ); - chunk->block = block; chunk->context = set; chunk->size = chunk_size; @@ -437,8 +459,8 @@ GenerationAlloc(MemoryContext context, Size size) randomize_mem((char *) GenerationChunkGetPointer(chunk), size); #endif - /* add the block to the list of allocated blocks */ - dlist_push_head(&set->blocks, &block->node); + /* add the block to the array of allocated blocks */ + GenerationTrackBlock(set, block); /* Ensure any padding bytes are marked NOACCESS. */ VALGRIND_MAKE_MEM_NOACCESS((char *) GenerationChunkGetPointer(chunk) + size, @@ -518,8 +540,8 @@ GenerationAlloc(MemoryContext context, Size size) /* initialize the new block */ GenerationBlockInit(block, blksize); - /* add it to the doubly-linked list of blocks */ - dlist_push_head(&set->blocks, &block->node); + /* add it to the array of blocks */ + GenerationTrackBlock(set, block); /* Zero out the freeblock in case it's become full */ set->freeblock = NULL; @@ -543,7 +565,6 @@ GenerationAlloc(MemoryContext context, Size size) Assert(block->freeptr <= block->endptr); - chunk->block = block; chunk->context = set; chunk->size = chunk_size; @@ -632,6 +653,113 @@ GenerationBlockFreeBytes(GenerationBlock *block) return (block->endptr - block->freeptr); } +#define INIT_BLOCK_ARRAY_SIZE 8 + +/* + * GenerationTrackBlock + * Add 'block' to the 'set' blocks array maintaining it in memory address + * order. + */ +static void +GenerationTrackBlock(GenerationContext *set, GenerationBlock *block) +{ + uint32 i; + + /* + * We must special case the storage of the first block as we're unable to + * malloc() memory during the creation of a memory context. See comments + * in GenerationContextCreate(). + */ + if (set->nblocks == 0) + { + set->blocks = &set->keeper; + set->nblocks = 1; + set->blocks_size = 1; + return; + } + + /* Check for special-case from above */ + if (set->blocks == &set->keeper) + { + set->blocks = malloc(sizeof(GenerationBlock *) * INIT_BLOCK_ARRAY_SIZE); + set->blocks_size = INIT_BLOCK_ARRAY_SIZE; + set->blocks[0] = set->keeper; + set->nblocks = 1; + } + else if (set->nblocks == set->blocks_size) + { + uint32 newsize = set->blocks_size * 2; + + set->blocks = realloc(set->blocks, sizeof(GenerationBlock *) * newsize); + set->blocks_size = newsize; + } + + /* check for malloc() / realloc() failures */ + if (set->blocks == NULL) + { + /* Free this block so that we don't leak it in GenerationDelete() */ + free(block); + + MemoryContextStats(TopMemoryContext); + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), + errmsg("out of memory"), + errdetail("Failed while expanding memory context \"%s\".", + ((MemoryContext) set)->name))); + } + + /* + * Search for the slot in the array where we must add the new block so + * that it remains sorted in block address order + */ + for (i = 0; i < set->nblocks; i++) + { + if ((char *) block < (char *) set->blocks[i]) + break; + } + + /* make a gap in the blocks array to store the new pointer */ + for (uint32 j = set->nblocks; j > i; j--) + set->blocks[j] = set->blocks[j - 1]; + + set->blocks[i] = block; + set->nblocks++; +} + +static void +GenerationUnTrackBlock(GenerationContext *set, GenerationBlock *block) +{ + uint32 low = 0; + uint32 high = set->nblocks; + uint32 mid; + + /* binary search the memory address sorted set->blocks array */ + while (low < high) + { + GenerationBlock *curblock; + + mid = (uint32) (((uint64) low + high) / 2); + + curblock = set->blocks[mid]; + + if (block == curblock) + break; + else if ((char *) block < (char *) curblock) + high = mid; + else + low = mid + 1; + } + + /* fill the hole in the array from the removed block */ + while (mid < set->nblocks) + { + set->blocks[mid] = set->blocks[mid + 1]; + mid++; + } + + set->nblocks--; +} + /* * GenerationBlockFree * Remove 'block' from 'set' and release the memory consumed by it. @@ -644,8 +772,8 @@ GenerationBlockFree(GenerationContext *set, GenerationBlock *block) /* We shouldn't be freeing the freeblock either */ Assert(block != set->freeblock); - /* release the block from the list of blocks */ - dlist_delete(&block->node); + /* release the block from the array of blocks */ + GenerationUnTrackBlock(set, block); ((MemoryContext) set)->mem_allocated -= block->blksize; @@ -656,6 +784,55 @@ GenerationBlockFree(GenerationContext *set, GenerationBlock *block) free(block); } +/* + * GenerationGetBlockFromChunk + * Find the GenerationBlock that 'chunk' belongs to and return it. + */ +static GenerationBlock * +GenerationGetBlockFromChunk(GenerationContext *set, GenerationChunk *chunk) +{ + GenerationBlock *block; + GenerationBlock *lblock = set->last_pfree_block; + uint32 low; + uint32 high; + + /* + * We expect GenerationChunks to be pfree'd in roughly palloc() order, so + * it's likely that the last pfree'd chunk is on the same block as this + * chunk. + */ + if (lblock != NULL && + (char *) chunk >= (char *) lblock && + (char *) chunk <= lblock->endptr) + return lblock; + + low = 0; + high = set->nblocks - 1; + + while (low <= high) + { + uint32 mid = (uint32) (((uint64) low + high) / 2); + + block = set->blocks[mid]; + + if (((char *) block) <= ((char *) chunk)) + { + /* Check if the chunk falls within the used space of this block */ + if (((char *) chunk) <= block->freeptr) + { + low = mid; + break; + } + low = mid + 1; + } + else + high = mid - 1; + } + + set->last_pfree_block = set->blocks[low]; + + return set->blocks[low]; +} /* * GenerationFree * Update number of chunks in the block, and if all chunks in the block @@ -671,7 +848,7 @@ GenerationFree(MemoryContext context, void *pointer) /* Allow access to private part of chunk header. */ VALGRIND_MAKE_MEM_DEFINED(chunk, GENERATIONCHUNK_PRIVATE_LEN); - block = chunk->block; + block = GenerationGetBlockFromChunk(set, chunk); #ifdef MEMORY_CONTEXT_CHECKING /* Test for someone scribbling on unused space in chunk */ @@ -728,10 +905,11 @@ GenerationFree(MemoryContext context, void *pointer) /* * The block is empty, so let's get rid of it. First remove it from the - * list of blocks, then return it to malloc(). + * array of blocks, then return it to malloc(). */ - dlist_delete(&block->node); + GenerationUnTrackBlock(set, block); + set->last_pfree_block = NULL; context->mem_allocated -= block->blksize; free(block); } @@ -878,11 +1056,11 @@ static bool GenerationIsEmpty(MemoryContext context) { GenerationContext *set = (GenerationContext *) context; - dlist_iter iter; + uint32 i; - dlist_foreach(iter, &set->blocks) + for (i = 0; i < set->nblocks; i++) { - GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur); + GenerationBlock *block = set->blocks[i]; if (block->nchunks > 0) return false; @@ -914,14 +1092,14 @@ GenerationStats(MemoryContext context, Size nfreechunks = 0; Size totalspace; Size freespace = 0; - dlist_iter iter; + uint32 i; /* Include context header in totalspace */ totalspace = MAXALIGN(sizeof(GenerationContext)); - dlist_foreach(iter, &set->blocks) + for (i = 0; i < set->nblocks; i++) { - GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur); + GenerationBlock *block = set->blocks[i]; nblocks++; nchunks += block->nchunks; @@ -966,13 +1144,13 @@ GenerationCheck(MemoryContext context) { GenerationContext *gen = (GenerationContext *) context; const char *name = context->name; - dlist_iter iter; Size total_allocated = 0; + uint32 i; /* walk all blocks in this context */ - dlist_foreach(iter, &gen->blocks) + for (i = 0; i < gen->nblocks; i++) { - GenerationBlock *block = dlist_container(GenerationBlock, node, iter.cur); + GenerationBlock *block = gen->blocks[i]; int nfree, nchunks; char *ptr; @@ -1004,8 +1182,8 @@ GenerationCheck(MemoryContext context) nchunks += 1; - /* chunks have both block and context pointers, so check both */ - if (chunk->block != block) + /* ensure the found GenerationBlock matches block */ + if (GenerationGetBlockFromChunk(gen, chunk) != block) elog(WARNING, "problem in Generation %s: bogus block link in block %p, chunk %p", name, block, chunk);