Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected in various tests (sorry, I misunderstood some details initially).
I also want to report a couple of minor issues found during testing (which you may already be aware of):

1. For memory segments other than the first one ('main'), the start address passed to mmap may not be aligned to 4KB or the huge page size (since reserved_offset may not be aligned), which can cause mmap to fail.

2. Since the ratios for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO are relatively small, I think we need to guard against the case where 'max_available_memory' is too small for the required sizes of these segments (from CalculateShmemSize).
   For example, with max_available_memory=default and shared_buffers=128kB, 'main' still needs ~109MB, but only 10% of max_available_memory (~102MB) is reserved for it. Since the start address of the next segment is calculated from reserved_offset, the mappings overlap and cause memory problems later (I hit this after fixing 1).

I suppose we can raise the minimum value of max_available_memory so that it is large enough, and perhaps also adjust the ratios in SHMEM_RESIZE_RATIO to ensure the reserved space for those segments is sufficient.

Regards,
Jack Ng

-----Original Message-----
From: Dmitry Dolgov <9erthali...@gmail.com>
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku <jakkun...@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat....@gmail.com>; pgsql-hack...@postgresql.org; Robert Haas <robertmh...@gmail.com>
Subject: Re: Changing shared_buffers without restart

> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared
> segment, the memory is freed immediately after the call, even if other
> processes still have that memory mapped, and they will hit SIGBUS if
> they try to access that memory again as the manpage says.
>
> So am I correct to think that, to support the bufferpool shrinking
> case, it would not be safe to call ftruncate in AnonymousShmemResize
> as-is, since at that point other processes may still be using pages
> that belong to the truncated memory?
> It appears that for shrinking we should only call ftruncate when we're
> sure no process will access those pages again (eg, all processes have
> handled the resize interrupt signal barrier). I suppose this can be
> done by the resize coordinator after synchronizing with all the other
> processes.
> But in that case it seems we cannot use the postmaster as the
> coordinator then? b/c I see some code comments saying the postmaster
> does not have waiting infrastructure... (maybe even if the postmaster
> has waiting infra we don't want to use it anyway since it can be
> blocked for a long time and won't be able to serve other requests).

There is already a coordination infrastructure, implemented in the patch
0006, which will take care of this and prevent access to the shared
memory until everything is resized.
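
For the two issues reported at the top of this message (unaligned segment start addresses and reservations smaller than what a segment needs), here is a minimal sketch in C of the kind of guard I have in mind. It is not code from the patch: SegmentPlan, align_up, align_down and plan_segments are hypothetical names used only for illustration, and the sketch assumes the alignment (4KB or the huge page size) is a power of two and that each segment's share is simply ratio * total.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor, one entry per segment (main, desc, ...). */
typedef struct SegmentPlan
{
    double ratio;     /* share of the total reservation, e.g. 0.10 */
    size_t required;  /* bytes the segment actually needs */
    size_t offset;    /* out: aligned start offset within the reservation */
    size_t reserved;  /* out: aligned bytes reserved for the segment */
} SegmentPlan;

/* Round value up to a multiple of alignment (must be a power of two). */
static size_t
align_up(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Round value down to a multiple of alignment (must be a power of two). */
static size_t
align_down(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return value & ~(alignment - 1);
}

/*
 * Lay the segments out back to back inside a reservation of 'total' bytes.
 * Each share is rounded down to the alignment, so every start offset stays
 * aligned and the sum never exceeds 'total' when the ratios sum to <= 1.
 * Returns false as soon as a share cannot hold the segment's aligned
 * requirement, instead of letting the next mapping overlap it.
 */
static bool
plan_segments(SegmentPlan *segs, int nsegs, size_t total, size_t alignment)
{
    size_t offset = 0;

    for (int i = 0; i < nsegs; i++)
    {
        size_t share = align_down((size_t) (segs[i].ratio * (double) total),
                                  alignment);
        size_t needed = align_up(segs[i].required, alignment);

        if (share < needed)
            return false;

        segs[i].offset = offset;
        segs[i].reserved = share;
        offset += share;
    }
    return true;
}
```

With a 10% ratio and an illustrative 1GB total, the first segment's share is about 102MB, so a 109MB requirement makes plan_segments return false up front. The same check could instead drive a larger minimum for max_available_memory or adjusted ratios.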