Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected 
in various tests (sorry, I misunderstood some details initially).

I also want to report a couple of minor issues found during testing (which you 
may be aware of already):

1. For memory segments other than the first one ('main'), the start address 
passed to mmap may not be aligned to 4KB or to the huge page size (since 
reserved_offset may not be aligned), which causes mmap to fail.
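
To illustrate the kind of fix I mean for 1., here is a minimal sketch that 
rounds an offset up to the next page boundary before it is used to compute a 
mapping address. align_up is a hypothetical helper written for this mail, not 
a function from the patch, and it assumes the alignment is a power of two 
(true for 4KB pages and for the usual 2MB huge pages).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from the patch: round 'offset' up to the next
 * multiple of 'alignment'. 'alignment' must be a non-zero power of two,
 * e.g. 4096 for regular pages or 2MB for huge pages.
 */
static size_t
align_up(size_t offset, size_t alignment)
{
    /* A power of two has exactly one bit set. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With something like this, an unaligned offset such as 107374182 (10% of 1GB) 
becomes 107376640 for 4KB pages, or 109051904 for 2MB huge pages. The space 
reserved for the segment would have to account for the padding as well.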

2. Since the ratios for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO 
are relatively small, I think we need to guard against the case where 
'max_available_memory' is too small for the required sizes of these segments 
(from CalculateShmemSize).
For example, with max_available_memory=default and shared_buffers=128kB, 
'main' still needs ~109MB, but only 10% of max_available_memory (~102MB) is 
reserved for it. Since the start address of the next segment is calculated 
from reserved_offset, the mappings overlap, which leads to memory problems 
later (I hit this after fixing 1.).
I suppose we can raise the minimum value of max_available_memory so that it 
is large enough, and may also adjust the ratios in SHMEM_RESIZE_RATIO to 
ensure the reserved space for those segments is sufficient.
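
As a rough sketch of the guard for 2., the check could compare the size a 
segment requires against the share of max_available_memory reserved for it, 
and derive the smallest max_available_memory that would fit. All names below 
are hypothetical and written for this mail, not taken from the patch; the 
ratio is expressed as an integer percentage to keep the arithmetic exact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bytes reserved for a segment given its share in percent. */
static uint64_t
reserved_for_segment(uint64_t max_available, unsigned ratio_pct)
{
    assert(ratio_pct > 0 && ratio_pct <= 100);
    return max_available * ratio_pct / 100;
}

/* Hypothetical: does the required size fit into the reserved space? */
static bool
segment_fits(uint64_t required, uint64_t max_available, unsigned ratio_pct)
{
    return required <= reserved_for_segment(max_available, ratio_pct);
}

/* Hypothetical: smallest max_available for which 'required' fits. */
static uint64_t
min_max_available(uint64_t required, unsigned ratio_pct)
{
    assert(ratio_pct > 0 && ratio_pct <= 100);
    /* Ceiling of required * 100 / ratio_pct. */
    return (required * 100 + ratio_pct - 1) / ratio_pct;
}
```

Plugging in the numbers above (109MB required, 10% ratio, and assuming the 
default max_available_memory is the 1GB implied by the ~102MB figure), 
segment_fits() returns false, and min_max_available() gives 1090MB.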

Regards,

Jack Ng

-----Original Message-----
From: Dmitry Dolgov <9erthali...@gmail.com> 
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku <jakkun...@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat....@gmail.com>; 
pgsql-hack...@postgresql.org; Robert Haas <robertmh...@gmail.com>
Subject: Re: Changing shared_buffers without restart

> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared 
> segment, the memory is freed immediately after the call, even if other 
> processes still have that memory mapped, and they will hit SIGBUS if 
> they try to access that memory again as the manpage says.
>
> So am I correct to think that, to support the bufferpool shrinking 
> case, it would not be safe to call ftruncate in AnonymousShmemResize 
> as-is, since at that point other processes may still be using pages 
> that belong to the truncated memory?
> It appears that for shrinking we should only call ftruncate when we're 
> sure no process will access those pages again (eg, all processes have 
> handled the resize interrupt signal barrier). I suppose this can be 
> done by the resize coordinator after synchronizing with all the other 
> processes.
> But in that case it seems we cannot use the postmaster as the 
> coordinator then? b/c I see some code comments saying the postmaster 
> does not have waiting infrastructure... (maybe even if the postmaster 
> has waiting infra we don't want to use it anyway since it can be 
> blocked for a long time and won't be able to serve other requests).

There is already a coordination infrastructure, implemented in the patch 0006, 
which will take care of this and prevent access to the shared memory until 
everything is resized.





