Hi Ashutosh,
> > * During resize, simply calculate the new size and call ftruncate on each
> > segment to adjust memory accordingly, no need to mmap/munmap or modify any
> > memory mapping.
> >
> >
> That's the same as my understanding.
Great, thanks for confirming!
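In code form that resize path is then just the following (a minimal sketch,
assuming each segment is backed by an fd from memfd_create; the names here
are mine, not the patch's):

#include <sys/types.h>
#include <unistd.h>

/* Grow or shrink one segment in place: only the backing memory object
 * changes size, the mapping itself is left untouched. */
static int
resize_segment(int segment_fd, size_t new_size)
{
    return ftruncate(segment_fd, (off_t) new_size);
}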
> I thought I had shared a test pr
On Wed, May 7, 2025 at 11:04 AM Jack Ng wrote:
> > all the possible scenarios. But now I'm reworking it along the lines
> > suggested by Thomas, and will address those as well. Thanks!
>
> Thanks for the info, Dmitry.
> Just want to confirm my understanding of Thomas' suggestion and your
> discussions...
> all the possible scenarios. But now I'm reworking it along the lines suggested
> by Thomas, and will address those as well. Thanks!
Thanks for the info, Dmitry.
Just want to confirm my understanding of Thomas' suggestion and your
discussions... I think the simpler and more portable solution goes
> On Tue, May 06, 2025 at 04:23:07AM GMT, Jack Ng wrote:
> Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected
> in various tests (sorry, I misunderstood some details initially).
Great, thanks for checking.
> I also want to report a couple of minor issues found during t
... SHMEM_RESIZE_RATIO to ensure the reserved space of those segments
is sufficient.
Regards,
Jack Ng
-----Original Message-----
From: Dmitry Dolgov <9erthali...@gmail.com>
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku
Cc: Ashutosh Bapat; pgsql-hack...@postgresql.org; Robert Haas
On Mon, Apr 21, 2025 at 9:30 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Yeah, that would work and will allow us to avoid MAP_FIXED and mremap, which
> are questionable from a portability point of view. This leaves memfd_create,
> and I'm still not completely clear on its portability -- it seems
> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared
> segment, the memory is freed immediately after the call, even if other
> processes still have
> On Fri, Apr 18, 2025 at 10:06:23AM GMT, Konstantin Knizhnik wrote:
> The only drawback is that we are losing the content of shared buffers in case
> of a resize. That may be sad, but it doesn't look like there is a better
> alternative.
No, why would we lose the content? If we do mremap, it will leave the
> On Fri, Apr 18, 2025 at 09:17:21PM GMT, Thomas Munro wrote:
> I was imagining that you might map some maximum possible size at the
> beginning to reserve the address space permanently, and then adjust
> the virtual memory object's size with ftruncate as required to provide
> backing. Doesn't tha
On Thu, Nov 21, 2024 at 8:55 PM Peter Eisentraut wrote:
> On 19.11.24 14:29, Dmitry Dolgov wrote:
> >> I see that memfd_create() has a MFD_HUGETLB flag. It's not very clear how
> >> that interacts with the MAP_HUGETLB flag for mmap(). Do you need to
> >> specify
> >> both of them if you want huge pages?
On 25/02/2025 11:52 am, Dmitry Dolgov wrote:
On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
changing shared memory mapping layout. Any feedback is appreciated.
Hi Dmitry,
I am sorry that I have not participa
Hi Ashutosh / Dmitry,
Thanks for the information and discussions, it's been very helpful.
I also have a related question about how ftruncate() is used in the patch.
In my testing I also see that when using ftruncate to shrink a shared
segment, the memory is freed immediately after the call, even
Hi,
On April 18, 2025 11:17:21 AM GMT+02:00, Thomas Munro wrote:
> Doesn't that achieve the goal with fewer steps, using only
> *portable* POSIX stuff, and keeping all pointers stable? I understand
> that pointer stability may not be required (I can see roughly how that
> argument is constructed),
On Fri, Apr 18, 2025 at 9:17 PM Thomas Munro wrote:
> On Fri, Apr 18, 2025 at 7:25 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > Thanks for sharing. I need to do more thorough tests, but after a quick
> > look I'm not sure about that. ftruncate will take care of the memory,
> > but AFAICT
On Fri, Apr 18, 2025 at 7:25 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > On Thu, Apr 17, 2025 at 03:22:28PM GMT, Ashutosh Bapat wrote:
> >
> > In an offlist chat Thomas Munro mentioned that just ftruncate() would
> > be enough to resize the shared memory without touching address maps
> > us
> On Thu, Apr 17, 2025 at 02:21:07PM GMT, Konstantin Knizhnik wrote:
>
> 1. Performance of the Postgres CLOCK page eviction algorithm depends on the
> number of shared buffers. My first naive attempt to just mark unused buffers
> as invalid caused significant performance degradation
Thanks for sharing!
Ri
> On Thu, Apr 17, 2025 at 03:22:28PM GMT, Ashutosh Bapat wrote:
>
> In an offlist chat Thomas Munro mentioned that just ftruncate() would
> be enough to resize the shared memory without touching address maps
> using mmap and munmap().
>
> ftruncate man page seems to concur with him
>
>If th
On 18/04/2025 12:26 am, Dmitry Dolgov wrote:
On Thu, Apr 17, 2025 at 02:21:07PM GMT, Konstantin Knizhnik wrote:
1. Performance of the Postgres CLOCK page eviction algorithm depends on the
number of shared buffers. My first naive attempt to just mark unused buffers
as invalid caused significant performance degradation
On Fri, Apr 18, 2025 at 3:54 AM Thomas Munro wrote:
> I contemplated that once before, when I wrote a quick demo patch[1] to
> implement huge_pages=on for FreeBSD (ie explicit rather than
> transparent). I used a different function, not the Linuxoid one but
Oops, I forgot to supply that link[1].
Hi Dmitry,
On Mon, Apr 14, 2025 at 12:50 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Mon, Apr 14, 2025 at 10:40:28AM GMT, Ashutosh Bapat wrote:
> >
> > However, when we put back the patches to shrink buffers, we will evict
> > the extra buffers, and shrink - if all the processes haven
On Mon, Apr 14, 2025 at 12:50 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Mon, Apr 14, 2025 at 10:40:28AM GMT, Ashutosh Bapat wrote:
> >
> > However, when we put back the patches to shrink buffers, we will evict
> > the extra buffers, and shrink - if all the processes haven't
> > parti
> On Mon, Apr 14, 2025 at 10:40:28AM GMT, Ashutosh Bapat wrote:
>
> However, when we put back the patches to shrink buffers, we will evict
> the extra buffers, and shrink - if all the processes haven't
> participated in the barrier by then, some of them may try to access
> those buffers - re-instal
On Fri, Apr 11, 2025 at 8:31 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > > I think a relatively elegant solution is to extend ProcSignalBarrier
> > > mechanism to track not only pss_barrierGeneration, as a sign that
> > > everything was processed, but also something like
> > > pss_barrier
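The struct shape that suggestion implies would be something like this (a
sketch only; the second field's name is my guess for the truncated text):

#include "port/atomics.h"

/* Track two generations per backend: one for barriers fully processed,
 * one for barriers merely received, so stragglers can be told apart
 * from backends that have not even seen the signal yet. */
typedef struct BarrierProgress
{
    pg_atomic_uint64 pss_barrierGeneration;         /* processed up to */
    pg_atomic_uint64 pss_barrierReceivedGeneration; /* received up to */
} BarrierProgress;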
> On Fri, Apr 11, 2025 at 08:04:39PM GMT, Ashutosh Bapat wrote:
> On Mon, Apr 7, 2025 at 2:13 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > Yes, you're right, a plain dynamic Barrier does not ensure all available
> > processes will be synchronized. I was aware of the scenario you
> > des
On Mon, Apr 7, 2025 at 2:13 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> Yes, you're right, a plain dynamic Barrier does not ensure all available
> processes will be synchronized. I was aware of the scenario you
> describe; it's mentioned in the comments for the resize function. I was
> und
> On Wed, Apr 09, 2025 at 01:20:16PM GMT, Ashutosh Bapat wrote:
> ../../coderoot/pg/src/include/storage/s_lock.h:93:2: error: #error
> "s_lock.h may not be included from frontend code"
>
> How about this? Why is that happening?
The same -- as you can see it comes from compiling pg_numa.c, which as
On Wed, Apr 9, 2025 at 1:15 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Wed, Apr 09, 2025 at 11:12:18AM GMT, Ashutosh Bapat wrote:
> > On Mon, Apr 7, 2025 at 2:13 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > >
> > > In the new v4 version
> > > of the patch the first option is im
> On Wed, Apr 09, 2025 at 11:12:18AM GMT, Ashutosh Bapat wrote:
> On Mon, Apr 7, 2025 at 2:13 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > In the new v4 version
> > of the patch the first option is implemented.
> >
>
> The patches don't apply cleanly using git am, but patch -p1 applies
>
On Mon, Apr 7, 2025 at 2:13 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> In the new v4 version
> of the patch the first option is implemented.
>
The patches don't apply cleanly using git am, but patch -p1 applies
them cleanly. However, I see the following compilation errors
RuntimeError: command
On Fri, Feb 28, 2025 at 5:31 PM Ashutosh Bapat
wrote:
>
> I think we should add a way to monitor the progress of resizing; at
> least whether resizing is complete and whether the new GUC value is in
> effect.
>
I further tested this approach by tracing the barrier synchronization
using the attach
> On Thu, Mar 20, 2025 at 04:55:47PM GMT, Ni Ku wrote:
>
> I ran some simple tests (outside of PG) on linux kernel v6.1, which has
> this commit that added some hugepage support to mremap (
> https://patchwork.kernel.org/project/linux-mm/patch/20211013195825.3058275-1-almasrym...@google.com/
> ).
>
Thanks for your insights and confirmation, Dmitry.
Right, I think the anonymous fd approach would work to keep the memory
contents intact in between munmap and mmap with the new size, so bufferpool
expansion would work.
But it seems shrinking would still be problematic, since that approach
requires
You're right Dmitry, truncating the anonymous file before mapping it again
does the trick! I see 'HugePages_Free' increases to the expected size right
after the ftruncate call for shrinking.
This alternative approach looks very promising. Thanks.
Regards,
Jack Ng
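For the archives, the order of operations that worked here, as a rough
sketch (hypothetical names; the hugepage flags are as discussed upthread):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Shrink a hugepage-backed segment: truncate the memory object first so
 * the kernel releases the huge pages (HugePages_Free goes back up),
 * then re-map at the smaller size. */
static void *
shrink_and_remap(void *old_base, size_t old_size, int fd, size_t new_size)
{
    if (munmap(old_base, old_size) < 0)
        return NULL;
    if (ftruncate(fd, (off_t) new_size) < 0)
        return NULL;
    return mmap(NULL, new_size, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_HUGETLB, fd, 0);
}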
On Fri, Mar 21, 2025 at 5:31 PM
> On Fri, Mar 21, 2025 at 04:48:30PM GMT, Ni Ku wrote:
> Thanks for your insights and confirmation, Dmitry.
> Right, I think the anonymous fd approach would work to keep the memory
> contents intact in between munmap and mmap with the new size, so bufferpool
> expansion would work.
> But it seems shrinking would still be problematic, since that approach
> requires
Dmitry / Ashutosh,
Thanks for the patch set. I've been doing some testing with it, and in
particular want to see whether this solution would work with a hugepage
buffer pool.
I ran some simple tests (outside of PG) on linux kernel v6.1, which has
this commit that added some hugepage support to mremap (
htt
On Tue, Feb 25, 2025 at 3:22 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
> > TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> > changing shared memory mapping layout. Any feedback is appreciated.
>
> Hi,
> On Tue, Feb 25, 2025 at 10:52:05AM GMT, Dmitry Dolgov wrote:
> > On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
> > TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> > changing shared memory mapping layout. Any feedback is appreciated.
>
> Hi,
>
> Here is a n
> On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
> TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> changing shared memory mapping layout. Any feedback is appreciated.
Hi,
Here is a new version of the patch, which contains a proposal about how to
coordinate
Hi Dmitry,
On Tue, Dec 17, 2024 at 7:40 PM Ashutosh Bapat
wrote:
>
> I could verify the memory mappings, their sizes etc. by looking at
> /proc/PID/maps and /proc/PID/status but I did not find a way to verify
> the amount of memory actually allocated and verify that it's actually
> shrinking and
On Tue, Dec 3, 2024 at 8:01 PM Robert Haas wrote:
>
> On Mon, Dec 2, 2024 at 2:18 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > I've asked about that in linux-mm [1]. To my surprise, the
> > recommendations were to stick to creating a large mapping in advance,
> > and slice smaller mappings
On Mon, Dec 2, 2024 at 2:18 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I've asked about that in linux-mm [1]. To my surprise, the
> recommendations were to stick to creating a large mapping in advance,
> and slice smaller mappings out of that, which could be resized later.
> The OOM score sh
> On Fri, Nov 29, 2024 at 05:47:27PM GMT, Dmitry Dolgov wrote:
> > On Fri, Nov 29, 2024 at 01:56:30AM GMT, Matthias van de Meent wrote:
> >
> > I mean, we can do the following to get a nice contiguous empty address
> > space no other mmap(NULL)s will get put into:
> >
> > /* reserve size bytes
Hi,
On 2024-11-28 17:30:32 +0100, Dmitry Dolgov wrote:
> The assumption about picking up the lowest address is just how it works
> right now on Linux; this fact is already used in the patch. The idea that we
> could put an upper boundary on the size of other mappings based on total
> available memory
> On Fri, Nov 29, 2024 at 01:56:30AM GMT, Matthias van de Meent wrote:
>
> I mean, we can do the following to get a nice contiguous empty address
> space no other mmap(NULL)s will get put into:
>
> /* reserve size bytes of memory */
> base = mmap(NULL, size, PROT_NONE, ...flags, ...);
>
Matthias van de Meent writes:
> I mean, we can do the following to get a nice contiguous empty address
> space no other mmap(NULL)s will get put into:
> /* reserve size bytes of memory */
> base = mmap(NULL, size, PROT_NONE, ...flags, ...);
> /* use the first small_size bytes of that
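Filling in the elided flags, the complete pattern would presumably look like
this (a sketch; the exact flag set is my assumption):

#include <sys/mman.h>

static void *
reserve_and_slice(size_t size, size_t small_size, int fd)
{
    /* Reserve a contiguous range that no other mmap(NULL) will land in;
     * PROT_NONE means it consumes address space but no actual memory. */
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    /* Overlay the first small_size bytes with a real shared mapping;
     * MAP_FIXED is safe here because we own the reserved range. */
    return mmap(base, small_size, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_FIXED, fd, 0);
}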
On Thu, 28 Nov 2024 at 19:57, Tom Lane wrote:
>
> Matthias van de Meent writes:
> > On Thu, 28 Nov 2024 at 18:19, Robert Haas wrote:
> >> [...] It's unclear to me why
> >> operating systems don't offer better primitives for this sort of thing
> >> -- in theory there could be a system call that s
Matthias van de Meent writes:
> On Thu, 28 Nov 2024 at 18:19, Robert Haas wrote:
>> [...] It's unclear to me why
>> operating systems don't offer better primitives for this sort of thing
>> -- in theory there could be a system call that sets aside a pool of
>> address space and then other system
> On Thu, Nov 28, 2024 at 12:18:54PM GMT, Robert Haas wrote:
>
> All that having been said, what does concern me a bit is our ability
> to predict what Linux will do well enough to keep what we're doing
> safe; and also whether the Linux behavior might abruptly change in the
> future. Users would b
On Thu, 28 Nov 2024 at 18:19, Robert Haas wrote:
>
> [...] It's unclear to me why
> operating systems don't offer better primitives for this sort of thing
> -- in theory there could be a system call that sets aside a pool of
> address space and then other system calls that let you allocate
> share
On Thu, Nov 28, 2024 at 11:30 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> on Linux, this fact is already used in the patch. The idea that we could put
> upper boundary on the size of other mappings based on total available memory
> comes from the fact that anonymous mappings, that are much la
> On Wed, Nov 27, 2024 at 04:05:47PM GMT, Robert Haas wrote:
> On Wed, Nov 27, 2024 at 3:48 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > My understanding is that clashing of mappings (either at creation time
> > or when resizing) could happen only within the process address space,
> > and t
On Wed, Nov 27, 2024 at 4:28 PM Jelte Fennema-Nio wrote:
> On Wed, 27 Nov 2024 at 22:06, Robert Haas wrote:
> > If we had an upper bound on the size of shared_buffers
>
> I think a fairly reliable upper bound is the amount of physical memory
> on the system at time of postmaster start. We could m
On Wed, Nov 27, 2024 at 4:41 PM Andres Freund wrote:
> Strictly speaking we don't actually need to map shared buffers to the same
> location in each process... We do need that for most other uses of shared
> memory, including the buffer mapping table, but not for the buffer data
> itself.
Well, i
Hi,
On 2024-11-27 16:05:47 -0500, Robert Haas wrote:
> On Wed, Nov 27, 2024 at 3:48 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > My understanding is that clashing of mappings (either at creation time
> > or when resizing) could happen only within the process address space,
> > and the assu
On Wed, 27 Nov 2024 at 22:06, Robert Haas wrote:
> If we had an upper bound on the size of shared_buffers
I think a fairly reliable upper bound is the amount of physical memory
on the system at time of postmaster start. We could make it a GUC to
set the upper bound for the rare cases where people
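One way to obtain that number at startup, as a sketch (_SC_PHYS_PAGES is not
strictly POSIX, but is available on Linux and the BSDs):

#include <stddef.h>
#include <unistd.h>

/* Physical memory at postmaster start, as an upper-bound candidate. */
static size_t
physical_memory_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGE_SIZE);

    if (pages <= 0 || page_size <= 0)
        return 0;               /* unknown; fall back to a GUC value */
    return (size_t) pages * (size_t) page_size;
}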
On Wed, Nov 27, 2024 at 3:48 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> My understanding is that clashing of mappings (either at creation time
> or when resizing) could happen only within the process address space,
> and the assumption is that by the time we prepare the mapping layout all
>
> On Wed, Nov 27, 2024 at 10:20:27AM GMT, Robert Haas wrote:
> > >
> > > code, but I'm not sure exactly which points are safe. If we have no
> > > code anywhere that assumes the address of an unpinned buffer can't
> > > change before we pin it, then I guess the check for pins is the only
> > > thin
On Tue, Nov 26, 2024 at 2:18 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I haven't had a chance to experiment with that on Windows, but I'm
> hoping that in the worst case fallback to a single mapping via proposed
> infrastructure (and the consequent limitations) would be acceptable.
Yeah, i
> On Mon, Nov 25, 2024 at 02:33:48PM GMT, Robert Haas wrote:
>
> I think the idea of having multiple shared memory segments is
> interesting and makes sense, but I would prefer to see them called
> "segments" rather than "slots" just as do we do for DSMs. The name
> "slot" is somewhat overused, and
On 19.11.24 14:29, Dmitry Dolgov wrote:
I noticed the existing code made inconsistent use of PGShmemHeader * vs.
void *, which also bled into your patch. I made the attached little patch
to clean that up a bit.
Right, it was bothering me the whole time, but not strongly enough to make
me fix this
On Fri, Oct 18, 2024 at 3:21 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> changing shared memory mapping layout. Any feedback is appreciated.
A lot of people would like to have this feature, so I hope this
proposal works
On 19.11.24 14:29, Dmitry Dolgov wrote:
I see that memfd_create() has a MFD_HUGETLB flag. It's not very clear how
that interacts with the MAP_HUGETLB flag for mmap(). Do you need to specify
both of them if you want huge pages?
Correct, both (one flag in memfd_create and one for mmap) are needed
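Concretely, the pairing looks like this (a minimal sketch; size must be a
multiple of the huge page size):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Huge pages have to be requested twice: once for the memory object
 * (MFD_HUGETLB) and once for the mapping itself (MAP_HUGETLB). */
static void *
map_huge(size_t size)
{
    int fd = memfd_create("pg_shmem", MFD_HUGETLB);

    if (fd < 0 || ftruncate(fd, (off_t) size) < 0)
        return NULL;
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_HUGETLB, fd, 0);
}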
> On Tue, Nov 19, 2024 at 01:57:00PM GMT, Peter Eisentraut wrote:
> On 18.10.24 21:21, Dmitry Dolgov wrote:
> > v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch
> >
> > Preparation, introduces the possibility to work with many shmem mappings. To
> > make it less invasive, I've duplicated
On 18.10.24 21:21, Dmitry Dolgov wrote:
v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch
Preparation, introduces the possibility to work with many shmem mappings. To
make it less invasive, I've duplicated the shmem API to extend it with the
shmem_slot argument, while redirecting the or
> On Wed, Nov 06, 2024 at 07:10:06PM GMT, Vladlen Popolitov wrote:
> Hi
>
> I tried to apply the patches, but failed. I suspect the problem is CRLF line
> endings in the patch files. At least, after manually changing v1-0001 and
> v1-0002 from CRLF to LF the patches applied, but it did not help
> On Thu, Nov 07, 2024 at 02:05:52PM GMT, Thomas Munro wrote:
> On Sat, Oct 19, 2024 at 8:21 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > Currently it
> > supports only an increase of shared_buffers.
>
> Just BTW in case it is interesting, Palak and I experimented with how
> to shrink the bu
On Sat, Oct 19, 2024 at 8:21 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Currently it
> supports only an increase of shared_buffers.
Just BTW in case it is interesting, Palak and I experimented with how
to shrink the buffer pool while PostgreSQL is running, while we were
talking about 13453e
Hi
I tried to apply the patches, but failed. I suspect the problem is CRLF line
endings in the patch files. At least, after manually changing v1-0001 and
v1-0002 from CRLF to LF the patches applied, but it did not help for v1-0003 -
v1-0005 - they also have other errors during the patch process
> On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote:
>
> TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> changing shared memory mapping layout. Any feedback is appreciated.
It was pointed out to me that earlier this year there was a useful
discussion about simi