Re: Adding basic NUMA awareness

2025-07-11 Thread Burd, Greg
> On Jul 10, 2025, at 8:13 AM, Burd, Greg wrote: > > >> On Jul 9, 2025, at 1:23 PM, Andres Freund wrote: >> >> Hi, >> >> On 2025-07-09 12:55:51 -0400, Greg Burd wrote: >>> On Jul 9 2025, at 12:35 pm, Andres Freund wrote: >>> FWIW, I've started to wonder if we shouldn't just get rid

Re: Adding basic NUMA awareness

2025-07-11 Thread Andres Freund
Hi, On 2025-07-10 14:17:21 +, Bertrand Drouvot wrote: > On Wed, Jul 09, 2025 at 03:42:26PM -0400, Andres Freund wrote: > > I wonder if we should *increase* the size of shared_buffers whenever huge > > pages are in use and there's padding space due to the huge page > > boundaries. Pretty pointl

Re: Adding basic NUMA awareness

2025-07-11 Thread Andres Freund
Hi, On 2025-07-10 17:31:45 +0200, Tomas Vondra wrote: > On 7/9/25 19:23, Andres Freund wrote: > > There's other things around this that could use some attention. It's not > > hard > > to see clock sweep be a bottleneck in concurrent workloads - partially due > > to > > the shared maintenance of

Re: Adding basic NUMA awareness

2025-07-10 Thread Tomas Vondra
On 7/9/25 19:23, Andres Freund wrote: > Hi, > > On 2025-07-09 12:55:51 -0400, Greg Burd wrote: >> On Jul 9 2025, at 12:35 pm, Andres Freund wrote: >> >>> FWIW, I've started to wonder if we shouldn't just get rid of the freelist >>> entirely. While clocksweep is perhaps minutely slower in a sin

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-10 Thread Tomas Vondra
On 7/9/25 08:40, Cédric Villemain wrote: >> On 7/8/25 18:06, Cédric Villemain wrote: >>> On 7/8/25 03:55, Cédric Villemain wrote: > Hi Andres, > >> Hi, >> >> On 2025-07-05 07:09:00 +, Cédric Villemain wrote: >>> In my work on more careful Post

Re: Adding basic NUMA awareness

2025-07-10 Thread Bertrand Drouvot
Hi, On Wed, Jul 09, 2025 at 03:42:26PM -0400, Andres Freund wrote: > Hi, > > Thanks for working on this! Indeed, thanks! > On 2025-07-01 21:07:00 +0200, Tomas Vondra wrote: > > 1) v1-0001-NUMA-interleaving-buffers.patch > > > > This is the main thing when people think about NUMA - making sure t

Re: Adding basic NUMA awareness

2025-07-10 Thread Burd, Greg
> On Jul 9, 2025, at 1:23 PM, Andres Freund wrote: > > Hi, > > On 2025-07-09 12:55:51 -0400, Greg Burd wrote: >> On Jul 9 2025, at 12:35 pm, Andres Freund wrote: >> >>> FWIW, I've started to wonder if we shouldn't just get rid of the freelist >>> entirely. While clocksweep is perhaps minute

Re: Adding basic NUMA awareness

2025-07-10 Thread Jakub Wartak
On Wed, Jul 9, 2025 at 9:42 PM Andres Freund wrote: > On 2025-07-01 21:07:00 +0200, Tomas Vondra wrote: > > Each patch has a numa_ GUC, intended to enable/disable that part. This > > is meant to make development easier, not as a final interface. I'm not > > sure how exactly that should look. It's

Re: Adding basic NUMA awareness

2025-07-10 Thread Jakub Wartak
On Wed, Jul 9, 2025 at 7:13 PM Andres Freund wrote: > > Yes, and we are discussing if it is worth getting into smaller pages > > for such usecases (e.g. 4kB ones without hugetlb with 2MB hugepages or > > what more even more waste 1GB hugetlb if we dont request 2MB for some > > small structs: btw,

Re: Adding basic NUMA awareness

2025-07-09 Thread Andres Freund
Hi, Thanks for working on this! I think it's an area we have long neglected... On 2025-07-01 21:07:00 +0200, Tomas Vondra wrote: > Each patch has a numa_ GUC, intended to enable/disable that part. This > is meant to make development easier, not as a final interface. I'm not > sure how exactly t

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-09 Thread Andres Freund
Hi, On 2025-07-08 16:06:00 +, Cédric Villemain wrote: > > Assuming we want to actually pin tasks from within Postgres, what I > > think might work is allowing modules to "advise" on where to place the > > task. But the decision would still be done by core. > > Possibly exactly what you're doi

Re: Adding basic NUMA awareness

2025-07-09 Thread Andres Freund
Hi, On 2025-07-09 12:55:51 -0400, Greg Burd wrote: > On Jul 9 2025, at 12:35 pm, Andres Freund wrote: > > > FWIW, I've started to wonder if we shouldn't just get rid of the freelist > > entirely. While clocksweep is perhaps minutely slower in a single > > thread than > > the freelist, clock sweep

Re: Adding basic NUMA awareness

2025-07-09 Thread Andres Freund
Hi, On 2025-07-09 12:04:00 +0200, Jakub Wartak wrote: > On Tue, Jul 8, 2025 at 2:56 PM Andres Freund wrote: > > On 2025-07-08 14:27:12 +0200, Tomas Vondra wrote: > > > On 7/8/25 05:04, Andres Freund wrote: > > > > On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote: > > > > The reason it would be ad

Re: Adding basic NUMA awareness

2025-07-09 Thread Greg Burd
On Jul 9 2025, at 12:35 pm, Andres Freund wrote: > FWIW, I've started to wonder if we shouldn't just get rid of the freelist > entirely. While clocksweep is perhaps minutely slower in a single > thread than > the freelist, clock sweep scales *considerably* better [1]. As it's rather > rare to

Re: Adding basic NUMA awareness

2025-07-09 Thread Andres Freund
Hi, On 2025-07-02 14:36:31 +0200, Tomas Vondra wrote: > On 7/2/25 13:37, Ashutosh Bapat wrote: > > On Wed, Jul 2, 2025 at 12:37 AM Tomas Vondra wrote: > >> > >> > >> 3) v1-0003-freelist-Don-t-track-tail-of-a-freelist.patch > >> > >> Minor optimization. Andres noticed we're tracking the tail of bu

Re: Adding basic NUMA awareness

2025-07-09 Thread Jakub Wartak
On Tue, Jul 8, 2025 at 2:56 PM Andres Freund wrote: > > Hi, > > On 2025-07-08 14:27:12 +0200, Tomas Vondra wrote: > > On 7/8/25 05:04, Andres Freund wrote: > > > On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote: > > > The reason it would be advantageous to put something like the procarray > > > o

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-09 Thread Bertrand Drouvot
Hi, On Wed, Jul 09, 2025 at 06:40:00AM +, Cédric Villemain wrote: > > On 7/8/25 18:06, Cédric Villemain wrote: > > I'm not against making this extensible, in some way. But I still > > struggle to imagine a reasonable alternative policy, where the external > > module gets the same information a

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-08 Thread Cédric Villemain
On 7/8/25 18:06, Cédric Villemain wrote: On 7/8/25 03:55, Cédric Villemain wrote: Hi Andres, Hi, On 2025-07-05 07:09:00 +, Cédric Villemain wrote: In my work on more careful PostgreSQL resource management, I've come to the conclusion that we should avoid pushing policy too deeply

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-08 Thread Tomas Vondra
On 7/8/25 18:06, Cédric Villemain wrote: >> On 7/8/25 03:55, Cédric Villemain wrote: >>> Hi Andres, >>> Hi, On 2025-07-05 07:09:00 +, Cédric Villemain wrote: > In my work on more careful PostgreSQL resource management, I've come > to the > conclus

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-08 Thread Cédric Villemain
On 7/8/25 03:55, Cédric Villemain wrote: Hi Andres, Hi, On 2025-07-05 07:09:00 +, Cédric Villemain wrote: In my work on more careful PostgreSQL resource management, I've come to the conclusion that we should avoid pushing policy too deeply into the PostgreSQL core itself. Therefor

Re: Adding basic NUMA awareness

2025-07-08 Thread Andres Freund
Hi, On 2025-07-08 14:27:12 +0200, Tomas Vondra wrote: > On 7/8/25 05:04, Andres Freund wrote: > > On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote: > > The reason it would be advantageous to put something like the procarray onto > > smaller pages is that otherwise the entire procarray (unless part

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-08 Thread Tomas Vondra
On 7/8/25 03:55, Cédric Villemain wrote: > Hi Andres, > >> Hi, >> >> On 2025-07-05 07:09:00 +, Cédric Villemain wrote: >>> In my work on more careful PostgreSQL resource management, I've come >>> to the >>> conclusion that we should avoid pushing policy too deeply into the >>> PostgreSQL core

Re: Adding basic NUMA awareness

2025-07-08 Thread Tomas Vondra
On 7/8/25 05:04, Andres Freund wrote: > Hi, > > On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote: >> On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra wrote: >>> I don't think the splitting would actually make some things simpler, or >>> maybe more flexible - in particular, it'd allow us to enable huge

Re: Adding basic NUMA awareness

2025-07-07 Thread Andres Freund
Hi, On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote: > On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra wrote: > > I don't think the splitting would actually make some things simpler, or > > maybe more flexible - in particular, it'd allow us to enable huge pages > > only for some regions (like shared

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Cédric Villemain
On 7/7/25 16:51, Cédric Villemain wrote: * Others might use it to integrate PostgreSQL's own resources (e.g., "areas" of shared buffers) into policies. Hope this perspective is helpful. Can you explain how you want to manage this by an extension defined at the SQL level, when most of t

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Cédric Villemain
Hi Andres, Hi, On 2025-07-05 07:09:00 +, Cédric Villemain wrote: In my work on more careful PostgreSQL resource management, I've come to the conclusion that we should avoid pushing policy too deeply into the PostgreSQL core itself. Therefore, I'm quite skeptical about integrating NUMA-spec

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Cédric Villemain
On 7/7/25 16:51, Cédric Villemain wrote: * Others might use it to integrate PostgreSQL's own resources (e.g., "areas" of shared buffers) into policies. Hope this perspective is helpful. Can you explain how you want to manage this by an extension defined at the SQL level, when most of this stuf

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Andres Freund
Hi, On 2025-07-05 07:09:00 +, Cédric Villemain wrote: > In my work on more careful PostgreSQL resource management, I've come to the > conclusion that we should avoid pushing policy too deeply into the > PostgreSQL core itself. Therefore, I'm quite skeptical about integrating > NUMA-specific ma

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Tomas Vondra
On 7/7/25 16:51, Cédric Villemain wrote: >>> * Others might use it to integrate PostgreSQL's own resources (e.g., >>> "areas" of shared buffers) into policies. >>> >>> Hope this perspective is helpful. >> >> Can you explain how you want to manage this by an extension defined at >> the SQL level, wh

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Cédric Villemain
* Others might use it to integrate PostgreSQL's own resources (e.g., "areas" of shared buffers) into policies. Hope this perspective is helpful. Can you explain how you want to manage this by an extension defined at the SQL level, when most of this stuff has to be done when setting up shared me

Re: Adding basic NUMA awareness

2025-07-07 Thread Jakub Wartak
Hi Tomas, some more thoughts after the weekend: On Fri, Jul 4, 2025 at 8:12 PM Tomas Vondra wrote: > > On 7/4/25 13:05, Jakub Wartak wrote: > > On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra wrote: > > > > Hi! > > > >> 1) v1-0001-NUMA-interleaving-buffers.patch > > [..] > >> It's a bit more complic

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-07 Thread Tomas Vondra
On 7/5/25 09:09, Cédric Villemain wrote: > Hi Tomas, > > > I haven't yet had time to fully read all the work and proposals around > NUMA and related features, but I hope to catch up over the summer. > > However, I think it's important to share some thoughts before it's too > late, as you migh

Re: Adding basic NUMA awareness - Preliminary feedback and outline for an extensible approach

2025-07-05 Thread Cédric Villemain
Hi Tomas, I haven't yet had time to fully read all the work and proposals around NUMA and related features, but I hope to catch up over the summer. However, I think it's important to share some thoughts before it's too late, as you might find them relevant to the NUMA management code. 6)

Re: Adding basic NUMA awareness

2025-07-04 Thread Tomas Vondra
On 7/4/25 13:05, Jakub Wartak wrote: > On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra wrote: > > Hi! > >> 1) v1-0001-NUMA-interleaving-buffers.patch > [..] >> It's a bit more complicated, because the patch distributes both the >> blocks and descriptors, in the same way. So a buffer and its descrip

Re: Adding basic NUMA awareness

2025-07-04 Thread Jakub Wartak
On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra wrote: Hi! > 1) v1-0001-NUMA-interleaving-buffers.patch [..] > It's a bit more complicated, because the patch distributes both the > blocks and descriptors, in the same way. So a buffer and its descriptor > always end on the same NUMA node. This is on

Re: Adding basic NUMA awareness

2025-07-03 Thread Dmitry Dolgov
> On Wed, Jul 02, 2025 at 05:07:28PM +0530, Ashutosh Bapat wrote: > > There's also the question how this is related to other patches affecting > > shared memory - I think the most relevant one is the "shared buffers > > online resize" by Ashutosh, simply because it touches the shared memory. > > I

Re: Adding basic NUMA awareness

2025-07-03 Thread Ashutosh Bapat
On Wed, Jul 2, 2025 at 6:06 PM Tomas Vondra wrote: > > I'm not sure how you're rebuilding the freelist. Presumably it can > contain buffers that are no longer valid (after shrinking). How is that > handled to not break anything? I think the NUMA variant would do exactly > the same thing, except th

Re: Adding basic NUMA awareness

2025-07-02 Thread Tomas Vondra
On 7/2/25 13:37, Ashutosh Bapat wrote: > On Wed, Jul 2, 2025 at 12:37 AM Tomas Vondra wrote: >> >> >> 3) v1-0003-freelist-Don-t-track-tail-of-a-freelist.patch >> >> Minor optimization. Andres noticed we're tracking the tail of buffer >> freelist, without using it. So the patch removes that. >>

Re: Adding basic NUMA awareness

2025-07-02 Thread Ashutosh Bapat
On Wed, Jul 2, 2025 at 12:37 AM Tomas Vondra wrote: > > > 3) v1-0003-freelist-Don-t-track-tail-of-a-freelist.patch > > Minor optimization. Andres noticed we're tracking the tail of buffer > freelist, without using it. So the patch removes that. > The patches for resizing buffers use the lastFreeB

Adding basic NUMA awareness

2025-07-01 Thread Tomas Vondra
Hi, This is a WIP version of a patch series I'm working on, adding some basic NUMA awareness for a couple parts of our shared memory (shared buffers, etc.). It's based on Andres' experimental patches he spoke about at pgconf.eu 2024 [1], and while it's improved and polished in various ways, it's s