[PATCH] Move clause_sides_match_join() into pathnode.h

2024-10-09 Thread James Hunter
We had two almost-identical copies of the utility function clause_sides_match_join() -- one in joinpath.c, and one in analyzejoins.c. Both copies were marked "inline," so we might as well just move the (inline) function definition into a common header file. I chose pathnode.h, because it is already
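
For reference, the helper in question tests whether a join clause's left and right sides line up with the outer and inner relations of a candidate join. A sketch of its shape, paraphrased from the joinpath.c copy (not necessarily the exact committed code; it relies on RestrictInfo and bms_is_subset() from nodes/pathnodes.h and nodes/bitmapset.h):

    static inline bool
    clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
                            Relids innerrelids)
    {
        if (bms_is_subset(rinfo->left_relids, outerrelids) &&
            bms_is_subset(rinfo->right_relids, innerrelids))
        {
            rinfo->outer_is_left = true;    /* lefthand side is outer */
            return true;
        }
        else if (bms_is_subset(rinfo->left_relids, innerrelids) &&
                 bms_is_subset(rinfo->right_relids, outerrelids))
        {
            rinfo->outer_is_left = false;   /* righthand side is outer */
            return true;
        }
        return false;       /* no good for these input relations */
    }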

Re: [PATCH] Move clause_sides_match_join() into pathnode.h

2024-10-10 Thread James Hunter
On Wed, Oct 9, 2024 at 5:26 PM David Rowley wrote: > > On Thu, 10 Oct 2024 at 08:38, James Hunter wrote: > > We had two almost-identical copies of the utility function > > clause_sides_match_join() -- one in joinpath.c, and one in > > analyzejoins.c. Both copies were mark

Re: Possible integer overflow in bringetbitmap()

2024-12-05 Thread James Hunter
On Wed, Dec 4, 2024 at 10:13 PM Michael Paquier wrote: > > totalpages is signed, and BlockNumber is unsigned. Hence in theory > you could always fall into a trap once totalpages gets higher than > (2^31 - 1), no? This is not going to be a problem in practice even if > the number of pages per ran

Re: Possible integer overflow in bringetbitmap()

2024-12-04 Thread James Hunter
On Tue, Nov 26, 2024 at 2:57 AM Evgeniy Gorbanyov wrote: > > Function bringetbitmap() in src/backend/access/brin/brin.c:560 returns type > int64. But the returned variable 'totalpages' is of type int. Maybe it makes > sense to change the type of variable 'totalpages' to int64 to avoid possible
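
A minimal standalone illustration of the arithmetic at issue (not the brin.c code; the constant 10 stands in for whatever scale factor the function applies to totalpages):

    #include <stdint.h>

    int64_t
    scaled_int(int totalpages)
    {
        /* int * int is computed in 32 bits and can overflow there,
         * *before* the implicit widening to int64_t at the return */
        return totalpages * 10;
    }

    int64_t
    scaled_int64(int64_t totalpages)
    {
        /* with totalpages declared int64, the multiply is 64-bit
         * throughout, so there is no intermediate overflow */
        return totalpages * 10;
    }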

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2025-01-03 Thread James Hunter
I think this discussion is getting away from a manageable scope of work... On Thu, Jan 2, 2025 at 1:09 PM Jim Nasby wrote: > That’s basically my argument for having workload management. If a system > becomes loaded enough for the global limit to start kicking in it’s likely > that query respon

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2025-01-03 Thread James Hunter
On Thu, Jan 2, 2025 at 7:21 PM Tomas Vondra wrote: > > I'm not opposed to having a some sort of "workload management" (similar > to what's available in some databases), but my guess is that's (at > least) an order of magnitude more complex than introducing the memory > limit discussed here. I can

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Fri, Dec 27, 2024 at 5:48 PM Tomas Vondra wrote: > Whenever I've been thinking about this in the past, it wasn't clear to > me how would we know when to start adjusting work_mem, because we don't > know which nodes will actually use work_mem concurrently. You certainly know the PostgreSQL sou

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-27 Thread James Hunter
Reviving this thread, because I am thinking about something related -- please ignore the "On Fri, Dec 27, 2024" date; this seems to be an artifact of me re-sending the message from the list archive. The original message was from January 28, 2024. On Fri, Dec 27, 2024 at 11:02 AM Tomas Vondra wro

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Sat, Dec 28, 2024 at 6:57 AM Tomas Vondra wrote: > > On 12/28/24 13:36, Anton A. Melnikov wrote: > > > > ... In more details let me suggest > > the following steps or parts: > > 1) realize memory limitation for a separate backend independent from the > > work_mem GUC; > > 2) add workmem-like li

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Sat, Dec 28, 2024 at 10:26 AM Jeremy Schneider wrote: Thanks for the feedback, Jeremy! > While I don't have a detailed design in mind, I'd like to add a strong > +1 on the general idea that work_mem is hard to effectively use because > queries can vary so widely in how many nodes might need

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Sat, Dec 28, 2024 at 11:24 PM Jim Nasby wrote: > > IMHO none of this will be very sane until we actually have cluster-level > limits. One sudden burst in active connections and you still OOM the instance. Fwiw, PG does support a "max_connections" GUC, so a backend/connection-level limit, time
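
As a back-of-the-envelope sketch of the bound that max_connections plus a per-backend cap would give (postgresql.conf style; "backend_work_mem" is the hypothetical GUC being discussed here, not an existing setting):

    max_connections = 100          # hard cap on concurrent backends
    #backend_work_mem = 256MB      # hypothetical per-backend memory cap
    # worst case across the cluster: ~100 * 256MB = 25GB of working memory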

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Mon, Dec 30, 2024 at 3:12 PM David Rowley wrote: > > On Sat, 28 Dec 2024 at 08:14, James Hunter wrote: > > 2. We use this backend_work_mem to "adjust" work_mem values used by > > the executor. (I don't care about the optimizer right now -- optimizer > >

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-30 Thread James Hunter
On Mon, Dec 30, 2024 at 2:56 PM David Rowley wrote: > > On Tue, 31 Dec 2024 at 10:11, James Hunter wrote: > > Does PostgreSQL currently rescan Hash Joins when they are "no longer > > needed," to free work_mem early? If so, then I would try to reuse this > > ex

Re: RFC: Packing the buffer lookup table

2025-01-31 Thread James Hunter
On Wed, Jan 29, 2025 at 11:49 PM Matthias van de Meent wrote: > > Hi, > > Some time ago I noticed that every buffer table entry is quite large at 40 > bytes (+8): 16 bytes of HASHELEMENT header (of which the last 4 bytes are > padding), 20 bytes of BufferTag, and 4 bytes for the offset into the
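
A simplified sketch of the layout those numbers describe (field names abbreviated; not the exact dynahash/buf_table declarations):

    typedef struct BufferTag        /* 20 bytes: identifies a disk block */
    {
        Oid           spcOid;       /*  4: tablespace */
        Oid           dbOid;        /*  4: database */
        RelFileNumber relNumber;    /*  4: relation */
        ForkNumber    forkNum;      /*  4: fork */
        BlockNumber   blockNum;     /*  4: block within fork */
    } BufferTag;

    typedef struct BufferLookupEnt
    {
        /* preceded by dynahash's HASHELEMENT: 8-byte next-pointer +
         * 4-byte hash value + 4 bytes padding = 16 bytes */
        BufferTag key;              /* 20 bytes */
        int       id;               /*  4 bytes: offset into buffer array */
    } BufferLookupEnt;              /* 16 + 20 + 4 = 40 bytes per entry,
                                     * plus the "+8" quoted above */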

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-07 Thread James Hunter
On Fri, Feb 7, 2025 at 12:09 PM Tomas Vondra wrote: > ... > Yes, I think that's pretty much the idea. Except that I don't think we > need to look at the |F| at all - it will have more impact for small |F|, > of course, but it doesn't hurt for large |F|. > > I think it'll probably need to consider

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-07 Thread James Hunter
On Wed, Feb 5, 2025 at 4:23 AM Tomas Vondra wrote: > > If the requirement is that all "dimensions" only join to the fact table > (which in this example would be "A" I think) through a FK, then why > would these joins be illegal? > > ... > Essentially, this defines a "dimension" as a relation that
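
A concrete (hypothetical) example of the query shape under discussion: every dimension joins the fact table through a NOT NULL foreign key, so each dimension join preserves the fact-table row count and the join order barely matters:

    SELECT f.order_id, c.name, p.title, d.year
    FROM   fact_orders f
    JOIN   dim_customer c ON c.id = f.customer_id  -- FK: customer_id -> c.id
    JOIN   dim_product  p ON p.id = f.product_id   -- FK: product_id  -> p.id
    JOIN   dim_date     d ON d.id = f.date_id      -- FK: date_id     -> d.id
    WHERE  f.order_id = 42;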

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-10 Thread James Hunter
On Fri, Jan 24, 2025 at 5:48 PM Jeff Davis wrote: > > On Fri, 2025-01-24 at 17:04 -0800, James Hunter wrote: > > Generating "high memory" vs. "low memory" paths would be tricky, > > because the definition of "high" vs. "low" depends on th

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-10 Thread James Hunter
Hunter wrote: > > On Fri, Jan 24, 2025 at 5:48 PM Jeff Davis wrote: > > > > On Fri, 2025-01-24 at 17:04 -0800, James Hunter wrote: > > > Generating "high memory" vs. "low memory" paths would be tricky, > > > because the definition of "high

Re: AIO v2.3

2025-02-11 Thread James Hunter
On Mon, Feb 10, 2025 at 2:40 PM Thomas Munro wrote: > ... > Problem statement: You want to be able to batch I/O submission, ie > make a single call to io_uring_enter() (and other mechanisms) to start > several I/Os, but the code that submits is inside StartReadBuffers() > and the code that knows how

Re: Add the ability to limit the amount of memory that can be allocated to backends.

2025-01-08 Thread James Hunter
On Mon, Jan 6, 2025 at 1:07 PM Jim Nasby wrote: > > I’ve been saying “workload management” for lack of a better term, but my > initial suggestion upthread was to simply stop allowing new transactions to > start if global work_mem consumption exceeded some threshold. That’s > simplistic enough t

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-01-24 Thread James Hunter
On Wed, Jan 22, 2025 at 1:13 PM Tomas Vondra wrote: > > On 1/10/25 19:00, James Hunter wrote: > > ... > > I wouldn’t change the existing planning logic (at least not in the > > initial implementation). So the existing planning logic would choose > > between differe

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-01-24 Thread James Hunter
On Tue, Jan 21, 2025 at 1:26 PM Jeff Davis wrote: > > On Fri, 2025-01-10 at 10:00 -0800, James Hunter wrote: > > How should “query_work_mem” work? Let’s start with an example: > > suppose > > we have an OLAP query that has 2 Hash Joins, and no other operators > > tha

Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-01-10 Thread James Hunter
I want customers to be able to run large OLAP queries on PostgreSQL, using as much memory as possible, to avoid spilling — without running out of memory. There are other ways to run out of memory, but the fastest and easiest way, on an OLAP query, is to use a lot of work_mem. (This is true for any
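
To make the multiplication concrete (standard PostgreSQL behavior; the numbers are illustrative):

    SET work_mem = '64MB';
    -- work_mem is a per-sort/per-hash limit, not a per-query limit: a
    -- plan with 4 Hash Joins and 2 Sorts may legitimately use up to
    -- 6 * 64MB = 384MB, and a parallel plan multiplies that again by
    -- the number of workers.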

Re: Possible integer overflow in bringetbitmap()

2025-01-10 Thread James Hunter
On Fri, Dec 20, 2024 at 3:22 PM Michael Paquier wrote: > > On Tue, Dec 10, 2024 at 12:33:08PM +0900, Michael Paquier wrote: > > Sure, you could do (a) and (b) together. It also seems to me that it > > is just simpler to make totalpages an int64 to map automatically with > > the result expected by

Re: AIO v2.3

2025-02-12 Thread James Hunter
On Tue, Feb 11, 2025 at 1:44 PM Andres Freund wrote: > ... > Alternatively we could make pgaio_batch_begin() basically start a critical > section, but that doesn't seem like a good idea, because too much that needs > to happen around buffered IO isn't compatible with critical sections. > > > Does
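
The general shape being discussed, as pseudocode (using the pgaio_batch_begin() name from the quote; the real API, and its interaction with critical sections, is exactly what is being debated):

    pgaio_batch_begin();            /* start staging; don't submit yet */
    for (int i = 0; i < nblocks; i++)
        StartReadBuffers(...);      /* each call only queues its IO */
    pgaio_batch_end();              /* one io_uring_enter() submits all
                                     * of the staged IOs together */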

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-11 Thread James Hunter
On Tue, Feb 11, 2025 at 10:00 AM Jeff Davis wrote: > > On Mon, 2025-02-10 at 19:09 -0800, James Hunter wrote: > > I think it makes sense to split the work into two parts: one part > > that > > improves SQL execution, and a second part that improves the > >

Re: a pool for parallel worker

2025-03-24 Thread James Hunter
On Tue, Mar 11, 2025 at 5:39 AM Andy Fan wrote: > Currently when a query needs some parallel workers, postmaster spawns > some backends for this query and when the work is done, the backends > exit. There is some wastage here, e.g. syscache, relcache, smgr cache, > vfd cache and fork/exit syscall

Re: pg_atomic_compare_exchange_*() and memory barriers

2025-03-24 Thread James Hunter
On Sat, Mar 8, 2025 at 7:21 AM Andres Freund wrote: > > FWIW, I am fairly certain that any non-toy algorithm that requires a full > memory barrier instead of just an acquire in case of a CAS failure is chock > full of concurrency bugs. Yeah -- off the top of my head, I can think of only two CAS p
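
A sketch of the usual pattern, written with C11 atomics rather than pg_atomic_*: the failure path of a CAS loop only needs to observe the latest value and retry, so acquire ordering on failure suffices.

    #include <stdatomic.h>
    #include <stdint.h>

    static void
    add_with_cas(_Atomic uint32_t *v, uint32_t delta)
    {
        uint32_t old = atomic_load_explicit(v, memory_order_relaxed);

        /* On failure, the CAS reloads 'old' and we simply retry; nothing
         * is published until the CAS succeeds, so the failure path needs
         * no release semantics. */
        while (!atomic_compare_exchange_weak_explicit(
                   v, &old, old + delta,
                   memory_order_acq_rel,    /* ordering on success */
                   memory_order_acquire))   /* ordering on failure */
            ;
    }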

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-04-10 Thread James Hunter
On Wed, Apr 9, 2025 at 11:00 PM Thomas Munro wrote: > > On Wed, Apr 9, 2025 at 1:46 PM James Hunter wrote: > > On Mon, Apr 7, 2025 at 7:34 PM Thomas Munro wrote: > > > On Thu, Feb 13, 2025 at 1:40 PM Melanie Plageman > > > wrote: > > > > Thomas menti

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-04-08 Thread James Hunter
On Mon, Apr 7, 2025 at 7:34 PM Thomas Munro wrote: > > On Thu, Feb 13, 2025 at 1:40 PM Melanie Plageman > wrote: > > Thomas mentioned this to me off-list, and I think he's right. We > > likely need to rethink the way parallel bitmap heap scan workers get > > block assignments for reading and pref

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-04-14 Thread James Hunter
On Thu, Apr 10, 2025 at 8:15 PM Thomas Munro wrote: > > On Fri, Apr 11, 2025 at 5:50 AM James Hunter > wrote: > > I am looking at the pre-streaming code, in PG 17, as I am not familiar > > with the PG 18 "streaming" code. Back in PG 17, nodeBitmapHeapscan.

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-04-15 Thread James Hunter
Thanks for the comments! On Tue, Apr 15, 2025 at 3:11 AM Andres Freund wrote: > > Hi, > > On 2025-04-14 09:58:19 -0700, James Hunter wrote: > > I see two orthogonal problems, in processing Bitmap Heap pages in > > parallel: (1) we need to prefetch enough pages, far e

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-27 Thread James Hunter
I don't think the patches would be as short as you expect. For example, if Agg nodes behaved as in your example, quoted at the top of this email, then we wouldn't need Patch 2's additional logic to assign workmem_limit to Aggs (and we wouldn't need the corresponding logic in Patch 3, to assign workmem (estimate) to Aggs, either). But Aggs aren't as simple as in your example -- they have Hash limits and Sort limits; they have a side-chain of Agg nodes; they have input sets they need to Sort; etc. And so we need a couple dozen lines of code to handle them. Thanks for the feedback, James Hunter
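
For instance, a single Agg node can carry several independent memory consumers (standard PostgreSQL behavior; the query is illustrative):

    EXPLAIN SELECT count(DISTINCT a), array_agg(b ORDER BY b)
    FROM t GROUP BY c;
    -- A GroupAggregate may sort its input (one limit), and each
    -- DISTINCT / ORDER BY aggregate argument maintains its own sort
    -- (more limits); a HashAggregate variant has a hash-table limit
    -- instead, spilling to disk when it is exceeded.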

Re: Should work_mem be stable for a prepared statement?

2025-03-05 Thread James Hunter
If we discover, at runtime, that the planner's assumptions were so wrong that they'll lead us to execute a sub-optimal plan, then maybe we can re-plan. But I explicitly wouldn't re-plan a prepared statement, since the customer's expectation is that it has already been prepared. James Hunter [1] https://www.postgresql.org/message-id/flat/CAJVSvF5n3_uEGW5GZSRehDuTfz7XVDohbn7tVJ%2B2ZnweQEVFrQ%40mail.gmail.com#abc6e69a396bb9f6505bf33260670a1f

Re: Make tuple deformation faster

2025-03-05 Thread James Hunter
A plain Boolean can be a byte, and memory is byte-addressable. For a bitfield, however, the CPU has to read from or write to the byte that contains the bit, but then it also has to mask out the *other* bits in that bitfield. This is a data dependency, so it stalls the CPU pipeline. So Booleans tend to be faster than bitfields, because they avoid a pipeline stall. James Hunter
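
A minimal illustration of the difference (hypothetical structs): writing a bitfield member is a read-modify-write of the containing byte, while writing a plain bool is an independent byte store.

    #include <stdbool.h>

    struct bitfield_flags { unsigned int a : 1; unsigned int b : 1; };
    struct bool_flags     { bool a; bool b; };

    void
    set_b(struct bitfield_flags *bf, struct bool_flags *bb)
    {
        bf->b = 1;      /* load the shared byte, mask to preserve 'a',
                         * OR in the new bit, store: a data dependency */
        bb->b = true;   /* single byte store; never reads the old value */
    }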

Re: int64 support in List API

2025-03-03 Thread James Hunter
On Sun, Jan 19, 2025 at 9:12 PM Yura Sokolov wrote: > > 20.01.2025 07:36, Tom Lane writes: ... > > This has been discussed before, and we've felt that it wasn't worth > > the additional code duplication. I would not favor approaching this > > with the mindset of lets-copy-and-paste-all-the-code. >

Re: Make tuple deformation faster

2025-03-05 Thread James Hunter
On Wed, Mar 5, 2025 at 12:16 PM Jeff Davis wrote: > > On Wed, 2025-03-05 at 11:33 -0800, James Hunter wrote: > > For a bitfield, however, the CPU has to read from or write to the > > byte > > that contains the bit, but then it also has to mask out the *other* > > bi

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-24 Thread James Hunter
On Mon, Feb 24, 2025 at 6:54 PM Jeff Davis wrote: > > On Mon, 2025-02-24 at 12:46 -0800, James Hunter wrote: > > Attached please find the patch set I mentioned, above, in [1]. It > > consists of 4 patches that serve as the building blocks for and a > > prototype of th

Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators

2025-02-24 Thread James Hunter
"workmem_limit" limit. > Can you write a useful extension with just the above two core patches? I think so; I will attach a patch for that as well. (This will be "Patch 4"; note that "Patch 2" is a prerequisite for "Patch 3".) > Regards, > Jeff Davis Thanks, James Hunter

Re: a very significant fraction of the buildfarm is now pink

2025-02-26 Thread James Hunter
On Fri, Feb 21, 2025 at 5:09 PM Robert Haas wrote: > > On Fri, Feb 21, 2025 at 7:04 PM Tom Lane wrote: > > A very significant fraction of the buildfarm is now pink. > > If you don't have a fix pretty nearly ready, please revert. > > When we're going to do a release, you want no commits for at lea

Re: [PATCH] Optimize SP-GiST text leaf comparisons with memcmp

2025-02-26 Thread James Hunter
On Wed, Feb 12, 2025 at 5:14 PM ahmedashour wrote: > > Thank you for your feedback and review! I have addressed the comments and > updated the patch accordingly. Below are the details of the changes: > > Please review the updated patch and let me know if there are any further > issues or suggest

Re: Adjusting hash join memory limit to handle batch explosion

2025-02-25 Thread James Hunter
On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote: > > I've pushed the first (and main) part of the patch series, after some > more cleanup and comment polishing. Two comments on your merged patch -- First, it's easier to see what's going on if we overlook the logic to round to nearest power o
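
A rough worked example of the trade-off, with round numbers and ignoring the power-of-two rounding, as above (sizes illustrative; each batch keeps its own BufFile buffer of roughly one block, ~8kB):

    inner side = 1GB, work_mem = 4MB
      => nbatch = 1GB / 4MB = 256 batches
      => batch buffers alone: 256 * ~8kB ~= 2MB, already half of work_mem;
         each further doubling of nbatch doubles that overhead, which is
         the "batch explosion" the patch series tries to rein in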

Re: Adjusting hash join memory limit to handle batch explosion

2025-02-25 Thread James Hunter
On Tue, Feb 25, 2025 at 9:39 AM Tomas Vondra wrote: > > On 2/25/25 17:30, James Hunter wrote: > > On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote: > > -- OK, but the customer *didn't* set their workmem to 32 MB. (If they > > had, we wouldn't need this pa