Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-02-07 Thread Jeff Davis
On Wed, 2025-01-15 at 21:01 +1300, David Rowley wrote: > On Tue, 14 Jan 2025 at 09:09, Jeff Davis wrote: > > Attached v2. > > This needs to be rebased due to b4a07f532. I moved the patch to the other thread here: https://www.postgresql.org/message-id/09b325921e50bc3a3217fb01d8eb512c89ee36f1.ca.

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-15 Thread David Rowley
On Tue, 14 Jan 2025 at 09:09, Jeff Davis wrote: > Attached v2. This needs to be rebased due to b4a07f532. The following repalloc was removed by that commit. - totalsize = MAXALIGN(firstTuple->t_len) + hashtable->additionalsize; - firstTuple = repalloc(firstTuple, totalsize); + totalsize = MAXAL

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-13 Thread Jeff Davis
On Fri, 2025-01-10 at 23:30 +1300, David Rowley wrote: > Since bump.c does not add headers to the palloc'd chunks, I think the > following code from hash_agg_entry_size() shouldn't be using > CHUNKHDRSZ anymore. Fixed. I also tried to account for the power-of-two allocations for the transition va

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-10 Thread David Rowley
On Thu, 9 Jan 2025 at 09:50, Jeff Davis wrote: > Attached POC patch, which reduces memory usage by ~15% for a simple > distinct query on an integer key. Performance is the same or perhaps a > hair faster. > > It's not many lines of code, but the surrounding code might benefit > from some refactori

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-08 Thread Jeff Davis
On Mon, 2025-01-06 at 12:34 -0800, Jeff Davis wrote: > If we separate out 4, we can use the Bump allocator for 2 & 3. Attached POC patch, which reduces memory usage by ~15% for a simple distinct query on an integer key. Performance is the same or perhaps a hair faster. It's not many lines of code

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-06 Thread Jeff Davis
On Sat, 2025-01-04 at 09:24 +0700, John Naylor wrote: > FYI, there is a proposal for that at > https://www.postgresql.org/message-id/817d244237878cebdff0bc363718feaf49a1ea7d.ca...@j-davis.com I had intended to commit some of those patches soon, so if someone sees a conflict with the ideas in this

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-06 Thread Jeff Davis
On Thu, 2025-01-02 at 13:38 +1300, David Rowley wrote: > On Thu, 2 Jan 2025 at 13:33, Tom Lane wrote: > > > > David Rowley writes: > > > I think what would be more interesting is seeing if we can store > > > the > > > TupleHashEntryData.firstTuple in a bump context. > > > > Are you saying the s

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-03 Thread John Naylor
On Thu, Jan 2, 2025 at 7:24 AM David Rowley wrote: > > Bump wouldn't work due to the SH_FREE() in SH_GROW() when resizing the > table. If sizeof(TupleHashEntryData) were a power-of-two, then there'd > be no wastage as the hash table always has a power-of-two bucket count > and two powers-of-two mu

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-01 Thread David Rowley
On Thu, 2 Jan 2025 at 13:33, Tom Lane wrote: > > David Rowley writes: > > I think what would be more interesting is seeing if we can store the > > TupleHashEntryData.firstTuple in a bump context. > > Are you saying the same as above, or something different? I thought you only meant store the has

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-01 Thread Tom Lane
David Rowley writes: > On Thu, 2 Jan 2025 at 12:18, Tom Lane wrote: >> I thought for a bit about whether we shouldn't try to account for >> palloc power-of-2-block-size overhead here. That omission would >> typically be a far larger error than the one you are fixing. However, >> given that the

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-01 Thread David Rowley
On Thu, 2 Jan 2025 at 12:18, Tom Lane wrote: > I thought for a bit about whether we shouldn't try to account for > palloc power-of-2-block-size overhead here. That omission would > typically be a far larger error than the one you are fixing. However, > given that the inputs to hash_agg_entry_siz

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-01 Thread Tom Lane
David Rowley writes: > While reading nodeAgg.c, I noticed code that uses CHUNKHDRSZ to help > figure out how much memory a tuple uses so the code knows when to > spill to disk. CHUNKHDRSZ is currently set to 16 bytes, which was > fine when that code was added, but it's a few years out-of-date sin

Incorrect CHUNKHDRSZ in nodeAgg.c

2025-01-01 Thread David Rowley
While reading nodeAgg.c, I noticed code that uses CHUNKHDRSZ to help figure out how much memory a tuple uses so the code knows when to spill to disk. CHUNKHDRSZ is currently set to 16 bytes, which was fine when that code was added, but it's a few years out-of-date since c6e0fe1f2 in 2022. The att