On Tue, Feb 25, 2025 at 9:39 AM Tomas Vondra wrote:
>
> On 2/25/25 17:30, James Hunter wrote:
> > On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote:
> > -- OK, but the customer *didn't* set their workmem to 32 MB. (If they
> > had, we wouldn't need this patch -- but we *do* need this patch, wh
On 2/25/25 17:30, James Hunter wrote:
> On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote:
>>
>> I've pushed the first (and main) part of the patch series, after some
>> more cleanup and comment polishing.
>
> Two comments on your merged patch --
>
> First, it's easier to see what's going on i
On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote:
>
> I've pushed the first (and main) part of the patch series, after some
> more cleanup and comment polishing.
Two comments on your merged patch --
First, it's easier to see what's going on if we overlook the logic to
round to nearest power o
I've pushed the first (and main) part of the patch series, after some
more cleanup and comment polishing.
As explained in my previous message, I'm not sure about 0002. I don't
know if we need to worry about it (no reports AFAICS). And while the
patch works I'm not sure it's the best fix, or whethe
On Thu, Feb 6, 2025 at 1:48 PM Tomas Vondra wrote:
>
> Hi,
>
> Here's a slightly simplified version of the "balancing" patch. I decided
> to stop increasing the nbucket value at runtime, even if the hashtable
> grows larger than the memory limit (which is what we used to calculate
> the initial nb
On 1/13/25 17:32, Melanie Plageman wrote:
> On Sat, Jan 11, 2025 at 7:42 PM Tomas Vondra wrote:
>>
>> I had a quiet evening yesterday, so I decided to take a stab at this and
>> see how hard would it be, and how bad would the impact be. Attached is
>> an experimental patch, doing the *bare* minimu
On Sat, Jan 11, 2025 at 7:42 PM Tomas Vondra wrote:
>
> I had a quiet evening yesterday, so I decided to take a stab at this and
> see how hard would it be, and how bad would the impact be. Attached is
> an experimental patch, doing the *bare* minimum for a simple query:
>
> 1) It defines a limit
On 1/10/25 15:54, Melanie Plageman wrote:
> On Thu, Jan 9, 2025 at 6:59 PM Tomas Vondra wrote:
>>
>> ...
>
>> Robert's idea kept using buffered files, but limited how many we can
>> fill at any phase. Say we'd use a limit of 1024 batches, but we actually
>> need 1M batches. Then we'd do the build
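The preview cuts Robert's scheme off, but the arithmetic behind it can be sketched: if at most `limit` batch files can be filled per pass, each pass over the spilled data multiplies the number of reachable batches by `limit`. A small illustrative helper (the function name and the details of the scheme are my assumptions, not code from the thread):

```c
/*
 * Assumed sketch of the multi-pass partitioning idea quoted above: if
 * only `limit` batch files may be filled at once, each pass over the
 * data multiplies the number of distinct batches by `limit`. Returns
 * how many passes are needed to reach `needed` batches.
 */
static int
partition_passes(unsigned long needed, unsigned long limit)
{
    int passes = 0;
    unsigned long reach = 1;    /* batches reachable after `passes` passes */

    while (reach < needed)
    {
        reach *= limit;
        passes++;
    }
    return passes;
}
```

So a limit of 1024 would handle the quoted 1M-batch case in two passes (1024 x 1024 = 1M), at the cost of writing most tuples to disk twice.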
On 1/11/25 00:09, Melanie Plageman wrote:
> On Fri, Jan 10, 2025 at 11:18 AM Tomas Vondra wrote:
>>
>> On 1/10/25 15:54, Melanie Plageman wrote:
>>> On Thu, Jan 9, 2025 at 6:59 PM Tomas Vondra wrote:
>>> I think this is because we get the batch based on
>>>
>>> *batchno = pg_rotate_right32(has
On Fri, Jan 10, 2025 at 11:18 AM Tomas Vondra wrote:
>
> On 1/10/25 15:54, Melanie Plageman wrote:
> > On Thu, Jan 9, 2025 at 6:59 PM Tomas Vondra wrote:
> > I think this is because we get the batch based on
> >
> > *batchno = pg_rotate_right32(hashvalue, hashtable->log2_nbuckets) &
> > (nbatch -
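Since the preview truncates the formula, here is a minimal standalone sketch of how that expression splits one hash value into a bucket number and a batch number (an illustration, not the actual PostgreSQL source; `rotate_right32` stands in for `pg_rotate_right32`):

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's pg_rotate_right32() from pg_bitutils.h. */
static inline uint32_t
rotate_right32(uint32_t word, int n)
{
    return (word >> n) | (word << (32 - n));
}

/*
 * Sketch of the quoted formula: the low log2_nbuckets bits of the hash
 * select the bucket, and the bits just above them select the batch.
 * Both nbuckets and nbatch must be powers of two. Because the batch
 * bits sit above the bucket bits, doubling nbatch only ever adds a new
 * high bit: a tuple's old batch number is the new one masked down.
 */
static void
get_bucket_and_batch(uint32_t hashvalue, int log2_nbuckets,
                     uint32_t nbuckets, uint32_t nbatch,
                     uint32_t *bucketno, uint32_t *batchno)
{
    *bucketno = hashvalue & (nbuckets - 1);
    *batchno = rotate_right32(hashvalue, log2_nbuckets) & (nbatch - 1);
}
```

That masking property is what lets nbatch grow at runtime without invalidating tuples already written to earlier batch files.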
On 1/10/25 15:54, Melanie Plageman wrote:
> On Thu, Jan 9, 2025 at 6:59 PM Tomas Vondra wrote:
>>
>>
>>
>> On 1/9/25 21:42, Melanie Plageman wrote:
>>>
>>> I was excited about your raw file experiment. As Robert and you point
>>> out -- we may need a file per batch, but for most of the hash joi
On Thu, Jan 9, 2025 at 6:59 PM Tomas Vondra wrote:
>
>
>
> On 1/9/25 21:42, Melanie Plageman wrote:
> >
> > I was excited about your raw file experiment. As Robert and you point
> > out -- we may need a file per batch, but for most of the hash join's
> > execution we don't need to keep buffers for
On 1/9/25 21:42, Melanie Plageman wrote:
> On Tue, Dec 31, 2024 at 6:07 PM Tomas Vondra wrote:
>>
>> This means that ultimately it's either (1) or (3), and the more I've
>> been looking into this the more I prefer (1), for a couple reasons:
>>
>> * It's much simpler (it doesn't really change an
On 1/9/25 23:18, Melanie Plageman wrote:
> On Sun, Jan 5, 2025 at 10:00 PM Tomas Vondra wrote:
>>
>> I think the general idea and formula explained in [1] is right, but
>> while working on the PoC patch I started to think about how to formalize
>> this. And I ended up creating two tables that I th
On Sun, Jan 5, 2025 at 10:00 PM Tomas Vondra wrote:
>
> I think the general idea and formula explained in [1] is right, but
> while working on the PoC patch I started to think about how to formalize
> this. And I ended up creating two tables that I think visualize it
> pretty nicely.
>
> Imagine a
On Tue, Dec 31, 2024 at 6:07 PM Tomas Vondra wrote:
>
> This means that ultimately it's either (1) or (3), and the more I've
> been looking into this the more I prefer (1), for a couple reasons:
>
> * It's much simpler (it doesn't really change anything on the basic
> behavior, doesn't introduce a
On 1/9/25 17:17, Melanie Plageman wrote:
> On Tue, Dec 31, 2024 at 6:07 PM Tomas Vondra wrote:
>>
>> So I decided to revisit the three patches from 2019. Attached are
>> rebased and cleaned up versions. A couple comments on each one:
>>
>>
>> 1) v20241231-adjust-limit-0001-Account-for-batch-fil
On Tue, Dec 31, 2024 at 6:07 PM Tomas Vondra wrote:
>
> So I decided to revisit the three patches from 2019. Attached are
> rebased and cleaned up versions. A couple comments on each one:
>
>
> 1) v20241231-adjust-limit-0001-Account-for-batch-files-in-ha.patch
>
> I believe this is the way to go,
On 1/6/25 19:50, Robert Haas wrote:
> On Mon, Jan 6, 2025 at 11:51 AM Tomas Vondra wrote:
>> I wonder if maybe a better solution would be to allow BufFiles with
>> smaller buffers, not just hard-coded 8kB. OTOH I'm not sure how much
>> that helps, before the buffering stops being effective as t
On Mon, Jan 6, 2025 at 11:51 AM Tomas Vondra wrote:
> I wonder if maybe a better solution would be to allow BufFiles with
> smaller buffers, not just hard-coded 8kB. OTOH I'm not sure how much
> that helps, before the buffering stops being effective as the buffer
> gets smaller. I mean, we only ha
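A quick back-of-the-envelope calculation shows why shrinking the buffers only delays the problem: the total overhead is nbatch times the buffer size, so it still scales linearly with the batch count (illustrative numbers only; 8kB is the BufFile buffer size discussed above):

```c
#include <stddef.h>

/*
 * Total memory spent on per-batch file buffers. With 1M batches, an
 * 8kB buffer per batch costs 8GB; halving the buffer to 4kB still
 * leaves 4GB, orders of magnitude above a typical work_mem.
 */
static size_t
total_buffer_overhead(size_t nbatch, size_t buffer_size)
{
    return nbatch * buffer_size;
}
```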
On 1/6/25 16:42, Robert Haas wrote:
> Hi Tomas,
>
> Thanks for working on this. I haven't studied this problem recently,
> but here are some ideas that occur to me:
>
> 1. Try to reduce the per-batch overhead.
>
Yeah. The "use files without buffering" approach may be seen as an
extreme versi
Hi Tomas,
Thanks for working on this. I haven't studied this problem recently,
but here are some ideas that occur to me:
1. Try to reduce the per-batch overhead.
2. Stop increasing the number of batches when the per-batch overhead
exceeds a small percentage of work_mem (10%? 5%? 1%?).
If you've
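Idea 2 above can be sketched as a simple guard evaluated before each doubling of nbatch. The threshold, the 8kB figure, and the function are illustrative assumptions, not PostgreSQL code:

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH_BUFFER_SIZE 8192  /* assumed: one 8kB buffer per batch file */

/*
 * Illustrative guard for idea 2: allow doubling nbatch only while the
 * resulting per-batch buffer overhead stays below a chosen fraction of
 * work_mem. Doubling past that point frees less hashtable memory than
 * the new batch buffers consume.
 */
static bool
batch_increase_allowed(size_t nbatch, size_t work_mem_bytes,
                       double max_overhead_frac)
{
    size_t new_overhead = 2 * nbatch * (size_t) BATCH_BUFFER_SIZE;

    return new_overhead <= (size_t) (work_mem_bytes * max_overhead_frac);
}
```

With work_mem at 4MB and a 10% cap, for example, doubling from 16 to 32 batches would still be allowed, but doubling from 32 to 64 would not.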