On Fri, Sep 12, 2014 at 4:55 PM, Tomas Vondra <t...@fuzzy.cz> wrote:
>> I'm actually quite surprised that you find batching to be a better
>> strategy than skimping on buckets, because I would have expected the
>> opposite, almost categorically. Batching means having to write out
>> the tuples we can't process right away and read them back in. If
>> that involves disk I/O, I think the cost of that I/O is going to be
>> FAR more than the overhead we would have incurred by searching
>> slightly longer bucket chains. If it didn't, then you could've set
>> work_mem higher and avoided batching in the first place.
>
> No, I don't find batching to be a better strategy. I just think this
> really needs more discussion than a simple "let's use NTUP_PER_BUCKET=4
> to avoid batching" follow-up patch.
>
> For example, let's say we switch to NTUP_PER_BUCKET=4 to avoid batching,
> and then discover we need to start batching anyway. Should we keep the
> settings, or should we revert to NTUP_PER_BUCKET=1? Or maybe not do that
> for nbatch=2, but do it for nbatch=16?
My first thought is to revert to NTUP_PER_BUCKET=1, but it's certainly
arguable. Either method, though, figures to be better than doing nothing,
so let's do something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers