On Fri, Sep 12, 2014 at 4:55 PM, Tomas Vondra <t...@fuzzy.cz> wrote:
>> I'm actually quite surprised that you find batching to be a better
>> strategy than skimping on buckets, because I would have expected the
>> opposite, almost categorically. Batching means having to write out
>> the tuples we can't process right away and read them back in. If
>> that involves disk I/O, I think the cost of that I/O is going to be
>> FAR more than the overhead we would have incurred by searching
>> slightly longer bucket chains. If it didn't, then you could've set
>> work_mem higher and avoided batching in the first place.
>
> No, I don't find batching to be a better strategy. I just think this
> really needs more discussion than a simple "let's use NTUP_PER_BUCKET=4
> to avoid batching" follow-up patch.
>
> For example, let's say we switch to NTUP_PER_BUCKET=4 to avoid batching,
> and then discover we need to start batching anyway. Should we keep the
> settings, or should we revert to NTUP_PER_BUCKET=1? Or maybe not do that
> for nbatch=2, but do it for nbatch=16?
My first thought is to revert to NTUP_PER_BUCKET=1, but it's certainly
arguable. Either method, though, figures to be better than doing nothing,
so let's do something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers