Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost <sfr...@snowman.net> writes:
> > I don't see hash_mem as being any kind of proper fix- it's just punting
> > to the user saying "we can't figure this out, how about you do it" and,
> > worse, it's in conflict with how we already ask the user that question.
> > Turning it into a multiplier doesn't change that either.
>
> Have you got a better proposal that is reasonably implementable for v13?
> (I do not accept the argument that "do nothing" is a better proposal.)
>
> I agree that hash_mem is a stopgap, whether it's a multiplier or no,
> but at this point it seems difficult to avoid inventing a stopgap.
> Getting rid of the process-global work_mem setting is a research project,
> and one I wouldn't even count on having results from for v14.  In the
> meantime, it seems dead certain that there are applications for which
> the current behavior will be problematic.  hash_mem seems like a cleaner
> and more useful stopgap than the "escape hatch" approach, at least to me.
Have we heard from people running actual applications where there is a
problem with raising work_mem to simply match what's already happening
with the v12 behavior?  Sure, there have been examples on this thread of
people who know the backend well showing how the default work_mem will
cause the v13 HashAgg to spill to disk when given a query with poor
estimates, and that's slower than v12, where it ignored work_mem and used
a bunch of memory- but it was also shown that raising work_mem addresses
that issue and brings v12 and v13 back in line.

There was a concern raised that other nodes might then use more memory-
but there's nothing new there: if you wanted to avoid batching with a
HashJoin in v12 you'd have exactly the same issue, and yet folks raise
work_mem all the time to address this, and to get that HashAgg plan in
the first place too when the estimates aren't so far off.

There now seem to be suggestions that not only should we have a new GUC,
but that it should default to something higher than work_mem (rather than
equal to it, or a multiplier of 1.0, or whatever)- *twice* work_mem or
larger- meaning that people whose systems are working just fine, whose
estimates represent their workload, and who get the plans they want may
then start seeing plan differences and increased memory utilization in
places where they *don't* want that, all because we're scared that
someone, somewhere, might see a regression due to HashAgg spilling to
disk.

So, no, I don't agree that 'do nothing' (except ripping out the one GUC
that was already added) is a worse proposal than adding another
work_mem-like thing that applies only to some node types.  There's no way
we'd even be considering such an approach during the regular development
cycle- there would be calls for a proper holistic view, at least to the
point where every node type that could possibly allocate a reasonable
chunk of memory would be covered.

Thanks,

Stephen
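To make the comparison concrete, here's a minimal sketch of the "just
raise work_mem" approach described above.  The table and column names are
hypothetical and the exact EXPLAIN output varies by version, but the knob
involved is the same one people already use to avoid HashJoin batching:

-- With the default work_mem, a v13 HashAgg over a badly misestimated
-- input may spill to disk (v12 would simply have blown past work_mem
-- and stayed in memory):
SET work_mem = '4MB';
EXPLAIN (ANALYZE, COSTS OFF)
  SELECT some_col, count(*) FROM some_big_table GROUP BY some_col;

-- Raising work_mem to roughly what v12 was actually using brings the
-- two releases back in line, and is the same adjustment already made
-- today to avoid a batched HashJoin or to get a HashAgg plan at all:
SET work_mem = '256MB';
EXPLAIN (ANALYZE, COSTS OFF)
  SELECT some_col, count(*) FROM some_big_table GROUP BY some_col;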