On Thu, 2020-06-25 at 09:37 -0700, Andres Freund wrote:
> > Let's say you have work_mem=32MB and a query that's expected to use
> > 16MB of memory. In reality, it uses 64MB of memory. So you are saying
> > this query would get to use all 64MB of memory, right?
> > 
> > But then you run ANALYZE. Now the query is (correctly) expected to use
> > 64MB of memory. Are you saying this query, executed again with better
> > stats, would only get to use 32MB of memory, and therefore run slower?
> 
> Yes. I think that's ok, because it was taken into account from a
> costing perspective in the second case.

What do you mean by "taken into account"? There are only two possible
paths: HashAgg and Sort+Group, and we need to pick one. If the planner
expects one to spill, it is likely to expect the other to spill; and if
one spills in the executor, the other is likely to spill, too. (I'm
ignoring the case with a lot of tuples and few groups because that
doesn't seem relevant here.)

Imagine that there were only one path available to choose. Would you
suggest the same thing: that unexpected spills can exceed work_mem, but
expected spills can't?
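
To make that asymmetry concrete, here is a minimal, self-contained
sketch of the policy as I understand it (the names "must_spill" and
"spill_expected" are hypothetical; this is not actual PostgreSQL code).
The same query, using the same 64MB, gets to stay in memory before
ANALYZE but is forced to spill after:

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical sketch: an overrun past work_mem is tolerated only when
 * the planner did NOT already cost the node as spilling.
 */
static bool
must_spill(long mem_used_kb, long work_mem_kb, bool spill_expected)
{
	if (mem_used_kb <= work_mem_kb)
		return false;		/* within work_mem: never spill */
	return spill_expected;	/* overrun: spill only if it was costed */
}

int
main(void)
{
	long	work_mem_kb = 32 * 1024;	/* work_mem = 32MB */
	long	actual_kb = 64 * 1024;		/* query really needs 64MB */

	/* Before ANALYZE: planner expected 16MB, so no spill was costed. */
	printf("stale stats -> spill? %d\n",
		   must_spill(actual_kb, work_mem_kb, false));	/* prints 0 */

	/* After ANALYZE: planner correctly expects 64MB and costs a spill. */
	printf("fresh stats -> spill? %d\n",
		   must_spill(actual_kb, work_mem_kb, true));	/* prints 1 */

	return 0;
}

Regards,
	Jeff Davis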