On Sat, 11 Jul 2020 at 14:02, Peter Geoghegan <p...@bowt.ie> wrote: > > On Fri, Jul 10, 2020 at 6:19 PM David Rowley <dgrowle...@gmail.com> wrote: > > If we have to have a new GUC, my preference would be hashagg_mem, > > where -1 means use work_mem and a value between 64 and MAX_KILOBYTES > > would mean use that value. We'd need some sort of check hook to > > disallow 0-63. I really am just failing to comprehend why we're > > contemplating changing the behaviour of Hash Join here. > > I don't understand why parititonwise hash join consumes work_mem in > the way it does. I assume that the reason is something like "because > that behavior was the easiest to explain", or perhaps "people that use > partitioning ought to be able to tune their database well". Or even > "this design avoids an epic pgsql-hackers thread, because of course > every hash table should get its own work_mem".
hmm yeah. It's unfortunate, but I'm not sure how I'd have implemented it differently. The problem is made worse by the fact that we'll only release the memory for the hash table during ExecEndHashJoin(). If the planner had some ability to provide the executor with knowledge that the node would never be rescanned, then the executor could release the memory for the hash table after the join is complete. For now, we'll need to live with the fact that an Append containing many children doing hash joins will mean holding onto all that memory until the executor is shutdown :-( There's room to make improvements there, for sure, but not for PG13. > > Of course, I > > understand that that node type also uses a hash table, but why does > > that give it the right to be involved in a change that we're making to > > try and give users the ability to avoid possible regressions with Hash > > Agg? > > It doesn't, exactly. The idea of hash_mem came from similar settings > in another database system that you'll have heard of, that affect all > nodes that use a hash table. I read about this long ago, and thought > that it might make sense to do something similar as a way to improving > work_mem It sounds interesting, but it also sounds like a new feature post-beta. Perhaps it's better we minimise the scope of the change to be a minimal fix just for the behaviour we predict some users might not like. David