On Fri, Jul 10, 2020 at 10:00 PM David Rowley <dgrowle...@gmail.com> wrote: > hmm yeah. It's unfortunate, but I'm not sure how I'd have implemented > it differently. The problem is made worse by the fact that we'll only > release the memory for the hash table during ExecEndHashJoin(). If the > planner had some ability to provide the executor with knowledge that > the node would never be rescanned, then the executor could release the > memory for the hash table after the join is complete. For now, we'll > need to live with the fact that an Append containing many children > doing hash joins will mean holding onto all that memory until the > executor is shutdown :-( > > There's room to make improvements there, for sure, but not for PG13.
I think that we're stuck with the idea that partitionwise join uses up to one work_mem allocation per partition until we deprecate work_mem as a concept. Anyway, I only talked about partitionwise join because that was your example. I could just as easily have picked on parallel hash join instead, which is something that I was involved in myself (kind of). This is more or less a negative consequence of the incremental approach we have taken here, which is a collective failure. I have difficulty accepting that something like hash_mem_multiplier cannot be accepted because it risks making the consequence of questionable designs even worse. The problem remains that the original assumption just isn't very robust, and isn't something that the user has granular control over. In general it makes sense that a design in a stable branch is assumed to be the norm that new things need to respect, and not the other way around. But there must be some limit to how far that's taken. > It sounds interesting, but it also sounds like a new feature > post-beta. Perhaps it's better we minimise the scope of the change to > be a minimal fix just for the behaviour we predict some users might > not like. That's an understandable interpretation of the hash_mem/hash_mem_multiplier proposal on the table, and yet one that I disagree with. I consider it highly desirable to have a GUC that can be tuned in a generic and high level way, on general principle. We don't really do escape hatches, and I'd rather avoid adding one now (though it's far preferable to doing nothing, which I consider totally out of the question). Pursuing what you called hashagg_mem is a compromise that will make neither of us happy. It seems like an escape hatch by another name. I would rather just go with your original proposal instead, especially if that's the only thing that'll resolve the problem in front of us. -- Peter Geoghegan