On Tue, 7 Jul 2020 at 16:57, Jeff Davis <pg...@j-davis.com> wrote: > > On Sun, 2020-07-05 at 16:47 -0700, Peter Geoghegan wrote: > > Where does that leave the hash_mem idea (or some other similar > > proposal)? > > hash_mem is acceptable to me if the consensus is moving toward that, > but I'm not excited about it.
FWIW, I'm not a fan of the hash_mem idea. It was my impression that we aimed to provide an escape hatch for people we have become accustomed to <= PG12 behaviour and hash_mem sounds like it's not that. Surely a GUC by that name would control what Hash Join does too? Otherwise, it would be called hashagg_mem. I'd say changing the behaviour of Hash join is not well aligned to the goal of allowing users to get something closer to what PG12 did. I know there has been talk over the years to improve how work_mem works. I see Tomas mentioned memory grants on the other thread [1]. I do imagine this is the long term solution to the problem where users must choose very conservative values for work_mem. We're certainly not going to get that for PG13, so I do think what we need here is just a simple escape hatch. I mentioned my thoughts in [2], so won't go over it again here. Once we've improved the situation in some future version of postgres, perhaps along the lines of what Tomas mentioned, then we can get rid of the escape hatch. Here are my reasons for not liking the hash_mem idea: 1. if it also increases the amount of memory that Hash Join can use then that makes the partition-wise hash join problem of hash_mem * npartitions even bigger when users choose to set hash_mem higher than work_mem to get Hash Agg doing what they're used to. 2. Someone will one day ask for sort_mem and then materialize_mem. Maybe then cte_mem. Once those are done we might as well just add a GUC to control every executor node that uses work_mem. 3. I'm working on a Result cache node [3]. It uses a hash table internally. Should it constraint its memory consumption according to hash_mem or work_mem? It's not really that obvious to people that it internally uses a hash table. "Hash" does not appear in the node name. Do people need to look that up in the documents? David [1] https://www.postgresql.org/message-id/20200626235850.gvl3lpfyeobu4evi@development [2] https://www.postgresql.org/message-id/CAApHDvqFZikXhAGW=ukzkq1_fzhy+xzmuzajinj6rwythh4...@mail.gmail.com [3] https://www.postgresql.org/message-id/caaphdvrpcqyqdwergywx8j+2dlungxu+fosbq1uscxrunyx...@mail.gmail.com