On Mon, Jul 13, 2020 at 2:50 PM Peter Geoghegan <p...@bowt.ie> wrote:
> Primarily in favor of escape hatch:
>
> Jeff,
> DavidR,
> Pavel,
> Andres,
> Robert ??,
> Amit ??
>
> Primarily in favor of hash_mem/hash_mem_multiplier:
>
> PeterG,
> Tom,
> Alvaro,
> Tomas,
> Justin,
> DavidG,
> Jonathan Katz
>
> There are clear problems with this summary, including for example the
> fact that Robert weighed in before the hash_mem/hash_mem_multiplier
> proposal was even on the table. What he actually said about it [1]
> seems closer to hash_mem, so I feel that putting him in that bucket is
> a conservative assumption on my part. Same goes for Amit, who warmed
> to the idea of hash_mem_multiplier recently. (Though I probably got
> some detail wrong, in which case please correct me.)
My view is:

- I thought the problem we were trying to solve here was that, in v12, if the planner thinks that your hashagg will fit in memory when really it doesn't, you will get good performance because we'll cheat; in v13, you'll get VERY bad performance because we won't.

- So, if hash_mem_multiplier affects both planning and execution, it doesn't really solve the problem. Neither does adjusting the existing work_mem setting. Imagine that you have two queries. The planner thinks Q1 will use 1GB of memory for a HashAgg but it will actually need 2GB. It thinks Q2 will use 1.5GB for a HashAgg but it will actually need 3GB. If you plan using a 1GB memory limit, Q1 will pick a HashAgg and perform terribly when it spills. Q2 will pick a GroupAggregate which will be OK but not great. If you plan with a 2GB memory limit, Q1 will pick a HashAgg and will not spill, so now it will be in great shape. But Q2 will pick a HashAgg and then spill, so it will stink. Oops. (The sketch at the end of this mail plays this arithmetic out under each of the proposals.)

- An escape hatch that prevents spilling at execution time *does* solve this problem, but now suppose we add a Q3 which the planner thinks will use 512MB of memory but which at execution time will actually consume 512GB, due to the row count estimate being 1024x off. So if you enable the escape hatch to get back to a situation where Q1 and Q2 both perform acceptably, then Q3 makes your system OOM.

- If you were to instead introduce a GUC like what I proposed before, which allows the execution-time memory usage to exceed what was planned, but only by a certain margin, then you can set hash_mem_execution_overrun_multiplier_thingy=2.5 and call it a day. Now, no matter how you set work_mem, you're fine. Depending on the value you choose for work_mem, you may get group aggregates for some of the queries. If you set it large enough that you get hash aggregates, then Q1 and Q2 will avoid spilling (which works but is slow) because the overrun is less than 2x. Q3 will spill, so you won't OOM. Wahoo!

- I do agree in general that it makes more sense to allow hash_work_mem > sort_work_mem, and even to make that the default. Insisting on the same budget for both is unreasonable, because I think we have good evidence that inadequate memory has a severe impact on hashing operations but usually only a fairly mild effect on sorting operations, except in the case where the underrun is severe. That is, if you need 1GB of memory for a sort and you only get 768MB, the slowdown is much, much less severe than if the same thing happens for a hash. If you have 10MB of memory, both are going to suck, but that's kinda unavoidable.

- If you hold my feet to the fire and ask me to choose between a Boolean escape hatch (rather than a multiplier-based one) and hash_mem_multiplier, gosh, I don't know. I guess the Boolean escape hatch? I mean, it's a pretty bad solution, but at least if I have that I can get both Q1 and Q2 to perform well at the same time, and I guess I'm no worse off than I was in v12. The hash_mem_multiplier thing, assuming it affects both planning and execution, seems like a very good idea in general, but I guess I don't see how it helps with this problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
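A minimal sketch of the Q1/Q2/Q3 arithmetic above, assuming a made-up
physical-RAM figure and modelling each proposal simply as a (plan-time
limit, execution-time limit) pair -- this is not PostgreSQL code, just the
scenario from the mail played out under the three options:

# Editor's illustration only -- not PostgreSQL code. The RAM figure and
# the way each policy is modelled are assumptions; the per-query numbers
# are the ones used in the mail above.

GB = 1024  # work in MB for readability

# (planner's memory estimate, actual memory needed) in MB for each query
queries = {
    "Q1": (1 * GB, 2 * GB),
    "Q2": (1.5 * GB, 3 * GB),
    "Q3": (0.5 * GB, 512 * GB),  # row count estimate off by 1024x
}

RAM = 64 * GB  # assumed physical memory on the box (made-up number)

def outcome(plan_limit_mb, exec_limit_mb):
    """What happens to each query, given the memory limit the planner uses
    to choose a plan and the limit the executor enforces before spilling."""
    results = {}
    for name, (planned, actual) in queries.items():
        if planned > plan_limit_mb:
            results[name] = "GroupAggregate (OK, not great)"
        elif actual > exec_limit_mb:
            results[name] = "HashAgg, spills to disk (works, but slow)"
        elif actual > RAM:
            results[name] = "HashAgg, never spills, exceeds RAM -> OOM"
        else:
            results[name] = "HashAgg, stays in memory (great)"
    return results

budget = 2 * GB  # a budget large enough that both Q1 and Q2 get HashAggs

# One knob for both planning and execution (work_mem, or hash_mem_multiplier
# if it affects both): whichever budget you pick, either Q1 or Q2 suffers.
print(outcome(plan_limit_mb=1 * GB, exec_limit_mb=1 * GB))
print(outcome(plan_limit_mb=budget, exec_limit_mb=budget))

# Boolean escape hatch: never spill at execution time. Q1 and Q2 are fine,
# but Q3's 1024x misestimate takes the box down.
print(outcome(plan_limit_mb=budget, exec_limit_mb=float("inf")))

# Bounded-overrun GUC (the "multiplier_thingy" idea), here 2.5x the planned
# budget: Q1 and Q2 overrun by well under 2.5x, so they stay in memory;
# Q3 overruns massively, so it spills instead of OOMing.
print(outcome(plan_limit_mb=budget, exec_limit_mb=2.5 * budget))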