On Thu, 2020-04-09 at 15:26 -0400, Robert Haas wrote:
> I think it's actually pretty different. All of the other enable_* GUCs
> disable an entire type of plan node, except for cases where that would
> otherwise result in planning failure. This just disables a portion of
> the planning logic for a certain kind of node, without actually
> disabling the whole node type. I'm not sure that's a bad idea, but it
> definitely seems to be inconsistent with what we've done in the past.

The patch adds two GUCs. Both are slightly weird, to be honest, but let
me explain the reasoning. I am open to other suggestions.

1. enable_hashagg_disk (default true):

This is essentially there just to get some of the old behavior back, to
give people an escape hatch if they see bad plans while we are tweaking
the costing. The old behavior was weird, so this GUC is also weird.

Perhaps we can make this a compatibility GUC that we eventually drop? I
don't necessarily think this GUC would make sense, say, 5 versions from
now. I'm just trying to be conservative because I know that, even if
the plans are faster for 90% of people, the other 10% will be unhappy
and want a way to work around it.

2. enable_groupingsets_hash_disk (default false):

This is about how we choose which grouping sets to hash and which to
sort when generating mixed mode paths.

Even before this patch, there are quite a few paths that could be
generated. The planner tries to estimate the size of each grouping
set's hash table, and then sees how many it can fit in work_mem
(knapsack), while also taking advantage of any path keys, etc.

With Disk-based Hash Aggregation, in principle we can generate paths
representing any combination of hashing and sorting for the grouping
sets. But that would be overkill (and grow to a huge number of paths if
we have more than a handful of grouping sets). So I think the existing
planner logic for grouping sets is fine for now. We might come up with
a better approach later.

But that created a testing problem, because if the planner estimates
correctly, no hashed grouping sets will spill, and the spilling code
won't be exercised. This GUC makes the planner disregard which grouping
sets' hash tables will fit, making it much easier to exercise the
spilling code. Is there a better way I should be testing this code
path?

Regards,
	Jeff Davis
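
For illustration only, the grouping-set selection described under item 2
can be sketched roughly as follows. This is not the planner's code: the
function name, the parameter names, and the ignore_fit flag (which stands
in for the effect of enable_groupingsets_hash_disk) are hypothetical, and
treating every hashed grouping set as equally valuable is an assumption
of this sketch.

```python
from typing import Dict, Set


def choose_hashed_sets(est_size_kb: Dict[str, int],
                       work_mem_kb: int,
                       ignore_fit: bool = False) -> Set[str]:
    """Pick which grouping sets to hash; the remaining ones would be sorted.

    est_size_kb maps a grouping set's label to its estimated hash table
    size in kB (hypothetical inputs, not real planner estimates).
    """
    if ignore_fit:
        # Assumed analogue of the testing GUC: disregard whether the hash
        # tables fit, so some of them can be expected to spill.
        return set(est_size_kb)

    # 0/1 knapsack over memory: best[c] holds (count, chosen labels) for
    # the largest number of hash tables that fit in at most c kB.
    best = [(0, frozenset())] * (work_mem_kb + 1)
    for name, size in est_size_kb.items():
        # Walk capacities downwards so each grouping set is used at most once.
        for c in range(work_mem_kb, size - 1, -1):
            candidate_count = best[c - size][0] + 1
            if candidate_count > best[c][0]:
                best[c] = (candidate_count, best[c - size][1] | {name})
    return set(best[work_mem_kb][1])


if __name__ == "__main__":
    sizes = {"(a)": 1000, "(b)": 3000, "(a,b)": 2500}
    print(sorted(choose_hashed_sets(sizes, work_mem_kb=4096)))
    print(sorted(choose_hashed_sets(sizes, work_mem_kb=4096, ignore_fit=True)))
```

With ignore_fit left off and accurate estimates, the chosen hash tables
fit in memory together, so nothing spills. That is the testing problem
described above.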