On Mon, May 6, 2024 at 4:30 PM Peter Geoghegan <p...@bowt.ie> wrote: > FWIW I always found those weird inconsistencies to be annoying at > best, and confusing at worst. I speak as somebody that uses > disable_cost a lot. > > I certainly wouldn't ask anybody to make it a priority for that reason > alone -- it's not *that* bad. I've given my opinion on this because > it's already under discussion.
Thanks, it's good to have other perspectives. Here are some patches for discussion. 0001 gets rid of disable_cost as a mechanism for forcing a TID scan plan to be chosen when CurrentOfExpr is present. Instead, it arranges to generate only the valid path when that case occurs, and skip everything else. I think this is a good cleanup, and it doesn't seem totally impossible that it actually prevents a failure in some extreme case. 0002 cleans up the behavior of enable_indexscan and enable_indexonlyscan. Currently, setting enable_indexscan=false adds disable_cost to both the cost of index scans and the cost of index-only scans. I think that's indefensible and, in fact, a bug, although I believe David Rowley disagrees. With this patch, we simply don't generate index scans if enable_indexscan=false, and we don't generate index-only scans if enable_indexonlyscan=false, which seems a lot more consistent to me. However, I did revise one major thing from the patch I posted before, per feedback from David Rowley and also per my own observations: in this version, if enable_indexscan=true and enable_indexonlyscan=false, we'll generate index-scan paths for any cases where, with both set to true, we would have only generated index-only scan paths. That change makes the behavior of this patch a lot more comprehensible and intuitive: the only regression test changes are places where somebody expected that they could disable both index scans and index-only scans by setting enable_indexscan=false. 0003 and 0004 extend the approach of "just don't generate the disabled path" to bitmap scans and gather merge, respectively. I think these are more debatable, mostly because it's not clear how far we can really take this approach. Neither breaks any test cases, and 0003 is closely related to the work done in 0002, which seems like a point in its favor. 0004 was simply the only other case where it was obvious to me that this kind of approach made sense. In my view, it makes most sense to use this kind of approach for planner behaviors that seem like they're sort of optional: like if you don't use gather merge, you can still use gather, and if you don't use index scans, you can still use sequential scans. With all these patches applied, the remaining cases where we rely on disable_cost are: sequential scans sorts hash aggregation all 3 join types hash joins where a bucket holding the inner MCV would exceed hash_mem Sequential scans are clearly a last-ditch method. I find it a bit hard to decide whether hashing or sorting is the default, especially giving the asymmetry between enable_sort - presumptively anywhere - and enable_hashagg - specific to aggregation. As for the join types, it's tempting to consider nested-loop the default type -- it's the only way to satisfy parameterizations, for instance -- but the fact that it's the only method that can't do a full join undermines that position in my book. But, I don't want to pretend like I have all the answers here, either; I'm just sharing some thoughts. -- Robert Haas EDB: http://www.enterprisedb.com
0004-When-enable_gathermerge-false-don-t-generate-gather-.patch
Description: Binary data
0002-Don-t-generate-index-scan-paths-when-enable_indexsca.patch
Description: Binary data
0003-When-enable_bitmapscan-false-just-don-t-generate-bit.patch
Description: Binary data
0001-Remove-grotty-use-of-disable_cost-for-TID-scan-plans.patch
Description: Binary data