Hi, This patch is a WIP fix for the issue described in [1], where the planner picks a more expensive plan with partition-wise joins enabled, and disabling this option produces a cheaper plan. That's a bit strange because with the option disabled we consider *fewer* plans, so we should not be able to generate a cheaper plan.
The problem lies in generate_orderedappend_paths(), which builds two types of append paths - with minimal startup cost, and with minimal total cost. That however does not work for queries with LIMIT, which also need to consider at fractional cost, but the path interesting from this perspective may be eliminated by other paths. Consider three paths (this comes from the reproducer shared in [1]): A: nestloop_path startup 0.585000 total 35708.292500 B: nestloop_path startup 0.292500 total 150004297.292500 C: mergejoin_path startup 9748.112737 total 14102.092737 With some reasonable LIMIT value (e.g. 5% of the data), we really want to pick path A. But that path is dominated both in startup cost (by B) and total cost (path C). Hence generate_orderedappend_paths() will ignore it, and we'll generate a more expensive LIMIT plan. In [2] Tom proposed to modify generate_orderedappend_paths() to also consider the fractional cheapest_path, just like we do for startup and total costs. The attached patch does exactly that, and it does seem to do the trick. There are some loose ends, though: 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not clear whether to default to cheapest_startup or cheapest_total. We might also consider an incremental sort, in which case the startup cost matters (for Sort it does not). So we may need an enhanced version of get_cheapest_fractional_path_for_pathkeys that generates such paths. 2) Same for the cheapest_total - maybe there's a partially sorted path, and using it with incremental sort on top would be better than using cheapest_total_path + sort. 3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry about require_parallel_safe, just like the other functions nearby. I'd argue that this patch does not need to add the Incremental Sort capabilities in (1) and (2) - it's just another place where we decided not to consider this sort variant yet. I'm not sure how much this depends on partition-wise join, and why disabling it generates the right plan. The reproducer uses that, but AFAICS generate_orderedappend_paths() works like this since 2010 (commit 11cad29c915). I'd bet the issue exists since then and it's possible to get similar cases even without partition-wise join. I can reproduce it on PostgreSQL 12, though (which however supports partition-wise join). Not sure whether this should be backpatched. We didn't get any reports until now, so it doesn't seem like a pressing issue. OTOH most users won't actually notice this, they'll just get worse plans without realizing there's a better option. regards [1] https://www.postgresql.org/message-id/011937a3-7427-b99f-13f1-c07a127cf94c%40enterprisedb.com [2] https://www.postgresql.org/message-id/4006636.1618577893%40sss.pgh.pa.us -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index edba5e49a8..0284162034 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -1716,6 +1716,7 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, List *pathkeys = (List *) lfirst(lcp); List *startup_subpaths = NIL; List *total_subpaths = NIL; + List *fractional_subpaths = NIL; bool startup_neq_total = false; ListCell *lcr; bool match_partition_order; @@ -1745,7 +1746,8 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, { RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr); Path *cheapest_startup, - *cheapest_total; + *cheapest_total, + *cheapest_fractional = NULL; /* Locate the right paths, if they are available. */ cheapest_startup = @@ -1761,6 +1763,21 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, TOTAL_COST, false); + /* + * XXX strange that get_cheapest_fractional_path_for_pathkeys + * does not have require_parallel_safe. + */ + if (root->tuple_fraction > 0) + { + double path_fraction = 1.0 / root->tuple_fraction; + + cheapest_fractional = + get_cheapest_fractional_path_for_pathkeys(childrel->pathlist, + pathkeys, + NULL, + path_fraction); + } + /* * If we can't find any paths with the right order just use the * cheapest-total path; we'll have to sort it later. @@ -1773,6 +1790,18 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, Assert(cheapest_total->param_info == NULL); } + /* + * XXX Do we need to do something about cheapest_fractional? + * It could be NULL if there are no properly sorted paths, + * but then maybe just doing the sort is good enough. + * + * XXX Well, maybe we should not just grab cheapest_total_path + * here, because we might use incremental sort on a path that + * is not fully sorted. + */ + if ((root->tuple_fraction > 0) && !cheapest_fractional) + cheapest_fractional = cheapest_total; + /* * Notice whether we actually have different paths for the * "cheapest" and "total" cases; frequently there will be no point @@ -1799,6 +1828,12 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, startup_subpaths = lappend(startup_subpaths, cheapest_startup); total_subpaths = lappend(total_subpaths, cheapest_total); + + if (cheapest_fractional) + { + cheapest_fractional = get_singleton_append_subpath(cheapest_fractional); + fractional_subpaths = lappend(fractional_subpaths, cheapest_fractional); + } } else if (match_partition_order_desc) { @@ -1812,6 +1847,12 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, startup_subpaths = lcons(cheapest_startup, startup_subpaths); total_subpaths = lcons(cheapest_total, total_subpaths); + + if (cheapest_fractional) + { + cheapest_fractional = get_singleton_append_subpath(cheapest_fractional); + fractional_subpaths = lcons(cheapest_fractional, fractional_subpaths); + } } else { @@ -1823,6 +1864,10 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, &startup_subpaths, NULL); accumulate_append_subpath(cheapest_total, &total_subpaths, NULL); + + if (cheapest_fractional) + accumulate_append_subpath(cheapest_fractional, + &fractional_subpaths, NULL); } } @@ -1849,6 +1894,18 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, 0, false, -1)); + + /* XXX maybe we should have startup_new_fractional? */ + if (fractional_subpaths) + add_path(rel, (Path *) create_append_path(root, + rel, + fractional_subpaths, + NIL, + pathkeys, + NULL, + 0, + false, + -1)); } else { @@ -1864,6 +1921,14 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel, total_subpaths, pathkeys, NULL)); + + /* XXX maybe we should have startup_new_fractional? */ + if (fractional_subpaths) + add_path(rel, (Path *) create_merge_append_path(root, + rel, + fractional_subpaths, + pathkeys, + NULL)); } } }
optimizer_first_rows_simple.sql
Description: application/sql