On Tue, Jul 09, 2019 at 03:37:03AM +0200, Tomas Vondra wrote:
... Notice that cost of the second plan is almost double the first one. That means 0004 does not even generate the first plan, i.e. there are cases where we don't try to add the explicit sort before passing the path to generate_gather_paths(). And I think I know why is that - while gather_grouping_paths() tries to add explicit sort below the gather merge, there are other places that call generate_gather_paths() that don't do that. In this case it's probably apply_scanjoin_target_to_paths() which simply builds parallel (seq|index) scan + gather merge and that's it. The problem is likely the same - the code does not know which pathkeys are "interesting" at that point. We probably need to teach planner to do this.
I've looked into this again, and yes - that's the reason. I've added generate_useful_gather_paths() which is a wrapper on top of generate_gather_paths(). It essentially does what 0001 patch did directly in generate_gather_paths() so it's more like create_grouping_paths(). And that does the trick - we now generate the cheaper paths, and I don't see any crashes in regression tests etc. I still suspect we already have code doing similar checks whether pathkeys might be useful somewhere. I've looked into pathkeys.c and elsewhere but no luck. Attached is a slightly modified patch series: 1) 0001 considers incremental sort paths in various places (adds the new generate_useful_gather_paths and modifies places calling create_sort_path) 2) 0002 and 0003 are fixes I mentioned before 3) 0004 adds a new GUC force_incremental_sort that (when set to 'on') tries to nudge the optimizer into using incremental sort by essentially making it free (i.e. using startup/total costs of the subpath). I've found this useful when trying to force incremental sorts into plans where it may not be the best strategy. I won't have time to hack on this over the next ~2 weeks, but I'll try to respond to questions when possible.
FWIW tweaking all the create_sort_path() places to also consider adding incremental sort is a bit tedious and invasive, and it almost doubles the amount of repetitive code. It's OK for experiment like this, but we should try handling this in a nicer way (move to a separate function that does both, or something like that).
This definitely needs more work. We need to refactor it in some way, e.g. have a function that would consider both explicit sort (on the cheapest path) and incremental sort (on all paths), and call it from all those places. Haven't tried it, though. There's also a couple more places where we do create_sort_path() and don't consider incremental sort yet - window functions, distinct etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From e7e97daf447a91e090809be4f07a5eee650eb5e7 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Tue, 9 Jul 2019 00:12:45 +0200 Subject: [PATCH 1/4] fix pathkey processing in generate_gather_paths --- src/backend/optimizer/path/allpaths.c | 365 +++++++++++++++++++++++- src/backend/optimizer/plan/createplan.c | 10 +- src/backend/optimizer/plan/planner.c | 301 ++++++++++++++++++- src/include/optimizer/paths.h | 2 + 4 files changed, 673 insertions(+), 5 deletions(-) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 3efc807164..acddbef064 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -556,7 +556,7 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, */ if (rel->reloptkind == RELOPT_BASEREL && bms_membership(root->all_baserels) != BMS_SINGLETON) - generate_gather_paths(root, rel, false); + generate_useful_gather_paths(root, rel, false); /* Now find the cheapest of the paths for this rel */ set_cheapest(rel); @@ -2730,6 +2730,367 @@ generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows) } } +/* + * Find an equivalence class member expression, all of whose Vars, come from + * the indicated relation. + */ +static Expr * +find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel) +{ + ListCell *lc_em; + + foreach(lc_em, ec->ec_members) + { + EquivalenceMember *em = lfirst(lc_em); + + if (bms_is_subset(em->em_relids, rel->relids) && + !bms_is_empty(em->em_relids)) + { + /* + * If there is more than one equivalence member whose Vars are + * taken entirely from this relation, we'll be content to choose + * any one of those. + */ + return em->em_expr; + } + } + + /* We didn't find any suitable equivalence class expression */ + return NULL; +} + +/* + * get_useful_ecs_for_relation + * Determine which EquivalenceClasses might be involved in useful + * orderings of this relation. + * + * This function is in some respects a mirror image of the core function + * pathkeys_useful_for_merging: for a regular table, we know what indexes + * we have and want to test whether any of them are useful. For a foreign + * table, we don't know what indexes are present on the remote side but + * want to speculate about which ones we'd like to use if they existed. + * + * This function returns a list of potentially-useful equivalence classes, + * but it does not guarantee that an EquivalenceMember exists which contains + * Vars only from the given relation. For example, given ft1 JOIN t1 ON + * ft1.x + t1.x = 0, this function will say that the equivalence class + * containing ft1.x + t1.x is potentially useful. Supposing ft1 is remote and + * t1 is local (or on a different server), it will turn out that no useful + * ORDER BY clause can be generated. It's not our job to figure that out + * here; we're only interested in identifying relevant ECs. + */ +static List * +get_useful_ecs_for_relation(PlannerInfo *root, RelOptInfo *rel) +{ + List *useful_eclass_list = NIL; + ListCell *lc; + Relids relids; + + /* + * First, consider whether any active EC is potentially useful for a merge + * join against this relation. + */ + if (rel->has_eclass_joins) + { + foreach(lc, root->eq_classes) + { + EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc); + + if (eclass_useful_for_merging(root, cur_ec, rel)) + useful_eclass_list = lappend(useful_eclass_list, cur_ec); + } + } + + /* + * Next, consider whether there are any non-EC derivable join clauses that + * are merge-joinable. If the joininfo list is empty, we can exit + * quickly. + */ + if (rel->joininfo == NIL) + return useful_eclass_list; + + /* If this is a child rel, we must use the topmost parent rel to search. */ + if (IS_OTHER_REL(rel)) + { + Assert(!bms_is_empty(rel->top_parent_relids)); + relids = rel->top_parent_relids; + } + else + relids = rel->relids; + + /* Check each join clause in turn. */ + foreach(lc, rel->joininfo) + { + RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(lc); + + /* Consider only mergejoinable clauses */ + if (restrictinfo->mergeopfamilies == NIL) + continue; + + /* Make sure we've got canonical ECs. */ + update_mergeclause_eclasses(root, restrictinfo); + + /* + * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee + * that left_ec and right_ec will be initialized, per comments in + * distribute_qual_to_rels. + * + * We want to identify which side of this merge-joinable clause + * contains columns from the relation produced by this RelOptInfo. We + * test for overlap, not containment, because there could be extra + * relations on either side. For example, suppose we've got something + * like ((A JOIN B ON A.x = B.x) JOIN C ON A.y = C.y) LEFT JOIN D ON + * A.y = D.y. The input rel might be the joinrel between A and B, and + * we'll consider the join clause A.y = D.y. relids contains a + * relation not involved in the join class (B) and the equivalence + * class for the left-hand side of the clause contains a relation not + * involved in the input rel (C). Despite the fact that we have only + * overlap and not containment in either direction, A.y is potentially + * useful as a sort column. + * + * Note that it's even possible that relids overlaps neither side of + * the join clause. For example, consider A LEFT JOIN B ON A.x = B.x + * AND A.x = 1. The clause A.x = 1 will appear in B's joininfo list, + * but overlaps neither side of B. In that case, we just skip this + * join clause, since it doesn't suggest a useful sort order for this + * relation. + */ + if (bms_overlap(relids, restrictinfo->right_ec->ec_relids)) + useful_eclass_list = list_append_unique_ptr(useful_eclass_list, + restrictinfo->right_ec); + else if (bms_overlap(relids, restrictinfo->left_ec->ec_relids)) + useful_eclass_list = list_append_unique_ptr(useful_eclass_list, + restrictinfo->left_ec); + } + + return useful_eclass_list; +} + +/* + * get_useful_pathkeys_for_relation + * Determine which orderings of a relation might be useful. + * + * Getting data in sorted order can be useful either because the requested + * order matches the final output ordering for the overall query we're + * planning, or because it enables an efficient merge join. Here, we try + * to figure out which pathkeys to consider. + */ +static List * +get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel) +{ + List *useful_pathkeys_list = NIL; + List *useful_eclass_list; + EquivalenceClass *query_ec = NULL; + ListCell *lc; + + /* + * Pushing the query_pathkeys to the remote server is always worth + * considering, because it might let us avoid a local sort. + */ + if (root->query_pathkeys) + { + bool query_pathkeys_ok = true; + + foreach(lc, root->query_pathkeys) + { + PathKey *pathkey = (PathKey *) lfirst(lc); + EquivalenceClass *pathkey_ec = pathkey->pk_eclass; + Expr *em_expr; + + /* + * The planner and executor don't have any clever strategy for + * taking data sorted by a prefix of the query's pathkeys and + * getting it to be sorted by all of those pathkeys. We'll just + * end up resorting the entire data set. So, unless we can push + * down all of the query pathkeys, forget it. + * + * is_foreign_expr would detect volatile expressions as well, but + * checking ec_has_volatile here saves some cycles. + */ + if (pathkey_ec->ec_has_volatile || + !(em_expr = find_em_expr_for_rel(pathkey_ec, rel))) + { + query_pathkeys_ok = false; + break; + } + } + + if (query_pathkeys_ok) + useful_pathkeys_list = list_make1(list_copy(root->query_pathkeys)); + } + + /* Get the list of interesting EquivalenceClasses. */ + useful_eclass_list = get_useful_ecs_for_relation(root, rel); + + /* Extract unique EC for query, if any, so we don't consider it again. */ + if (list_length(root->query_pathkeys) == 1) + { + PathKey *query_pathkey = linitial(root->query_pathkeys); + + query_ec = query_pathkey->pk_eclass; + } + + /* + * As a heuristic, the only pathkeys we consider here are those of length + * one. It's surely possible to consider more, but since each one we + * choose to consider will generate a round-trip to the remote side, we + * need to be a bit cautious here. It would sure be nice to have a local + * cache of information about remote index definitions... + */ + foreach(lc, useful_eclass_list) + { + EquivalenceClass *cur_ec = lfirst(lc); + Expr *em_expr; + PathKey *pathkey; + + /* If redundant with what we did above, skip it. */ + if (cur_ec == query_ec) + continue; + + /* If no pushable expression for this rel, skip it. */ + em_expr = find_em_expr_for_rel(cur_ec, rel); + if (em_expr == NULL) + continue; + + /* Looks like we can generate a pathkey, so let's do it. */ + pathkey = make_canonical_pathkey(root, cur_ec, + linitial_oid(cur_ec->ec_opfamilies), + BTLessStrategyNumber, + false); + useful_pathkeys_list = lappend(useful_pathkeys_list, + list_make1(pathkey)); + } + + return useful_pathkeys_list; +} + +/* + * generate_useful_gather_paths + * Generate parallel access paths for a relation by pushing a Gather or + * Gather Merge on top of a partial path. + * + * Unlike generate_gather_paths, this does not look just as pathkeys of the + * input paths (aiming to preserve the ordering). It also considers ordering + * that might be useful by nodes above the gather merge node, and tries to + * add a sort (regular or incremental) to provide that. + */ +void +generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows) +{ + ListCell *lc; + double rows; + double *rowsp = NULL; + List *useful_pathkeys_list = NIL; + Path *cheapest_partial_path = NULL; + + /* If there are no partial paths, there's nothing to do here. */ + if (rel->partial_pathlist == NIL) + return; + + /* Should we override the rel's rowcount estimate? */ + if (override_rows) + rowsp = &rows; + + /* generate the regular gather merge paths */ + generate_gather_paths(root, rel, override_rows); + + /* consider incremental sort for interesting orderings */ + useful_pathkeys_list = get_useful_pathkeys_for_relation(root, rel); + + /* used for explicit sort paths */ + cheapest_partial_path = linitial(rel->partial_pathlist); + + /* + * Consider incremental sort paths for each interesting ordering. + * + * XXX I wonder if we need to consider adding a projection here, as + * create_ordered_paths does. + */ + foreach(lc, useful_pathkeys_list) + { + List *useful_pathkeys = lfirst(lc); + ListCell *lc2; + bool is_sorted; + int presorted_keys; + + foreach(lc2, rel->partial_pathlist) + { + Path *subpath = (Path *) lfirst(lc2); + GatherMergePath *path; + + /* path has no ordering at all, can't use incremental sort */ + if (subpath->pathkeys == NIL) + continue; + + is_sorted = pathkeys_common_contained_in(useful_pathkeys, + subpath->pathkeys, + &presorted_keys); + + if (is_sorted) + { + path = create_gather_merge_path(root, rel, subpath, rel->reltarget, + subpath->pathkeys, NULL, rowsp); + + add_path(rel, &path->path); + continue; + } + + /* now we know is_sorted == false */ + + /* + * consider regular sort for cheapest partial path (for each + * useful pathkeys) + */ + if (cheapest_partial_path == subpath) + { + Path *tmp; + + tmp = (Path *) create_sort_path(root, + rel, + subpath, + useful_pathkeys, + -1.0); + + rows = tmp->rows * tmp->parallel_workers; + + path = create_gather_merge_path(root, rel, + tmp, + rel->reltarget, + tmp->pathkeys, + NULL, + rowsp); + + add_path(rel, &path->path); + + /* continue */ + } + + /* finally, consider incremental sort */ + if (presorted_keys > 0) + { + Path *tmp; + + /* Also consider incremental sort. */ + tmp = (Path *) create_incremental_sort_path(root, + rel, + subpath, + useful_pathkeys, + presorted_keys, + -1); + + path = create_gather_merge_path(root, rel, + tmp, + rel->reltarget, + tmp->pathkeys, + NULL, + rowsp); + + add_path(rel, &path->path); + } + } + } +} + /* * make_rel_from_joinlist * Build access paths using a "joinlist" to guide the join path search. @@ -2902,7 +3263,7 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels) * once we know the final targetlist (see grouping_planner). */ if (lev < levels_needed) - generate_gather_paths(root, rel, false); + generate_useful_gather_paths(root, rel, false); /* Find and save the cheapest paths for this rel */ set_cheapest(rel); diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index bfb52f21ab..c2877942cb 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -5932,7 +5932,10 @@ prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys, } } if (!j) - elog(ERROR, "could not find pathkey item to sort"); + { + elog(WARNING, "could not find pathkey item to sort"); + Assert(false); + } /* * Do we need to insert a Result node? @@ -6491,7 +6494,10 @@ make_unique_from_pathkeys(Plan *lefttree, List *pathkeys, int numCols) } if (!tle) - elog(ERROR, "could not find pathkey item to sort"); + { + elog(WARNING, "could not find pathkey item to sort"); + Assert(false); + } /* * Look up the correct equality operator from the PathKey's slightly diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 16996b1bc2..0939f2f7b9 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -5068,6 +5068,48 @@ create_ordered_paths(PlannerInfo *root, add_path(ordered_rel, path); } + + /* also consider incremental sorts on all partial paths */ + { + ListCell *lc; + foreach (lc, input_rel->partial_pathlist) + { + Path *input_path = (Path *) lfirst(lc); + Path *sorted_path = input_path; + bool is_sorted; + int presorted_keys; + + /* already handled above */ + if (input_path == cheapest_partial_path) + continue; + + is_sorted = pathkeys_common_contained_in(root->sort_pathkeys, + input_path->pathkeys, &presorted_keys); + + /* also ignore already sorted paths */ + if (is_sorted) + continue; + + if (presorted_keys > 0) + { + /* Also consider incremental sort. */ + sorted_path = (Path *) create_incremental_sort_path(root, + ordered_rel, + input_path, + root->sort_pathkeys, + presorted_keys, + limit_tuples); + + /* Add projection step if needed */ + if (sorted_path->pathtarget != target) + sorted_path = apply_projection_to_path(root, ordered_rel, + sorted_path, target); + + add_path(ordered_rel, sorted_path); + } + } + + } } /* @@ -6484,6 +6526,80 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, } } + + /* + * Use any available suitably-sorted path as input, with incremental + * sort path. + */ + foreach(lc, input_rel->pathlist) + { + Path *path = (Path *) lfirst(lc); + bool is_sorted; + int presorted_keys; + + is_sorted = pathkeys_common_contained_in(root->group_pathkeys, + path->pathkeys, + &presorted_keys); + + if (is_sorted) + continue; + + if (presorted_keys == 0) + continue; + + path = (Path *) create_incremental_sort_path(root, + grouped_rel, + path, + root->group_pathkeys, + presorted_keys, + -1.0); + + /* Now decide what to stick atop it */ + if (parse->groupingSets) + { + consider_groupingsets_paths(root, grouped_rel, + path, true, can_hash, + gd, agg_costs, dNumGroups); + } + else if (parse->hasAggs) + { + /* + * We have aggregation, possibly with plain GROUP BY. Make + * an AggPath. + */ + add_path(grouped_rel, (Path *) + create_agg_path(root, + grouped_rel, + path, + grouped_rel->reltarget, + parse->groupClause ? AGG_SORTED : AGG_PLAIN, + AGGSPLIT_SIMPLE, + parse->groupClause, + havingQual, + agg_costs, + dNumGroups)); + } + else if (parse->groupClause) + { + /* + * We have GROUP BY without aggregation or grouping sets. + * Make a GroupPath. + */ + add_path(grouped_rel, (Path *) + create_group_path(root, + grouped_rel, + path, + parse->groupClause, + havingQual, + dNumGroups)); + } + else + { + /* Other cases should have been handled above */ + Assert(false); + } + } + /* * Instead of operating directly on the input relation, we can * consider finalizing a partially aggregated path. @@ -6530,6 +6646,53 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, havingQual, dNumGroups)); } + + /* incremental sort */ + foreach(lc, partially_grouped_rel->pathlist) + { + Path *path = (Path *) lfirst(lc); + bool is_sorted; + int presorted_keys; + + is_sorted = pathkeys_common_contained_in(root->group_pathkeys, + path->pathkeys, + &presorted_keys); + + if (is_sorted) + continue; + + if (presorted_keys == 0) + continue; + + path = (Path *) create_incremental_sort_path(root, + grouped_rel, + path, + root->group_pathkeys, + presorted_keys, + -1.0); + + if (parse->hasAggs) + add_path(grouped_rel, (Path *) + create_agg_path(root, + grouped_rel, + path, + grouped_rel->reltarget, + parse->groupClause ? AGG_SORTED : AGG_PLAIN, + AGGSPLIT_FINAL_DESERIAL, + parse->groupClause, + havingQual, + agg_final_costs, + dNumGroups)); + else + add_path(grouped_rel, (Path *) + create_group_path(root, + grouped_rel, + path, + parse->groupClause, + havingQual, + dNumGroups)); + } + } } @@ -6798,6 +6961,57 @@ create_partial_grouping_paths(PlannerInfo *root, dNumPartialGroups)); } } + + /* + * Use any available suitably-sorted path as input, and also consider + * sorting the cheapest partial path. + */ + foreach(lc, input_rel->pathlist) + { + Path *path = (Path *) lfirst(lc); + bool is_sorted; + int presorted_keys; + + is_sorted = pathkeys_common_contained_in(root->group_pathkeys, + path->pathkeys, + &presorted_keys); + + /* also ignore already sorted paths */ + if (is_sorted) + continue; + + if (presorted_keys == 0) + continue; + + /* add incremental sort */ + path = (Path *) create_incremental_sort_path(root, + partially_grouped_rel, + path, + root->group_pathkeys, + presorted_keys, + -1.0); + + if (parse->hasAggs) + add_path(partially_grouped_rel, (Path *) + create_agg_path(root, + partially_grouped_rel, + path, + partially_grouped_rel->reltarget, + parse->groupClause ? AGG_SORTED : AGG_PLAIN, + AGGSPLIT_INITIAL_SERIAL, + parse->groupClause, + NIL, + agg_partial_costs, + dNumPartialGroups)); + else + add_path(partially_grouped_rel, (Path *) + create_group_path(root, + partially_grouped_rel, + path, + parse->groupClause, + NIL, + dNumPartialGroups)); + } } if (can_sort && cheapest_partial_path != NULL) @@ -6842,6 +7056,52 @@ create_partial_grouping_paths(PlannerInfo *root, dNumPartialPartialGroups)); } } + + /* consider incremental sort */ + foreach(lc, input_rel->partial_pathlist) + { + Path *path = (Path *) lfirst(lc); + bool is_sorted; + int presorted_keys; + + is_sorted = pathkeys_common_contained_in(root->group_pathkeys, + path->pathkeys, + &presorted_keys); + + if (is_sorted) + continue; + + if (presorted_keys == 0) + continue; + + path = (Path *) create_incremental_sort_path(root, + partially_grouped_rel, + path, + root->group_pathkeys, + presorted_keys, + -1.0); + + if (parse->hasAggs) + add_partial_path(partially_grouped_rel, (Path *) + create_agg_path(root, + partially_grouped_rel, + path, + partially_grouped_rel->reltarget, + parse->groupClause ? AGG_SORTED : AGG_PLAIN, + AGGSPLIT_INITIAL_SERIAL, + parse->groupClause, + NIL, + agg_partial_costs, + dNumPartialPartialGroups)); + else + add_partial_path(partially_grouped_rel, (Path *) + create_group_path(root, + partially_grouped_rel, + path, + parse->groupClause, + NIL, + dNumPartialPartialGroups)); + } } if (can_hash && cheapest_total_path != NULL) @@ -6938,6 +7198,7 @@ create_partial_grouping_paths(PlannerInfo *root, static void gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel) { + ListCell *lc; Path *cheapest_partial_path; /* Try Gather for unordered paths and Gather Merge for ordered ones. */ @@ -6967,6 +7228,44 @@ gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel) add_path(rel, path); } + + /* also consider incremental sort on all partial paths */ + foreach (lc, rel->partial_pathlist) + { + Path *path = (Path *) lfirst(lc); + bool is_sorted; + int presorted_keys; + double total_groups; + + is_sorted = pathkeys_common_contained_in(root->group_pathkeys, + path->pathkeys, + &presorted_keys); + + if (is_sorted) + continue; + + if (presorted_keys == 0) + continue; + + path = (Path *) create_incremental_sort_path(root, + rel, + path, + root->group_pathkeys, + presorted_keys, + -1.0); + + path = (Path *) + create_gather_merge_path(root, + rel, + path, + rel->reltarget, + root->group_pathkeys, + NULL, + &total_groups); + + add_path(rel, path); + } + } /* @@ -7222,7 +7521,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root, * one of the generated paths may turn out to be the cheapest one. */ if (rel->consider_parallel && !IS_OTHER_REL(rel)) - generate_gather_paths(root, rel, false); + generate_useful_gather_paths(root, rel, false); /* * Reassess which paths are the cheapest, now that we've potentially added diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index e7a40cec3f..20fa94281b 100644 --- a/src/include/optimizer/paths.h +++ b/src/include/optimizer/paths.h @@ -54,6 +54,8 @@ extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed, extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows); +extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, + bool override_rows); extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages, int max_workers); extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel, -- 2.21.0
>From 3a7bcb8940911029ba5927e9993a93f86139f7f4 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Tue, 9 Jul 2019 00:13:04 +0200 Subject: [PATCH 2/4] fix costing in cost_incremental_sort --- src/backend/optimizer/path/costsize.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 7f820e7351..c6aa17ba67 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -1875,16 +1875,8 @@ cost_incremental_sort(Path *path, limit_tuples); /* If we have a LIMIT, adjust the number of groups we'll have to return. */ - if (limit_tuples > 0 && limit_tuples < input_tuples) - { - output_tuples = limit_tuples; - output_groups = floor(output_tuples / group_tuples) + 1; - } - else - { - output_tuples = input_tuples; - output_groups = input_groups; - } + output_tuples = input_tuples; + output_groups = input_groups; /* * Startup cost of incremental sort is the startup cost of its first group -- 2.21.0
>From 6a439153321e40777d791b70424424a501485408 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Tue, 9 Jul 2019 00:13:34 +0200 Subject: [PATCH 3/4] fix explain in parallel mode --- src/backend/commands/explain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index d3f855a12a..925e8236ba 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -2775,7 +2775,7 @@ show_incremental_sort_info(IncrementalSortState *incrsortstate, fullsort_spaceUsed, fullsort_group_count); if (prefixsort_instrument) appendStringInfo(es->str, - ", Prefix Sort Method: %s %s: %ldkB Groups: %ld", + ", Prefix Sort Method: %s %s: %ldkB Groups: %ld\n", prefixsort_sortMethod, prefixsort_spaceType, prefixsort_spaceUsed, prefixsort_group_count); else -- 2.21.0
>From 9dc5530429083195e90129b3eb18f8d7a5f78451 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Tue, 9 Jul 2019 13:36:42 +0200 Subject: [PATCH 4/4] add force_incremental_sort GUC --- src/backend/optimizer/path/costsize.c | 8 ++++++++ src/backend/utils/misc/guc.c | 10 ++++++++++ src/include/optimizer/cost.h | 1 + 3 files changed, 19 insertions(+) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index c6aa17ba67..ee4487b158 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -140,6 +140,8 @@ bool enable_parallel_append = true; bool enable_parallel_hash = true; bool enable_partition_pruning = true; +bool force_incremental_sort = true; + typedef struct { PlannerInfo *root; @@ -1907,6 +1909,12 @@ cost_incremental_sort(Path *path, path->rows = input_tuples; path->startup_cost = startup_cost; path->total_cost = startup_cost + run_cost; + + if (force_incremental_sort) + { + path->startup_cost = input_startup_cost; + path->total_cost = input_total_cost; + } } /* diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index f922ee66a0..d2cc5b56b5 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -941,6 +941,16 @@ static struct config_bool ConfigureNamesBool[] = true, NULL, NULL, NULL }, + { + {"force_incremental_sort", PGC_USERSET, QUERY_TUNING_METHOD, + gettext_noop("Makes the incremental sort look like no-cost."), + NULL, + GUC_EXPLAIN + }, + &force_incremental_sort, + false, + NULL, NULL, NULL + }, { {"enable_incrementalsort", PGC_USERSET, QUERY_TUNING_METHOD, gettext_noop("Enables the planner's use of incremental sort steps."), diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h index b9d7a77e65..bad1d5a330 100644 --- a/src/include/optimizer/cost.h +++ b/src/include/optimizer/cost.h @@ -66,6 +66,7 @@ extern PGDLLIMPORT bool enable_parallel_append; extern PGDLLIMPORT bool enable_parallel_hash; extern PGDLLIMPORT bool enable_partition_pruning; extern PGDLLIMPORT int constraint_exclusion; +extern PGDLLIMPORT bool force_incremental_sort; extern double index_pages_fetched(double tuples_fetched, BlockNumber pages, double index_pages, PlannerInfo *root); -- 2.21.0