The upper planner was pathified many years ago now. That was a large chunk of work and because of that, the union planner was not properly pathified in that effort. A small note was left in recurse_set_operations() to mention about future work.
You can see this lack of pathification in make_union_unique() and choose_hashed_setop(). There are heuristics in there to decide the method to use instead of creating paths and letting add_path() decide what's faster. I've been working on improving this for the past few weeks and I'm not quite as far along as I'd hoped, but what I have is perhaps worthy of sharing. For now, I've only improved UNIONs. A UNION plan can now look like: # explain (costs off) select * from a union select * from a; QUERY PLAN --------------------------------------------------- Unique -> Merge Append Sort Key: a.a -> Index Only Scan using a_pkey on a -> Index Only Scan using a_pkey on a a_1 Previously we'd have only considered Append -> Hash Aggregate or via Append -> Sort -> Unique To make this work, the query_planner() needs to know about setops, so I've passed those down via the standard_qp_extra struct so that we can choose pathkeys for the setops. One part that still needs work is the EquivalanceClass building. Because we only build the final targetlist for the Append after planning all the append child queries, I ended up having to populate the EquivalanceClasses backwards, i.e children first. add_eq_member() determines if you're passing a child member by checking if parent != NULL. Since I don't have a parent EquivalenceMember to pass, em_is_child gets set wrongly, and that causes problems because ec_has_const can get set to true when it shouldn't. This is a problem as it can make a PathKey redundant when it isn't. I wonder if I'll need to change the signature of add_eq_member() and add an "is_child" bool to force the EM to be a child em... Needs more thought... I've not worked on the creation of Incremental Sort paths yet, or done any path plumbing work to have UNION consider Gather Merge -> Unique on already sorted paths. I think to make similar improvements to EXCEPT and INTERSECT we'd need a node executor node. Perhaps nodeMergeAppendSetops.c which can be configured to do EXCEPT or INTERSECT. It could also perhaps handle UNION too then we can use that instead of a Merge Append -> Unique. That might save doing some slot copying and improve performance further. I'm not planning on doing that for the first stage. I only intend to improve UNION for that and we have all the executor nodes to make that work already. Anyway, I've attached my WIP patch for this. David
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 7fa502d6e2..54ddcf8285 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -2599,6 +2599,44 @@ find_derived_clause_for_ec_member(EquivalenceClass *ec, return NULL; } +/* + * add_union_rel_equivalences + */ +void +add_union_rel_equivalences(PlannerInfo *root, Relids relids, + List *tlist, List *union_pathkeys) +{ + ListCell *lc; + ListCell *lc2 = list_head(union_pathkeys); + + foreach(lc, tlist) + { + TargetEntry *tle = lfirst_node(TargetEntry, lc); + PathKey *pk; + + if (tle->resjunk) + continue; + + if (lc2 == NULL) + elog(ERROR, "wrong number of union pathkeys"); + + pk = lfirst_node(PathKey, lc2); + + /* + * XXX this needs fixed. The children are added before the parents, + * so we cannot identify the parent. + */ + add_eq_member(pk->pk_eclass, + tle->expr, + relids, + NULL, + NULL, + exprType((Node *) tle->expr)); + pk->pk_eclass->ec_has_const = false; /* XXX hack hack */ + + lc2 = lnext(union_pathkeys, lc2); + } +} /* * add_child_rel_equivalences diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index fdb60aaa8d..00f2071cf0 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -786,6 +786,59 @@ build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel, return retval; } +/* + * build_union_child_pathkeys + */ +List * +build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt *op, + RelOptInfo *rel, List *tlist) +{ + ListCell *lc; + ListCell *sgccell = list_head(op->groupClauses); + List *retval = NIL; + + foreach(lc, tlist) + { + TargetEntry *tle = lfirst_node(TargetEntry, lc); + SortGroupClause *sgc; + + Oid opfamily; + Oid opcintype; + int16 strategy; + PathKey *cpathkey; + + if (tle->resjunk) + continue; + + if (sgccell == NULL) + elog(ERROR, "wrong number of group clauses"); /* XXX write better + * error */ + + sgc = lfirst_node(SortGroupClause, sgccell); + + /* Find the operator in pg_amop --- failure shouldn't happen */ + if (!get_ordering_op_properties(sgc->sortop, + &opfamily, &opcintype, &strategy)) + elog(ERROR, "operator %u is not a valid ordering operator", + sgc->eqop); + + cpathkey = make_pathkey_from_sortinfo(root, + tle->expr, + opfamily, + opcintype, + exprCollation((Node *) tle->expr), + false, + sgc->nulls_first, + 0, + rel->relids, + true); + retval = lappend(retval, cpathkey); + sgccell = lnext(op->groupClauses, sgccell); + } + + return retval; +} + /* * build_expression_pathkey * Build a pathkeys list that describes an ordering by a single expression diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index a8cea5efe1..c5bb4e6c60 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -126,6 +126,8 @@ typedef struct { List *activeWindows; /* active windows, if any */ grouping_sets_data *gset_data; /* grouping sets data, if any */ + SetOperationStmt *setops; /* set operation stmt details or NULL if not a + * set operation */ } standard_qp_extra; /* Local functions */ @@ -1504,6 +1506,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction) /* Set up data needed by standard_qp_callback */ qp_extra.activeWindows = activeWindows; qp_extra.gset_data = gset_data; + if (root->parent_root != NULL && + root->parent_root->parse->setOperations != NULL && + IsA(root->parent_root->parse->setOperations, SetOperationStmt)) + qp_extra.setops = + (SetOperationStmt *) root->parent_root->parse->setOperations; + else + qp_extra.setops = NULL; /* * Generate the best unsorted and presorted paths for the scan/join @@ -3551,6 +3560,32 @@ standard_qp_callback(PlannerInfo *root, void *extra) root->query_pathkeys = root->distinct_pathkeys; else if (root->sort_pathkeys) root->query_pathkeys = root->sort_pathkeys; + else if (qp_extra->setops != NULL && + qp_extra->setops->op == SETOP_UNION && + !qp_extra->setops->all) + { + List *groupClauses; + ListCell *lc; + + foreach(lc, root->processed_tlist) + { + TargetEntry *tle = lfirst_node(TargetEntry, lc); + + if (tle->resjunk) + continue; + + tle->ressortgroupref = tle->resno; + } + + groupClauses = generate_setop_grouplist(qp_extra->setops, tlist); + + if (grouping_is_sortable(groupClauses)) + root->query_pathkeys = make_pathkeys_for_sortclauses(root, + groupClauses, + tlist); + else + root->query_pathkeys = NIL; + } else root->query_pathkeys = NIL; } diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index 0c68ec011b..ac89c70c50 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -47,10 +47,12 @@ static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root, + SetOperationStmt *top_union, List *colTypes, List *colCollations, bool junkOK, int flag, List *refnames_tlist, List **pTargetList, + List **first_child_pathkeys, double *pNumGroups); static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root, @@ -65,9 +67,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *ro static List *plan_union_children(PlannerInfo *root, SetOperationStmt *top_union, List *refnames_tlist, - List **tlist_list); -static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist, - PlannerInfo *root); + List **tlist_list, + List **first_child_pathkeys); static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel); static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses, Path *input_path, @@ -84,7 +85,6 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, bool flag, List *input_tlists, List *refnames_tlist); -static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist); /* @@ -166,11 +166,12 @@ plan_set_operations(PlannerInfo *root) * output from the top-level node, plus possibly resjunk working * columns (we can rely on upper-level nodes to deal with that). */ - setop_rel = recurse_set_operations((Node *) topop, root, + setop_rel = recurse_set_operations((Node *) topop, root, topop, topop->colTypes, topop->colCollations, true, -1, leftmostQuery->targetList, &top_tlist, + NULL, NULL); } @@ -206,10 +207,12 @@ plan_set_operations(PlannerInfo *root) */ static RelOptInfo * recurse_set_operations(Node *setOp, PlannerInfo *root, + SetOperationStmt *topop, List *colTypes, List *colCollations, bool junkOK, int flag, List *refnames_tlist, List **pTargetList, + List **first_child_pathkeys, double *pNumGroups) { RelOptInfo *rel = NULL; /* keep compiler quiet */ @@ -224,10 +227,10 @@ recurse_set_operations(Node *setOp, PlannerInfo *root, Query *subquery = rte->subquery; PlannerInfo *subroot; RelOptInfo *final_rel; - Path *subpath; - Path *path; List *tlist; + List *child_pathkeys; bool trivial_tlist; + ListCell *lc; Assert(subquery != NULL); @@ -263,6 +266,26 @@ recurse_set_operations(Node *setOp, PlannerInfo *root, /* Return the fully-fledged tlist to caller, too */ *pTargetList = tlist; + if (!topop->all && first_child_pathkeys != NULL && + grouping_is_sortable(topop->groupClauses)) + { + if (*first_child_pathkeys == NIL) + { + child_pathkeys = build_union_child_pathkeys(root, + topop, + rel, + tlist); + *first_child_pathkeys = child_pathkeys; + } + else + { + add_union_rel_equivalences(root, + rel->relids, + tlist, + *first_child_pathkeys); + } + } + /* * Mark rel with estimated output rows, width, etc. Note that we have * to do this before generating outer-query paths, else @@ -278,44 +301,61 @@ recurse_set_operations(Node *setOp, PlannerInfo *root, rel->consider_parallel = final_rel->consider_parallel; /* - * For the moment, we consider only a single Path for the subquery. - * This should change soon (make it look more like - * set_subquery_pathlist). - */ - subpath = get_cheapest_fractional_path(final_rel, - root->tuple_fraction); - - /* - * Stick a SubqueryScanPath atop that. - * - * We don't bother to determine the subquery's output ordering since - * it won't be reflected in the set-op result anyhow; so just label - * the SubqueryScanPath with nil pathkeys. (XXX that should change - * soon too, likely.) + * For each Path that subquery_planner produced, make a + * SubqueryScanPath in the outer query. */ - path = (Path *) create_subqueryscan_path(root, rel, subpath, - trivial_tlist, - NIL, NULL); - - add_path(rel, path); + foreach(lc, final_rel->pathlist) + { + Path *subpath = (Path *) lfirst(lc); + List *pathkeys; + + /* Convert subpath's pathkeys to outer representation */ + pathkeys = convert_subquery_pathkeys(root, + rel, + subpath->pathkeys, + make_tlist_from_pathtarget(subpath->pathtarget)); + + /* Generate outer path using this subpath */ + add_path(rel, + (Path *) create_subqueryscan_path(root, + rel, + subpath, + trivial_tlist, + pathkeys, + NULL)); + } - /* - * If we have a partial path for the child relation, we can use that - * to build a partial path for this relation. But there's no point in - * considering any path but the cheapest. - */ - if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) && - final_rel->partial_pathlist != NIL) + /* If outer rel allows parallelism, do same for partial paths. */ + if (rel->consider_parallel && bms_is_empty(rel->lateral_relids)) { - Path *partial_subpath; - Path *partial_path; - - partial_subpath = linitial(final_rel->partial_pathlist); - partial_path = (Path *) - create_subqueryscan_path(root, rel, partial_subpath, - trivial_tlist, - NIL, NULL); - add_partial_path(rel, partial_path); + /* + * If consider_parallel is false, there should be no partial + * paths. + */ + Assert(final_rel->consider_parallel || + final_rel->partial_pathlist == NIL); + + /* Same for partial paths. */ + foreach(lc, final_rel->partial_pathlist) + { + Path *subpath = (Path *) lfirst(lc); + List *pathkeys; + + /* Convert subpath's pathkeys to outer representation */ + pathkeys = convert_subquery_pathkeys(root, + rel, + subpath->pathkeys, + make_tlist_from_pathtarget(subpath->pathtarget)); + + /* Generate outer path using this subpath */ + add_partial_path(rel, (Path *) + create_subqueryscan_path(root, + rel, + subpath, + trivial_tlist, + pathkeys, + NULL)); + } } /* @@ -338,11 +378,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root, if (subquery->groupClause || subquery->groupingSets || subquery->distinctClause || subroot->hasHavingQual || subquery->hasAggs) - *pNumGroups = subpath->rows; + *pNumGroups = final_rel->rows; else *pNumGroups = estimate_num_groups(subroot, get_tlist_exprs(subquery->targetList, false), - subpath->rows, + final_rel->rows, NULL, NULL); } @@ -464,20 +504,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root, * Unlike a regular UNION node, process the left and right inputs * separately without any intention of combining them into one Append. */ - lrel = recurse_set_operations(setOp->larg, root, + lrel = recurse_set_operations(setOp->larg, root, setOp, setOp->colTypes, setOp->colCollations, false, -1, refnames_tlist, &lpath_tlist, + NULL, NULL); lpath = lrel->cheapest_total_path; /* The right path will want to look at the left one ... */ root->non_recursive_path = lpath; - rrel = recurse_set_operations(setOp->rarg, root, + rrel = recurse_set_operations(setOp->rarg, root, setOp, setOp->colTypes, setOp->colCollations, false, -1, refnames_tlist, &rpath_tlist, + NULL, NULL); rpath = rrel->cheapest_total_path; root->non_recursive_path = NULL; @@ -553,13 +595,18 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, double save_fraction = root->tuple_fraction; ListCell *lc; List *pathlist = NIL; + List *ordered_pathlist = NIL; List *partial_pathlist = NIL; bool partial_paths_valid = true; bool consider_parallel = true; List *rellist; List *tlist_list; List *tlist; - Path *path; + List *first_child_pathkeys = NIL; + List *groupList = NIL; + Path *apath; + Path *gpath = NULL; + bool try_sorted; /* * If plain UNION, tell children to fetch all tuples. @@ -581,7 +628,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, * only one Append and unique-ification for the lot. Recurse to find such * nodes and compute their children's paths. */ - rellist = plan_union_children(root, op, refnames_tlist, &tlist_list); + rellist = plan_union_children(root, + op, + refnames_tlist, + &tlist_list, + &first_child_pathkeys); /* * Generate tlist for Append plan node. @@ -593,15 +644,47 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, tlist = generate_append_tlist(op->colTypes, op->colCollations, false, tlist_list, refnames_tlist); + /* For for UNIONs (not UNION ALL), try sorting, if sorting is possible */ + try_sorted = !op->all && grouping_is_sortable(op->groupClauses); + + if (try_sorted) + { + /* add equivalence members for top-level target list */ + add_union_rel_equivalences(root, + bms_make_singleton(0), + tlist, + first_child_pathkeys); + groupList = generate_setop_grouplist(op, tlist); + } + *pTargetList = tlist; /* Build path lists and relid set. */ foreach(lc, rellist) { RelOptInfo *rel = lfirst(lc); + Path *ordered_path; pathlist = lappend(pathlist, rel->cheapest_total_path); + if (try_sorted) + { + ordered_path = get_cheapest_path_for_pathkeys(rel->pathlist, + first_child_pathkeys, + NULL, + TOTAL_COST, + true); + + /* + * XXX should we sort the cheapest path if we can't find a path + * that's correctly ordered? + */ + if (ordered_path != NULL) + ordered_pathlist = lappend(ordered_pathlist, ordered_path); + else + try_sorted = false; + } + if (consider_parallel) { if (!rel->consider_parallel) @@ -627,24 +710,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, /* * Append the child results together. */ - path = (Path *) create_append_path(root, result_rel, pathlist, NIL, - NIL, NULL, 0, false, -1); - - /* - * For UNION ALL, we just need the Append path. For UNION, need to add - * node(s) to remove duplicates. - */ - if (!op->all) - path = make_union_unique(op, path, tlist, root); - - add_path(result_rel, path); + apath = (Path *) create_append_path(root, result_rel, pathlist, NIL, + NIL, NULL, 0, false, -1); /* * Estimate number of groups. For now we just assume the output is unique * --- this is certainly true for the UNION case, and we want worst-case * estimates anyway. */ - result_rel->rows = path->rows; + result_rel->rows = apath->rows; /* * Now consider doing the same thing using the partial paths plus Append @@ -652,7 +726,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, */ if (partial_paths_valid) { - Path *ppath; + Path *papath; int parallel_workers = 0; /* Find the highest number of workers requested for any subpath. */ @@ -681,19 +755,135 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root, } Assert(parallel_workers > 0); - ppath = (Path *) + papath = (Path *) create_append_path(root, result_rel, NIL, partial_pathlist, - NIL, NULL, - parallel_workers, enable_parallel_append, - -1); - ppath = (Path *) - create_gather_path(root, result_rel, ppath, - result_rel->reltarget, NULL, NULL); - if (!op->all) - ppath = make_union_unique(op, ppath, tlist, root); - add_path(result_rel, ppath); + NIL, NULL, parallel_workers, + enable_parallel_append, -1); + gpath = (Path *) + create_gather_path(root, result_rel, + papath, result_rel->reltarget, NULL, NULL); } + if (!op->all) + { + double dNumGroups; + bool can_sort = grouping_is_sortable(groupList); + bool can_hash = grouping_is_hashable(groupList); + + /* + * XXX for the moment, take the number of distinct groups as equal to + * the total input size, ie, the worst case. This is too + * conservative, but it's not clear how to get a decent estimate of + * the true size. One should note as well the propensity of novices + * to write UNION rather than UNION ALL even when they don't expect + * any duplicates... + */ + dNumGroups = apath->rows; + + if (can_hash) + { + Path *path; + + /* Hashed aggregate plan --- no sort needed */ + path = (Path *) create_agg_path(root, + result_rel, + apath, + create_pathtarget(root, tlist), + AGG_HASHED, + AGGSPLIT_SIMPLE, + groupList, + NIL, + NULL, + dNumGroups); + add_path(result_rel, path); + + /* Try hash aggregate on the Gather path, if valid */ + if (gpath != NULL) + { + /* Hashed aggregate plan --- no sort needed */ + path = (Path *) create_agg_path(root, + result_rel, + gpath, + create_pathtarget(root, tlist), + AGG_HASHED, + AGGSPLIT_SIMPLE, + groupList, + NIL, + NULL, + dNumGroups); + add_path(result_rel, path); + } + } + + if (can_sort) + { + Path *path = apath; + + if (groupList != NIL) + path = (Path *) create_sort_path(root, result_rel, path, + make_pathkeys_for_sortclauses(root, groupList, tlist), + -1.0); + + path = (Path *) create_upper_unique_path(root, + result_rel, + path, + list_length(path->pathkeys), + dNumGroups); + + add_path(result_rel, path); + + /* Try Sort -> Unique on the Gather path, if set */ + if (gpath != NULL) + { + path = gpath; + + path = (Path *) create_sort_path(root, result_rel, path, + make_pathkeys_for_sortclauses(root, groupList, tlist), + -1.0); + + path = (Path *) create_upper_unique_path(root, + result_rel, + path, + list_length(path->pathkeys), + dNumGroups); + add_path(result_rel, path); + } + } + + /* + * Try making a MergeAppend path if we managed to find a path with the + * correct pathkeys for each union child. + */ + if (try_sorted && groupList != NIL) + { + Path *path; + + path = (Path *) create_merge_append_path(root, + result_rel, + ordered_pathlist, + first_child_pathkeys, + NULL); + + /* and make the MergeAppend unique */ + path = (Path *) create_upper_unique_path(root, + result_rel, + path, + list_length(tlist), + path->rows); + + add_path(result_rel, path); + } + } + else + { + /* UNION ALL */ + add_path(result_rel, apath); + + if (gpath != NULL) + add_path(result_rel, gpath); + } + + /* Undo effects of possibly forcing tuple_fraction to 0 */ root->tuple_fraction = save_fraction; @@ -735,18 +925,20 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root, root->tuple_fraction = 0.0; /* Recurse on children, ensuring their outputs are marked */ - lrel = recurse_set_operations(op->larg, root, + lrel = recurse_set_operations(op->larg, root, op, op->colTypes, op->colCollations, false, 0, refnames_tlist, &lpath_tlist, + NULL, &dLeftGroups); lpath = lrel->cheapest_total_path; - rrel = recurse_set_operations(op->rarg, root, + rrel = recurse_set_operations(op->rarg, root, op, op->colTypes, op->colCollations, false, 1, refnames_tlist, &rpath_tlist, + NULL, &dRightGroups); rpath = rrel->cheapest_total_path; @@ -884,7 +1076,8 @@ static List * plan_union_children(PlannerInfo *root, SetOperationStmt *top_union, List *refnames_tlist, - List **tlist_list) + List **tlist_list, + List **first_child_pathkeys) { List *pending_rels = list_make1(top_union); List *result = NIL; @@ -923,12 +1116,13 @@ plan_union_children(PlannerInfo *root, * we have an EXCEPT or INTERSECT as child, else there won't be * resjunk anyway. */ - result = lappend(result, recurse_set_operations(setOp, root, + result = lappend(result, recurse_set_operations(setOp, root, top_union, top_union->colTypes, top_union->colCollations, false, -1, refnames_tlist, &child_tlist, + first_child_pathkeys, NULL)); *tlist_list = lappend(*tlist_list, child_tlist); } @@ -936,68 +1130,6 @@ plan_union_children(PlannerInfo *root, return result; } -/* - * Add nodes to the given path tree to unique-ify the result of a UNION. - */ -static Path * -make_union_unique(SetOperationStmt *op, Path *path, List *tlist, - PlannerInfo *root) -{ - RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL); - List *groupList; - double dNumGroups; - - /* Identify the grouping semantics */ - groupList = generate_setop_grouplist(op, tlist); - - /* - * XXX for the moment, take the number of distinct groups as equal to the - * total input size, ie, the worst case. This is too conservative, but - * it's not clear how to get a decent estimate of the true size. One - * should note as well the propensity of novices to write UNION rather - * than UNION ALL even when they don't expect any duplicates... - */ - dNumGroups = path->rows; - - /* Decide whether to hash or sort */ - if (choose_hashed_setop(root, groupList, path, - dNumGroups, dNumGroups, - "UNION")) - { - /* Hashed aggregate plan --- no sort needed */ - path = (Path *) create_agg_path(root, - result_rel, - path, - create_pathtarget(root, tlist), - AGG_HASHED, - AGGSPLIT_SIMPLE, - groupList, - NIL, - NULL, - dNumGroups); - } - else - { - /* Sort and Unique */ - if (groupList) - path = (Path *) - create_sort_path(root, - result_rel, - path, - make_pathkeys_for_sortclauses(root, - groupList, - tlist), - -1.0); - path = (Path *) create_upper_unique_path(root, - result_rel, - path, - list_length(path->pathkeys), - dNumGroups); - } - - return path; -} - /* * postprocess_setop_rel - perform steps required after adding paths */ @@ -1403,7 +1535,7 @@ generate_append_tlist(List *colTypes, List *colCollations, * setop. So what we need to do here is copy that list and install * proper sortgrouprefs into it (copying those from the targetlist). */ -static List * +List * generate_setop_grouplist(SetOperationStmt *op, List *targetlist) { List *grouplist = copyObject(op->groupClauses); diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index 9e7408c7ec..b945c58a35 100644 --- a/src/include/optimizer/paths.h +++ b/src/include/optimizer/paths.h @@ -164,6 +164,9 @@ extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root, int colno); extern RestrictInfo *find_derived_clause_for_ec_member(EquivalenceClass *ec, EquivalenceMember *em); +extern void add_union_rel_equivalences(PlannerInfo *root, Relids relids, + List *tlist, + List *union_pathkeys); extern void add_child_rel_equivalences(PlannerInfo *root, AppendRelInfo *appinfo, RelOptInfo *parent_rel, @@ -217,6 +220,8 @@ extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index, ScanDirection scandir); extern List *build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel, ScanDirection scandir, bool *partialkeys); +extern List *build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt *op, + RelOptInfo *rel, List *tlist); extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr, Oid opno, Relids rel, bool create_it); diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 54fd61c9c3..3b7641f2b6 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -53,6 +53,6 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node *clause); * prototypes for prepunion.c */ extern RelOptInfo *plan_set_operations(PlannerInfo *root); - +extern List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist); #endif /* PREP_H */ diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 64cebe4833..04492c961e 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -370,16 +370,16 @@ select count(*) from explain (costs off) select count(*) from ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss; - QUERY PLAN ------------------------------------------------------------------------------------- + QUERY PLAN +---------------------------------------------------------------------------- Aggregate -> Subquery Scan on ss -> HashSetOp Intersect -> Append - -> Subquery Scan on "*SELECT* 2" - -> Seq Scan on tenk1 -> Subquery Scan on "*SELECT* 1" - -> Index Only Scan using tenk1_unique1 on tenk1 tenk1_1 + -> Index Only Scan using tenk1_unique1 on tenk1 + -> Subquery Scan on "*SELECT* 2" + -> Seq Scan on tenk1 tenk1_1 (8 rows) select count(*) from @@ -433,18 +433,18 @@ select count(*) from explain (costs off) select count(*) from ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss; - QUERY PLAN ------------------------------------------------------------------------------------------- + QUERY PLAN +---------------------------------------------------------------------------------- Aggregate -> Subquery Scan on ss -> SetOp Intersect -> Sort - Sort Key: "*SELECT* 2".fivethous + Sort Key: "*SELECT* 1".unique1 -> Append - -> Subquery Scan on "*SELECT* 2" - -> Seq Scan on tenk1 -> Subquery Scan on "*SELECT* 1" - -> Index Only Scan using tenk1_unique1 on tenk1 tenk1_1 + -> Index Only Scan using tenk1_unique1 on tenk1 + -> Subquery Scan on "*SELECT* 2" + -> Seq Scan on tenk1 tenk1_1 (10 rows) select count(*) from @@ -954,7 +954,7 @@ explain (costs off) select from generate_series(1,5) union select from generate_series(1,3); QUERY PLAN ---------------------------------------------------------------- - HashAggregate + Unique -> Append -> Function Scan on generate_series -> Function Scan on generate_series generate_series_1 @@ -1102,16 +1102,17 @@ explain (costs off) UNION SELECT * FROM t2) t WHERE ab = 'ab'; - QUERY PLAN ---------------------------------------------------- - HashAggregate - Group Key: ((t1.a || t1.b)) - -> Append - -> Index Scan using t1_ab_idx on t1 - Index Cond: ((a || b) = 'ab'::text) - -> Index Only Scan using t2_pkey on t2 - Index Cond: (ab = 'ab'::text) -(7 rows) + QUERY PLAN +--------------------------------------------------------- + Unique + -> Sort + Sort Key: ((t1.a || t1.b)) + -> Append + -> Index Scan using t1_ab_idx on t1 + Index Cond: ((a || b) = 'ab'::text) + -> Index Only Scan using t2_pkey on t2 + Index Cond: (ab = 'ab'::text) +(8 rows) -- -- Test that ORDER BY for UNION ALL can be pushed down to inheritance