The upper planner was pathified many years ago now.  That was a large
chunk of work and because of that, the union planner was not properly
pathified in that effort.  A small note was left in
recurse_set_operations() to mention about future work.

You can see this lack of pathification in make_union_unique() and
choose_hashed_setop(). There are heuristics in there to decide the
method to use instead of creating paths and letting add_path() decide
what's faster.

I've been working on improving this for the past few weeks and I'm not
quite as far along as I'd hoped, but what I have is perhaps worthy of
sharing.  For now, I've only improved UNIONs.

A UNION plan can now look like:

# explain (costs off) select * from a union select * from a;
                    QUERY PLAN
---------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: a.a
         ->  Index Only Scan using a_pkey on a
         ->  Index Only Scan using a_pkey on a a_1

Previously we'd have only considered Append -> Hash Aggregate or via
Append -> Sort -> Unique

To make this work, the query_planner() needs to know about setops, so
I've passed those down via the standard_qp_extra struct so that we can
choose pathkeys for the setops.

One part that still needs work is the EquivalanceClass building.
Because we only build the final targetlist for the Append after
planning all the append child queries, I ended up having to populate
the EquivalanceClasses backwards, i.e children first. add_eq_member()
determines if you're passing a child member by checking if parent !=
NULL.  Since I don't have a parent EquivalenceMember to pass,
em_is_child gets set wrongly, and that causes problems because
ec_has_const can get set to true when it shouldn't. This is a problem
as it can make a PathKey redundant when it isn't.  I wonder if I'll
need to change the signature of add_eq_member() and add an "is_child"
bool to force the EM to be a child em... Needs more thought...

I've not worked on the creation of Incremental Sort paths yet, or done
any path plumbing work to have UNION consider Gather Merge -> Unique
on already sorted paths.  I think to make similar improvements to
EXCEPT and INTERSECT we'd need a node executor node. Perhaps
nodeMergeAppendSetops.c which can be configured to do EXCEPT or
INTERSECT.  It could also perhaps handle UNION too then we can use
that instead of a Merge Append -> Unique.  That might save doing some
slot copying and improve performance further. I'm not planning on
doing that for the first stage. I only intend to improve UNION for
that and we have all the executor nodes to make that work already.

Anyway, I've attached my WIP patch for this.

David
diff --git a/src/backend/optimizer/path/equivclass.c 
b/src/backend/optimizer/path/equivclass.c
index 7fa502d6e2..54ddcf8285 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2599,6 +2599,44 @@ find_derived_clause_for_ec_member(EquivalenceClass *ec,
        return NULL;
 }
 
+/*
+ * add_union_rel_equivalences
+ */
+void
+add_union_rel_equivalences(PlannerInfo *root, Relids relids,
+                                                  List *tlist, List 
*union_pathkeys)
+{
+       ListCell   *lc;
+       ListCell   *lc2 = list_head(union_pathkeys);
+
+       foreach(lc, tlist)
+       {
+               TargetEntry *tle = lfirst_node(TargetEntry, lc);
+               PathKey    *pk;
+
+               if (tle->resjunk)
+                       continue;
+
+               if (lc2 == NULL)
+                       elog(ERROR, "wrong number of union pathkeys");
+
+               pk = lfirst_node(PathKey, lc2);
+
+               /*
+                * XXX this needs fixed.  The children are added before the 
parents,
+                * so we cannot identify the parent.
+                */
+               add_eq_member(pk->pk_eclass,
+                                         tle->expr,
+                                         relids,
+                                         NULL,
+                                         NULL,
+                                         exprType((Node *) tle->expr));
+               pk->pk_eclass->ec_has_const = false;    /* XXX hack hack */
+
+               lc2 = lnext(union_pathkeys, lc2);
+       }
+}
 
 /*
  * add_child_rel_equivalences
diff --git a/src/backend/optimizer/path/pathkeys.c 
b/src/backend/optimizer/path/pathkeys.c
index fdb60aaa8d..00f2071cf0 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -786,6 +786,59 @@ build_partition_pathkeys(PlannerInfo *root, RelOptInfo 
*partrel,
        return retval;
 }
 
+/*
+ * build_union_child_pathkeys
+ */
+List *
+build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt *op,
+                                                  RelOptInfo *rel, List *tlist)
+{
+       ListCell   *lc;
+       ListCell   *sgccell = list_head(op->groupClauses);
+       List       *retval = NIL;
+
+       foreach(lc, tlist)
+       {
+               TargetEntry *tle = lfirst_node(TargetEntry, lc);
+               SortGroupClause *sgc;
+
+               Oid                     opfamily;
+               Oid                     opcintype;
+               int16           strategy;
+               PathKey    *cpathkey;
+
+               if (tle->resjunk)
+                       continue;
+
+               if (sgccell == NULL)
+                       elog(ERROR, "wrong number of group clauses");   /* XXX 
write better
+                                                                               
                                         * error */
+
+               sgc = lfirst_node(SortGroupClause, sgccell);
+
+               /* Find the operator in pg_amop --- failure shouldn't happen */
+               if (!get_ordering_op_properties(sgc->sortop,
+                                                                               
&opfamily, &opcintype, &strategy))
+                       elog(ERROR, "operator %u is not a valid ordering 
operator",
+                                sgc->eqop);
+
+               cpathkey = make_pathkey_from_sortinfo(root,
+                                                                               
          tle->expr,
+                                                                               
          opfamily,
+                                                                               
          opcintype,
+                                                                               
          exprCollation((Node *) tle->expr),
+                                                                               
          false,
+                                                                               
          sgc->nulls_first,
+                                                                               
          0,
+                                                                               
          rel->relids,
+                                                                               
          true);
+               retval = lappend(retval, cpathkey);
+               sgccell = lnext(op->groupClauses, sgccell);
+       }
+
+       return retval;
+}
+
 /*
  * build_expression_pathkey
  *       Build a pathkeys list that describes an ordering by a single 
expression
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index a8cea5efe1..c5bb4e6c60 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -126,6 +126,8 @@ typedef struct
 {
        List       *activeWindows;      /* active windows, if any */
        grouping_sets_data *gset_data;  /* grouping sets data, if any */
+       SetOperationStmt *setops;       /* set operation stmt details or NULL 
if not a
+                                                                * set 
operation */
 } standard_qp_extra;
 
 /* Local functions */
@@ -1504,6 +1506,13 @@ grouping_planner(PlannerInfo *root, double 
tuple_fraction)
                /* Set up data needed by standard_qp_callback */
                qp_extra.activeWindows = activeWindows;
                qp_extra.gset_data = gset_data;
+               if (root->parent_root != NULL &&
+                       root->parent_root->parse->setOperations != NULL &&
+                       IsA(root->parent_root->parse->setOperations, 
SetOperationStmt))
+                       qp_extra.setops =
+                               (SetOperationStmt *) 
root->parent_root->parse->setOperations;
+               else
+                       qp_extra.setops = NULL;
 
                /*
                 * Generate the best unsorted and presorted paths for the 
scan/join
@@ -3551,6 +3560,32 @@ standard_qp_callback(PlannerInfo *root, void *extra)
                root->query_pathkeys = root->distinct_pathkeys;
        else if (root->sort_pathkeys)
                root->query_pathkeys = root->sort_pathkeys;
+       else if (qp_extra->setops != NULL &&
+                        qp_extra->setops->op == SETOP_UNION &&
+                        !qp_extra->setops->all)
+       {
+               List       *groupClauses;
+               ListCell   *lc;
+
+               foreach(lc, root->processed_tlist)
+               {
+                       TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+                       if (tle->resjunk)
+                               continue;
+
+                       tle->ressortgroupref = tle->resno;
+               }
+
+               groupClauses = generate_setop_grouplist(qp_extra->setops, 
tlist);
+
+               if (grouping_is_sortable(groupClauses))
+                       root->query_pathkeys = 
make_pathkeys_for_sortclauses(root,
+                                                                               
                                                 groupClauses,
+                                                                               
                                                 tlist);
+               else
+                       root->query_pathkeys = NIL;
+       }
        else
                root->query_pathkeys = NIL;
 }
diff --git a/src/backend/optimizer/prep/prepunion.c 
b/src/backend/optimizer/prep/prepunion.c
index 0c68ec011b..ac89c70c50 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -47,10 +47,12 @@
 
 
 static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
+                                                                               
  SetOperationStmt *top_union,
                                                                                
  List *colTypes, List *colCollations,
                                                                                
  bool junkOK,
                                                                                
  int flag, List *refnames_tlist,
                                                                                
  List **pTargetList,
+                                                                               
  List **first_child_pathkeys,
                                                                                
  double *pNumGroups);
 static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
                                                                                
   PlannerInfo *root,
@@ -65,9 +67,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt 
*op, PlannerInfo *ro
 static List *plan_union_children(PlannerInfo *root,
                                                                 
SetOperationStmt *top_union,
                                                                 List 
*refnames_tlist,
-                                                                List 
**tlist_list);
-static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-                                                          PlannerInfo *root);
+                                                                List 
**tlist_list,
+                                                                List 
**first_child_pathkeys);
 static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
                                                                Path 
*input_path,
@@ -84,7 +85,6 @@ static List *generate_append_tlist(List *colTypes, List 
*colCollations,
                                                                   bool flag,
                                                                   List 
*input_tlists,
                                                                   List 
*refnames_tlist);
-static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 
 /*
@@ -166,11 +166,12 @@ plan_set_operations(PlannerInfo *root)
                 * output from the top-level node, plus possibly resjunk working
                 * columns (we can rely on upper-level nodes to deal with that).
                 */
-               setop_rel = recurse_set_operations((Node *) topop, root,
+               setop_rel = recurse_set_operations((Node *) topop, root, topop,
                                                                                
   topop->colTypes, topop->colCollations,
                                                                                
   true, -1,
                                                                                
   leftmostQuery->targetList,
                                                                                
   &top_tlist,
+                                                                               
   NULL,
                                                                                
   NULL);
        }
 
@@ -206,10 +207,12 @@ plan_set_operations(PlannerInfo *root)
  */
 static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
+                                          SetOperationStmt *topop,
                                           List *colTypes, List *colCollations,
                                           bool junkOK,
                                           int flag, List *refnames_tlist,
                                           List **pTargetList,
+                                          List **first_child_pathkeys,
                                           double *pNumGroups)
 {
        RelOptInfo *rel = NULL;         /* keep compiler quiet */
@@ -224,10 +227,10 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
                Query      *subquery = rte->subquery;
                PlannerInfo *subroot;
                RelOptInfo *final_rel;
-               Path       *subpath;
-               Path       *path;
                List       *tlist;
+               List       *child_pathkeys;
                bool            trivial_tlist;
+               ListCell   *lc;
 
                Assert(subquery != NULL);
 
@@ -263,6 +266,26 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
                /* Return the fully-fledged tlist to caller, too */
                *pTargetList = tlist;
 
+               if (!topop->all && first_child_pathkeys != NULL &&
+                       grouping_is_sortable(topop->groupClauses))
+               {
+                       if (*first_child_pathkeys == NIL)
+                       {
+                               child_pathkeys = 
build_union_child_pathkeys(root,
+                                                                               
                                        topop,
+                                                                               
                                        rel,
+                                                                               
                                        tlist);
+                               *first_child_pathkeys = child_pathkeys;
+                       }
+                       else
+                       {
+                               add_union_rel_equivalences(root,
+                                                                               
   rel->relids,
+                                                                               
   tlist,
+                                                                               
   *first_child_pathkeys);
+                       }
+               }
+
                /*
                 * Mark rel with estimated output rows, width, etc.  Note that 
we have
                 * to do this before generating outer-query paths, else
@@ -278,44 +301,61 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
                rel->consider_parallel = final_rel->consider_parallel;
 
                /*
-                * For the moment, we consider only a single Path for the 
subquery.
-                * This should change soon (make it look more like
-                * set_subquery_pathlist).
-                */
-               subpath = get_cheapest_fractional_path(final_rel,
-                                                                               
           root->tuple_fraction);
-
-               /*
-                * Stick a SubqueryScanPath atop that.
-                *
-                * We don't bother to determine the subquery's output ordering 
since
-                * it won't be reflected in the set-op result anyhow; so just 
label
-                * the SubqueryScanPath with nil pathkeys.  (XXX that should 
change
-                * soon too, likely.)
+                * For each Path that subquery_planner produced, make a
+                * SubqueryScanPath in the outer query.
                 */
-               path = (Path *) create_subqueryscan_path(root, rel, subpath,
-                                                                               
                 trivial_tlist,
-                                                                               
                 NIL, NULL);
-
-               add_path(rel, path);
+               foreach(lc, final_rel->pathlist)
+               {
+                       Path       *subpath = (Path *) lfirst(lc);
+                       List       *pathkeys;
+
+                       /* Convert subpath's pathkeys to outer representation */
+                       pathkeys = convert_subquery_pathkeys(root,
+                                                                               
                 rel,
+                                                                               
                 subpath->pathkeys,
+                                                                               
                 make_tlist_from_pathtarget(subpath->pathtarget));
+
+                       /* Generate outer path using this subpath */
+                       add_path(rel,
+                                        (Path *) create_subqueryscan_path(root,
+                                                                               
                           rel,
+                                                                               
                           subpath,
+                                                                               
                           trivial_tlist,
+                                                                               
                           pathkeys,
+                                                                               
                           NULL));
+               }
 
-               /*
-                * If we have a partial path for the child relation, we can use 
that
-                * to build a partial path for this relation.  But there's no 
point in
-                * considering any path but the cheapest.
-                */
-               if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) 
&&
-                       final_rel->partial_pathlist != NIL)
+               /* If outer rel allows parallelism, do same for partial paths. 
*/
+               if (rel->consider_parallel && bms_is_empty(rel->lateral_relids))
                {
-                       Path       *partial_subpath;
-                       Path       *partial_path;
-
-                       partial_subpath = linitial(final_rel->partial_pathlist);
-                       partial_path = (Path *)
-                               create_subqueryscan_path(root, rel, 
partial_subpath,
-                                                                               
 trivial_tlist,
-                                                                               
 NIL, NULL);
-                       add_partial_path(rel, partial_path);
+                       /*
+                        * If consider_parallel is false, there should be no 
partial
+                        * paths.
+                        */
+                       Assert(final_rel->consider_parallel ||
+                                  final_rel->partial_pathlist == NIL);
+
+                       /* Same for partial paths. */
+                       foreach(lc, final_rel->partial_pathlist)
+                       {
+                               Path       *subpath = (Path *) lfirst(lc);
+                               List       *pathkeys;
+
+                               /* Convert subpath's pathkeys to outer 
representation */
+                               pathkeys = convert_subquery_pathkeys(root,
+                                                                               
                         rel,
+                                                                               
                         subpath->pathkeys,
+                                                                               
                         make_tlist_from_pathtarget(subpath->pathtarget));
+
+                               /* Generate outer path using this subpath */
+                               add_partial_path(rel, (Path *)
+                                                                
create_subqueryscan_path(root,
+                                                                               
                                  rel,
+                                                                               
                                  subpath,
+                                                                               
                                  trivial_tlist,
+                                                                               
                                  pathkeys,
+                                                                               
                                  NULL));
+                       }
                }
 
                /*
@@ -338,11 +378,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
                        if (subquery->groupClause || subquery->groupingSets ||
                                subquery->distinctClause ||
                                subroot->hasHavingQual || subquery->hasAggs)
-                               *pNumGroups = subpath->rows;
+                               *pNumGroups = final_rel->rows;
                        else
                                *pNumGroups = estimate_num_groups(subroot,
                                                                                
                  get_tlist_exprs(subquery->targetList, false),
-                                                                               
                  subpath->rows,
+                                                                               
                  final_rel->rows,
                                                                                
                  NULL,
                                                                                
                  NULL);
                }
@@ -464,20 +504,22 @@ generate_recursion_path(SetOperationStmt *setOp, 
PlannerInfo *root,
         * Unlike a regular UNION node, process the left and right inputs
         * separately without any intention of combining them into one Append.
         */
-       lrel = recurse_set_operations(setOp->larg, root,
+       lrel = recurse_set_operations(setOp->larg, root, setOp,
                                                                  
setOp->colTypes, setOp->colCollations,
                                                                  false, -1,
                                                                  
refnames_tlist,
                                                                  &lpath_tlist,
+                                                                 NULL,
                                                                  NULL);
        lpath = lrel->cheapest_total_path;
        /* The right path will want to look at the left one ... */
        root->non_recursive_path = lpath;
-       rrel = recurse_set_operations(setOp->rarg, root,
+       rrel = recurse_set_operations(setOp->rarg, root, setOp,
                                                                  
setOp->colTypes, setOp->colCollations,
                                                                  false, -1,
                                                                  
refnames_tlist,
                                                                  &rpath_tlist,
+                                                                 NULL,
                                                                  NULL);
        rpath = rrel->cheapest_total_path;
        root->non_recursive_path = NULL;
@@ -553,13 +595,18 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
        double          save_fraction = root->tuple_fraction;
        ListCell   *lc;
        List       *pathlist = NIL;
+       List       *ordered_pathlist = NIL;
        List       *partial_pathlist = NIL;
        bool            partial_paths_valid = true;
        bool            consider_parallel = true;
        List       *rellist;
        List       *tlist_list;
        List       *tlist;
-       Path       *path;
+       List       *first_child_pathkeys = NIL;
+       List       *groupList = NIL;
+       Path       *apath;
+       Path       *gpath = NULL;
+       bool            try_sorted;
 
        /*
         * If plain UNION, tell children to fetch all tuples.
@@ -581,7 +628,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
         * only one Append and unique-ification for the lot.  Recurse to find 
such
         * nodes and compute their children's paths.
         */
-       rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+       rellist = plan_union_children(root,
+                                                                 op,
+                                                                 
refnames_tlist,
+                                                                 &tlist_list,
+                                                                 
&first_child_pathkeys);
 
        /*
         * Generate tlist for Append plan node.
@@ -593,15 +644,47 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
        tlist = generate_append_tlist(op->colTypes, op->colCollations, false,
                                                                  tlist_list, 
refnames_tlist);
 
+       /* For for UNIONs (not UNION ALL), try sorting, if sorting is possible 
*/
+       try_sorted = !op->all && grouping_is_sortable(op->groupClauses);
+
+       if (try_sorted)
+       {
+               /* add equivalence members for top-level target list */
+               add_union_rel_equivalences(root,
+                                                                  
bms_make_singleton(0),
+                                                                  tlist,
+                                                                  
first_child_pathkeys);
+               groupList = generate_setop_grouplist(op, tlist);
+       }
+
        *pTargetList = tlist;
 
        /* Build path lists and relid set. */
        foreach(lc, rellist)
        {
                RelOptInfo *rel = lfirst(lc);
+               Path       *ordered_path;
 
                pathlist = lappend(pathlist, rel->cheapest_total_path);
 
+               if (try_sorted)
+               {
+                       ordered_path = 
get_cheapest_path_for_pathkeys(rel->pathlist,
+                                                                               
                                  first_child_pathkeys,
+                                                                               
                                  NULL,
+                                                                               
                                  TOTAL_COST,
+                                                                               
                                  true);
+
+                       /*
+                        * XXX should we sort the cheapest path if we can't 
find a path
+                        * that's correctly ordered?
+                        */
+                       if (ordered_path != NULL)
+                               ordered_pathlist = lappend(ordered_pathlist, 
ordered_path);
+                       else
+                               try_sorted = false;
+               }
+
                if (consider_parallel)
                {
                        if (!rel->consider_parallel)
@@ -627,24 +710,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
        /*
         * Append the child results together.
         */
-       path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
-                                                                          NIL, 
NULL, 0, false, -1);
-
-       /*
-        * For UNION ALL, we just need the Append path.  For UNION, need to add
-        * node(s) to remove duplicates.
-        */
-       if (!op->all)
-               path = make_union_unique(op, path, tlist, root);
-
-       add_path(result_rel, path);
+       apath = (Path *) create_append_path(root, result_rel, pathlist, NIL,
+                                                                               
NIL, NULL, 0, false, -1);
 
        /*
         * Estimate number of groups.  For now we just assume the output is 
unique
         * --- this is certainly true for the UNION case, and we want worst-case
         * estimates anyway.
         */
-       result_rel->rows = path->rows;
+       result_rel->rows = apath->rows;
 
        /*
         * Now consider doing the same thing using the partial paths plus Append
@@ -652,7 +726,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
         */
        if (partial_paths_valid)
        {
-               Path       *ppath;
+               Path       *papath;
                int                     parallel_workers = 0;
 
                /* Find the highest number of workers requested for any 
subpath. */
@@ -681,19 +755,135 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo 
*root,
                }
                Assert(parallel_workers > 0);
 
-               ppath = (Path *)
+               papath = (Path *)
                        create_append_path(root, result_rel, NIL, 
partial_pathlist,
-                                                          NIL, NULL,
-                                                          parallel_workers, 
enable_parallel_append,
-                                                          -1);
-               ppath = (Path *)
-                       create_gather_path(root, result_rel, ppath,
-                                                          
result_rel->reltarget, NULL, NULL);
-               if (!op->all)
-                       ppath = make_union_unique(op, ppath, tlist, root);
-               add_path(result_rel, ppath);
+                                                          NIL, NULL, 
parallel_workers,
+                                                          
enable_parallel_append, -1);
+               gpath = (Path *)
+                       create_gather_path(root, result_rel,
+                                                          papath, 
result_rel->reltarget, NULL, NULL);
        }
 
+       if (!op->all)
+       {
+               double          dNumGroups;
+               bool            can_sort = grouping_is_sortable(groupList);
+               bool            can_hash = grouping_is_hashable(groupList);
+
+               /*
+                * XXX for the moment, take the number of distinct groups as 
equal to
+                * the total input size, ie, the worst case.  This is too
+                * conservative, but it's not clear how to get a decent 
estimate of
+                * the true size.  One should note as well the propensity of 
novices
+                * to write UNION rather than UNION ALL even when they don't 
expect
+                * any duplicates...
+                */
+               dNumGroups = apath->rows;
+
+               if (can_hash)
+               {
+                       Path       *path;
+
+                       /* Hashed aggregate plan --- no sort needed */
+                       path = (Path *) create_agg_path(root,
+                                                                               
        result_rel,
+                                                                               
        apath,
+                                                                               
        create_pathtarget(root, tlist),
+                                                                               
        AGG_HASHED,
+                                                                               
        AGGSPLIT_SIMPLE,
+                                                                               
        groupList,
+                                                                               
        NIL,
+                                                                               
        NULL,
+                                                                               
        dNumGroups);
+                       add_path(result_rel, path);
+
+                       /* Try hash aggregate on the Gather path, if valid */
+                       if (gpath != NULL)
+                       {
+                               /* Hashed aggregate plan --- no sort needed */
+                               path = (Path *) create_agg_path(root,
+                                                                               
                result_rel,
+                                                                               
                gpath,
+                                                                               
                create_pathtarget(root, tlist),
+                                                                               
                AGG_HASHED,
+                                                                               
                AGGSPLIT_SIMPLE,
+                                                                               
                groupList,
+                                                                               
                NIL,
+                                                                               
                NULL,
+                                                                               
                dNumGroups);
+                               add_path(result_rel, path);
+                       }
+               }
+
+               if (can_sort)
+               {
+                       Path       *path = apath;
+
+                       if (groupList != NIL)
+                               path = (Path *) create_sort_path(root, 
result_rel, path,
+                                                                               
                 make_pathkeys_for_sortclauses(root, groupList, tlist),
+                                                                               
                 -1.0);
+
+                       path = (Path *) create_upper_unique_path(root,
+                                                                               
                         result_rel,
+                                                                               
                         path,
+                                                                               
                         list_length(path->pathkeys),
+                                                                               
                         dNumGroups);
+
+                       add_path(result_rel, path);
+
+                       /* Try Sort -> Unique on the Gather path, if set */
+                       if (gpath != NULL)
+                       {
+                               path = gpath;
+
+                               path = (Path *) create_sort_path(root, 
result_rel, path,
+                                                                               
                 make_pathkeys_for_sortclauses(root, groupList, tlist),
+                                                                               
                 -1.0);
+
+                               path = (Path *) create_upper_unique_path(root,
+                                                                               
                                 result_rel,
+                                                                               
                                 path,
+                                                                               
                                 list_length(path->pathkeys),
+                                                                               
                                 dNumGroups);
+                               add_path(result_rel, path);
+                       }
+               }
+
+               /*
+                * Try making a MergeAppend path if we managed to find a path 
with the
+                * correct pathkeys for each union child.
+                */
+               if (try_sorted && groupList != NIL)
+               {
+                       Path       *path;
+
+                       path = (Path *) create_merge_append_path(root,
+                                                                               
                         result_rel,
+                                                                               
                         ordered_pathlist,
+                                                                               
                         first_child_pathkeys,
+                                                                               
                         NULL);
+
+                       /* and make the MergeAppend unique */
+                       path = (Path *) create_upper_unique_path(root,
+                                                                               
                         result_rel,
+                                                                               
                         path,
+                                                                               
                         list_length(tlist),
+                                                                               
                         path->rows);
+
+                       add_path(result_rel, path);
+               }
+       }
+       else
+       {
+               /* UNION ALL */
+               add_path(result_rel, apath);
+
+               if (gpath != NULL)
+                       add_path(result_rel, gpath);
+       }
+
+
        /* Undo effects of possibly forcing tuple_fraction to 0 */
        root->tuple_fraction = save_fraction;
 
@@ -735,18 +925,20 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo 
*root,
        root->tuple_fraction = 0.0;
 
        /* Recurse on children, ensuring their outputs are marked */
-       lrel = recurse_set_operations(op->larg, root,
+       lrel = recurse_set_operations(op->larg, root, op,
                                                                  op->colTypes, 
op->colCollations,
                                                                  false, 0,
                                                                  
refnames_tlist,
                                                                  &lpath_tlist,
+                                                                 NULL,
                                                                  &dLeftGroups);
        lpath = lrel->cheapest_total_path;
-       rrel = recurse_set_operations(op->rarg, root,
+       rrel = recurse_set_operations(op->rarg, root, op,
                                                                  op->colTypes, 
op->colCollations,
                                                                  false, 1,
                                                                  
refnames_tlist,
                                                                  &rpath_tlist,
+                                                                 NULL,
                                                                  
&dRightGroups);
        rpath = rrel->cheapest_total_path;
 
@@ -884,7 +1076,8 @@ static List *
 plan_union_children(PlannerInfo *root,
                                        SetOperationStmt *top_union,
                                        List *refnames_tlist,
-                                       List **tlist_list)
+                                       List **tlist_list,
+                                       List **first_child_pathkeys)
 {
        List       *pending_rels = list_make1(top_union);
        List       *result = NIL;
@@ -923,12 +1116,13 @@ plan_union_children(PlannerInfo *root,
                 * we have an EXCEPT or INTERSECT as child, else there won't be
                 * resjunk anyway.
                 */
-               result = lappend(result, recurse_set_operations(setOp, root,
+               result = lappend(result, recurse_set_operations(setOp, root, 
top_union,
                                                                                
                                top_union->colTypes,
                                                                                
                                top_union->colCollations,
                                                                                
                                false, -1,
                                                                                
                                refnames_tlist,
                                                                                
                                &child_tlist,
+                                                                               
                                first_child_pathkeys,
                                                                                
                                NULL));
                *tlist_list = lappend(*tlist_list, child_tlist);
        }
@@ -936,68 +1130,6 @@ plan_union_children(PlannerInfo *root,
        return result;
 }
 
-/*
- * Add nodes to the given path tree to unique-ify the result of a UNION.
- */
-static Path *
-make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-                                 PlannerInfo *root)
-{
-       RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-       List       *groupList;
-       double          dNumGroups;
-
-       /* Identify the grouping semantics */
-       groupList = generate_setop_grouplist(op, tlist);
-
-       /*
-        * XXX for the moment, take the number of distinct groups as equal to 
the
-        * total input size, ie, the worst case.  This is too conservative, but
-        * it's not clear how to get a decent estimate of the true size.  One
-        * should note as well the propensity of novices to write UNION rather
-        * than UNION ALL even when they don't expect any duplicates...
-        */
-       dNumGroups = path->rows;
-
-       /* Decide whether to hash or sort */
-       if (choose_hashed_setop(root, groupList, path,
-                                                       dNumGroups, dNumGroups,
-                                                       "UNION"))
-       {
-               /* Hashed aggregate plan --- no sort needed */
-               path = (Path *) create_agg_path(root,
-                                                                               
result_rel,
-                                                                               
path,
-                                                                               
create_pathtarget(root, tlist),
-                                                                               
AGG_HASHED,
-                                                                               
AGGSPLIT_SIMPLE,
-                                                                               
groupList,
-                                                                               
NIL,
-                                                                               
NULL,
-                                                                               
dNumGroups);
-       }
-       else
-       {
-               /* Sort and Unique */
-               if (groupList)
-                       path = (Path *)
-                               create_sort_path(root,
-                                                                result_rel,
-                                                                path,
-                                                                
make_pathkeys_for_sortclauses(root,
-                                                                               
                                           groupList,
-                                                                               
                                           tlist),
-                                                                -1.0);
-               path = (Path *) create_upper_unique_path(root,
-                                                                               
                 result_rel,
-                                                                               
                 path,
-                                                                               
                 list_length(path->pathkeys),
-                                                                               
                 dNumGroups);
-       }
-
-       return path;
-}
-
 /*
  * postprocess_setop_rel - perform steps required after adding paths
  */
@@ -1403,7 +1535,7 @@ generate_append_tlist(List *colTypes, List *colCollations,
  * setop.  So what we need to do here is copy that list and install
  * proper sortgrouprefs into it (copying those from the targetlist).
  */
-static List *
+List *
 generate_setop_grouplist(SetOperationStmt *op, List *targetlist)
 {
        List       *grouplist = copyObject(op->groupClauses);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9e7408c7ec..b945c58a35 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -164,6 +164,9 @@ extern EquivalenceClass 
*match_eclasses_to_foreign_key_col(PlannerInfo *root,
                                                                                
                                   int colno);
 extern RestrictInfo *find_derived_clause_for_ec_member(EquivalenceClass *ec,
                                                                                
                           EquivalenceMember *em);
+extern void add_union_rel_equivalences(PlannerInfo *root, Relids relids,
+                                                                          List 
*tlist,
+                                                                          List 
*union_pathkeys);
 extern void add_child_rel_equivalences(PlannerInfo *root,
                                                                           
AppendRelInfo *appinfo,
                                                                           
RelOptInfo *parent_rel,
@@ -217,6 +220,8 @@ extern List *build_index_pathkeys(PlannerInfo *root, 
IndexOptInfo *index,
                                                                  ScanDirection 
scandir);
 extern List *build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
                                                                          
ScanDirection scandir, bool *partialkeys);
+extern List *build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt 
*op,
+                                                                               
RelOptInfo *rel, List *tlist);
 extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
                                                                          Oid 
opno,
                                                                          
Relids rel, bool create_it);
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 54fd61c9c3..3b7641f2b6 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -53,6 +53,6 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node 
*clause);
  * prototypes for prepunion.c
  */
 extern RelOptInfo *plan_set_operations(PlannerInfo *root);
-
+extern List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 #endif                                                 /* PREP_H */
diff --git a/src/test/regress/expected/union.out 
b/src/test/regress/expected/union.out
index 64cebe4833..04492c961e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -370,16 +370,16 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                     QUERY PLAN                                
     
-------------------------------------------------------------------------------------
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  HashSetOp Intersect
                ->  Append
-                     ->  Subquery Scan on "*SELECT* 2"
-                           ->  Seq Scan on tenk1
                      ->  Subquery Scan on "*SELECT* 1"
-                           ->  Index Only Scan using tenk1_unique1 on tenk1 
tenk1_1
+                           ->  Index Only Scan using tenk1_unique1 on tenk1
+                     ->  Subquery Scan on "*SELECT* 2"
+                           ->  Seq Scan on tenk1 tenk1_1
 (8 rows)
 
 select count(*) from
@@ -433,18 +433,18 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                        QUERY PLAN                             
           
-------------------------------------------------------------------------------------------
+                                    QUERY PLAN                                 
   
+----------------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  SetOp Intersect
                ->  Sort
-                     Sort Key: "*SELECT* 2".fivethous
+                     Sort Key: "*SELECT* 1".unique1
                      ->  Append
-                           ->  Subquery Scan on "*SELECT* 2"
-                                 ->  Seq Scan on tenk1
                            ->  Subquery Scan on "*SELECT* 1"
-                                 ->  Index Only Scan using tenk1_unique1 on 
tenk1 tenk1_1
+                                 ->  Index Only Scan using tenk1_unique1 on 
tenk1
+                           ->  Subquery Scan on "*SELECT* 2"
+                                 ->  Seq Scan on tenk1 tenk1_1
 (10 rows)
 
 select count(*) from
@@ -954,7 +954,7 @@ explain (costs off)
 select from generate_series(1,5) union select from generate_series(1,3);
                            QUERY PLAN                           
 ----------------------------------------------------------------
- HashAggregate
+ Unique
    ->  Append
          ->  Function Scan on generate_series
          ->  Function Scan on generate_series generate_series_1
@@ -1102,16 +1102,17 @@ explain (costs off)
   UNION
   SELECT * FROM t2) t
  WHERE ab = 'ab';
-                    QUERY PLAN                     
----------------------------------------------------
- HashAggregate
-   Group Key: ((t1.a || t1.b))
-   ->  Append
-         ->  Index Scan using t1_ab_idx on t1
-               Index Cond: ((a || b) = 'ab'::text)
-         ->  Index Only Scan using t2_pkey on t2
-               Index Cond: (ab = 'ab'::text)
-(7 rows)
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Unique
+   ->  Sort
+         Sort Key: ((t1.a || t1.b))
+         ->  Append
+               ->  Index Scan using t1_ab_idx on t1
+                     Index Cond: ((a || b) = 'ab'::text)
+               ->  Index Only Scan using t2_pkey on t2
+                     Index Cond: (ab = 'ab'::text)
+(8 rows)
 
 --
 -- Test that ORDER BY for UNION ALL can be pushed down to inheritance

Reply via email to