It would be nice, if we could push down join quals into subqueries. For example:

create table small_table (i int);
create table big_table (i int, j int);

insert into small_table values (1), (2); -- two rows
insert into big_table select g/10, g from generate_series(1, 100000) g; -- million rows

create index on big_table(i);

select *
from small_table,
 (select i, count(*) from big_table group by i) as sq
where small_table.i = sq.i;

Currently, we will fully materialize the subquery, and then filter the rows find the rows that match the outer query:

QUERY PLAN
-----------------------------------------------------------------------------------
 Hash Join  (cost=1949.50..2017.08 rows=2550 width=16)
   Hash Cond: (small_table.i = big_table.i)
   ->  Seq Scan on small_table  (cost=0.00..35.50 rows=2550 width=4)
   ->  Hash  (cost=1947.00..1947.00 rows=200 width=12)
         ->  HashAggregate  (cost=1943.00..1945.00 rows=200 width=12)
               Group Key: big_table.i
-> Seq Scan on big_table (cost=0.00..1443.00 rows=100000 width=4)
(7 rows)

For this query, a nested loop join, with an index scan on big_table, would be a better plan. Doing that would require creating parameterized paths, planning the subquery multiple times, with and without the pushed-down join quals.

However, there's one special case: What if the subquery is LATERAL? In that case, the subquery is already parameterized. We could push down join quals, which refer the same "other" relations that are already laterally referenced, without making it any more lateral.

Attached is a patch to do that (the first patch). I'm not too familiar with this code; does it look sane?

One thing that I wasn't clear on, is the intended behavior of ReplaceVarsFromTargetList(). In the patch, when a join qual is pushed down, the Vars in the join clause that refer to the subquery's output, are replaced with the subquery's target list entries. That's the same thing we do when pushing down regular, non-join quals, too. But when pushing down a join qual, the qual will also include Vars for other relations on the same level, not just Vars for the subquery. When the qual is pushed down to the subquery, the Vars referring to other relations need to have their varlevelsup incremented. In the patch, I solved this by first calling IncrementVarSublevelsUp() on the qual, to bump up varlevelsup for all Vars, and then ReplaceVarsFromTargetList(), with sublevels_up = 1. But that way, when ReplaceVarsFromTargetList() replaces Vars with the target list entries, it also adjusts varlevelsup in any Vars in the target list entry. I changed it to not do that anymore. All the existing callers call ReplaceVarsFromTargetList() with sublevels_up == 0, so they shouldn't be affected, but I wonder what the original intention here was?


Attached is also a second, much more work-in-progress patch, that expands the push-down support to non-LATERAL subqueries. Yes, that means that the subquery is planned multiple times. I think that needs some further heuristics, or perhaps we could pass the list of "potential" join clauses to the subquery when it's planned for the first time, and have subquery_planner() tell which of them might be useful, e.g. because there are indexes to back them.

- Heikki
>From d37eb5003b1fcbf25355007821f06c41d23d461c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Wed, 6 Jun 2018 15:58:49 +0300
Subject: [PATCH 1/2] Push down join quals into lateral subqueries.

We don't normally push down join quals into subqueries, because that would
require creating parameterized plans. However, if the plan is already
parameterized because it's LATERAL, we might as well push down any
additional join quals, that refer the same relations that are already
referenced within the subquery.

This changes the behavior of the sublevels_up parameters to
ReplaceVarsFromTargetList(). The targetlist entries used to replace vars are
no longer offset by that amount. I'm not sure what the original thinking
on it was, but all the existing callers passed sublevels_up = 0, so I hope
this is OK..
---
 src/backend/optimizer/path/allpaths.c            | 149 ++++++++++++++++++-----
 src/backend/optimizer/plan/createplan.c          |  30 +++++
 src/backend/optimizer/prep/prepunion.c           |   4 +-
 src/backend/optimizer/util/pathnode.c            |   6 +-
 src/backend/rewrite/rewriteManip.c               |   6 +-
 src/include/nodes/relation.h                     |   1 +
 src/include/optimizer/pathnode.h                 |   2 +-
 src/test/regress/expected/subselect_pushdown.out |  91 ++++++++++++++
 src/test/regress/parallel_schedule               |   4 +-
 src/test/regress/serial_schedule                 |   3 +-
 src/test/regress/sql/subselect_pushdown.sql      |  54 ++++++++
 11 files changed, 310 insertions(+), 40 deletions(-)
 create mode 100644 src/test/regress/expected/subselect_pushdown.out
 create mode 100644 src/test/regress/sql/subselect_pushdown.sql

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 477b9f7fb8..35537272a3 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -131,12 +131,11 @@ static bool targetIsInAllPartitionLists(TargetEntry *tle, Query *query);
 static bool qual_is_pushdown_safe(Query *subquery, Index rti, Node *qual,
 					  pushdown_safety_info *safetyInfo);
 static void subquery_push_qual(Query *subquery,
-				   RangeTblEntry *rte, Index rti, Node *qual);
+				   RangeTblEntry *rte, Index rti, Node *qual, int sublevels_up);
 static void recurse_push_qual(Node *setOp, Query *topquery,
-				  RangeTblEntry *rte, Index rti, Node *qual);
+				  RangeTblEntry *rte, Index rti, Node *qual, int sublevels_up);
 static void remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel);
 
-
 /*
  * make_one_rel
  *	  Finds all possible access paths for executing a query, returning a
@@ -2070,6 +2069,11 @@ has_multiple_baserels(PlannerInfo *root)
  * So the paths made here will be parameterized if the subquery contains
  * LATERAL references, otherwise not.  As long as that's true, there's no need
  * for a separate set_subquery_size phase: just make the paths right away.
+ *
+ * (If a subquery is LATERAL, though, we do push down join clauses that refer
+ * to relations that the subquery already references laterally.  Pushing down
+ * such quals won't make the subquery any more lateral, so there's no reason
+ * not to.)
  */
 static void
 set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -2082,6 +2086,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	double		tuple_fraction;
 	RelOptInfo *sub_final_rel;
 	ListCell   *lc;
+	List	   *pushed_down_ec_joins = NIL;
 
 	/*
 	 * Must copy the Query so that planning doesn't mess up the RTE contents
@@ -2092,8 +2097,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 
 	/*
 	 * If it's a LATERAL subquery, it might contain some Vars of the current
-	 * query level, requiring it to be treated as parameterized, even though
-	 * we don't support pushing down join quals into subqueries.
+	 * query level, requiring it to be treated as parameterized.
 	 */
 	required_outer = rel->lateral_relids;
 
@@ -2131,38 +2135,112 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	 * pseudoconstant clauses; better to have the gating node above the
 	 * subquery.
 	 *
+	 * Join clauses are only pushed down, if the subquery is LATERAL, and
+	 * the join clause only refers to relations that the subquery already
+	 * depends on.  It might be useful to push down other join clauses, too,
+	 * but then we would need to plan the subquery multiple times, to create
+	 * parameterized paths, which seems too expensive.
+	 *
 	 * Non-pushed-down clauses will get evaluated as qpquals of the
 	 * SubqueryScan node.
 	 *
 	 * XXX Are there any cases where we want to make a policy decision not to
 	 * push down a pushable qual, because it'd result in a worse plan?
 	 */
-	if (rel->baserestrictinfo != NIL &&
+	if ((rel->baserestrictinfo != NIL ||
+		 (!bms_is_empty(required_outer) && (rel->joininfo || rel->has_eclass_joins))) &&
 		subquery_is_pushdown_safe(subquery, subquery, &safetyInfo))
 	{
 		/* OK to consider pushing down individual quals */
-		List	   *upperrestrictlist = NIL;
 		ListCell   *l;
+		Bitmapset *available_relids;
 
-		foreach(l, rel->baserestrictinfo)
+		if (rel->baserestrictinfo)
 		{
-			RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
-			Node	   *clause = (Node *) rinfo->clause;
+			List	   *upperrestrictlist = NIL;
 
-			if (!rinfo->pseudoconstant &&
-				qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+			foreach(l, rel->baserestrictinfo)
 			{
-				/* Push it down */
-				subquery_push_qual(subquery, rte, rti, clause);
+				RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
+				Node	   *clause = (Node *) rinfo->clause;
+
+				if (!rinfo->pseudoconstant &&
+					qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+				{
+					/* Push it down */
+					subquery_push_qual(subquery, rte, rti, clause, 0);
+				}
+				else
+				{
+					/* Keep it in the upper query */
+					upperrestrictlist = lappend(upperrestrictlist, rinfo);
+				}
 			}
-			else
+			rel->baserestrictinfo = upperrestrictlist;
+			/* We don't bother recomputing baserestrict_min_security */
+		}
+
+		/*
+		 * Push down join quals, as well.  But only for LATERAL, and only for those
+		 * relations that are "required" anyway.
+		 */
+		if (!bms_is_empty(required_outer))
+		{
+			available_relids = bms_copy(required_outer);
+			available_relids = bms_add_member(available_relids, rti);
+
+			if (rel->joininfo)
 			{
-				/* Keep it in the upper query */
-				upperrestrictlist = lappend(upperrestrictlist, rinfo);
+				ListCell   *lc;
+				List	   *upperjoinlist = NIL;
+
+				foreach(lc, rel->joininfo)
+				{
+					RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+					Node       *clause = (Node *) rinfo->clause;
+
+					if (!rinfo->pseudoconstant &&
+						bms_is_subset(rinfo->required_relids, available_relids) &&
+						qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+					{
+						/* Push it down */
+						subquery_push_qual(subquery, rte, rti, clause, 0);
+					}
+					else
+					{
+						/* Keep it in the upper query */
+						upperjoinlist = lappend(upperjoinlist, rinfo);
+					}
+				}
+				rel->joininfo = upperjoinlist;
+			}
+
+			if (rel->has_eclass_joins)
+			{
+				List	   *clauses;
+
+				clauses = generate_join_implied_equalities(root,
+														   available_relids,
+														   required_outer,
+														   rel);
+
+				foreach(lc, clauses)
+				{
+					RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+					Node       *clause = (Node *) rinfo->clause;
+
+					if (!rinfo->pseudoconstant &&
+						qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+					{
+						/* Push it down */
+						Assert(bms_is_subset(rinfo->required_relids, available_relids));
+						subquery_push_qual(subquery, rte, rti, clause, 0);
+
+						pushed_down_ec_joins = lappend(pushed_down_ec_joins, clause);
+					}
+				}
 			}
 		}
-		rel->baserestrictinfo = upperrestrictlist;
-		/* We don't bother recomputing baserestrict_min_security */
 	}
 
 	pfree(safetyInfo.unsafeColumns);
@@ -2240,7 +2318,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 		/* Generate outer path using this subpath */
 		add_path(rel, (Path *)
 				 create_subqueryscan_path(root, rel, subpath,
-										  pathkeys, required_outer));
+										  pathkeys, required_outer, pushed_down_ec_joins));
 	}
 
 	/* If outer rel allows parallelism, do same for partial paths. */
@@ -2266,7 +2344,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 			add_partial_path(rel, (Path *)
 							 create_subqueryscan_path(root, rel, subpath,
 													  pathkeys,
-													  required_outer));
+													  required_outer, pushed_down_ec_joins));
 		}
 	}
 }
@@ -3174,7 +3252,12 @@ qual_is_pushdown_safe(Query *subquery, Index rti, Node *qual,
 			break;
 		}
 
-		Assert(var->varno == rti);
+		/*
+		 * In a restriction clause, all Vars must refer to subselect output
+		 * columns, but join quals will contain Vars referring to other relations.
+		 */
+		if (var->varno != rti)
+			continue;
 		Assert(var->varattno >= 0);
 
 		/* Check point 4 */
@@ -3201,13 +3284,13 @@ qual_is_pushdown_safe(Query *subquery, Index rti, Node *qual,
  * subquery_push_qual - push down a qual that we have determined is safe
  */
 static void
-subquery_push_qual(Query *subquery, RangeTblEntry *rte, Index rti, Node *qual)
+subquery_push_qual(Query *subquery, RangeTblEntry *rte, Index rti, Node *qual, int sublevels_up)
 {
 	if (subquery->setOperations != NULL)
 	{
 		/* Recurse to push it separately to each component query */
 		recurse_push_qual(subquery->setOperations, subquery,
-						  rte, rti, qual);
+						  rte, rti, qual, sublevels_up);
 	}
 	else
 	{
@@ -3215,12 +3298,18 @@ subquery_push_qual(Query *subquery, RangeTblEntry *rte, Index rti, Node *qual)
 		 * We need to replace Vars in the qual (which must refer to outputs of
 		 * the subquery) with copies of the subquery's targetlist expressions.
 		 * Note that at this point, any uplevel Vars in the qual should have
-		 * been replaced with Params, so they need no work.
+		 * been replaced with Params, so they need no work.  But in a join qual,
+		 * there can be Vars referring to other relations at the same level.
+		 * We need to increment varlevelsup of those, so that when the qual is
+		 * pushed down, they refer to the parent query.
 		 *
 		 * This step also ensures that when we are pushing into a setop tree,
 		 * each component query gets its own copy of the qual.
 		 */
-		qual = ReplaceVarsFromTargetList(qual, rti, 0, rte,
+		qual = copyObject(qual);
+		IncrementVarSublevelsUp(qual, sublevels_up + 1, 0);
+
+		qual = ReplaceVarsFromTargetList(qual, rti, sublevels_up + 1, rte,
 										 subquery->targetList,
 										 REPLACEVARS_REPORT_ERROR, 0,
 										 &subquery->hasSubLinks);
@@ -3249,7 +3338,7 @@ subquery_push_qual(Query *subquery, RangeTblEntry *rte, Index rti, Node *qual)
  */
 static void
 recurse_push_qual(Node *setOp, Query *topquery,
-				  RangeTblEntry *rte, Index rti, Node *qual)
+				  RangeTblEntry *rte, Index rti, Node *qual, int sublevels_up)
 {
 	if (IsA(setOp, RangeTblRef))
 	{
@@ -3258,14 +3347,14 @@ recurse_push_qual(Node *setOp, Query *topquery,
 		Query	   *subquery = subrte->subquery;
 
 		Assert(subquery != NULL);
-		subquery_push_qual(subquery, rte, rti, qual);
+		subquery_push_qual(subquery, rte, rti, qual, sublevels_up + 1);
 	}
 	else if (IsA(setOp, SetOperationStmt))
 	{
 		SetOperationStmt *op = (SetOperationStmt *) setOp;
 
-		recurse_push_qual(op->larg, topquery, rte, rti, qual);
-		recurse_push_qual(op->rarg, topquery, rte, rti, qual);
+		recurse_push_qual(op->larg, topquery, rte, rti, qual, sublevels_up);
+		recurse_push_qual(op->rarg, topquery, rte, rti, qual, sublevels_up);
 	}
 	else
 	{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ca2e0527db..d34c1bf128 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -3186,6 +3186,9 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 	RelOptInfo *rel = best_path->path.parent;
 	Index		scan_relid = rel->relid;
 	Plan	   *subplan;
+	ListCell   *l;
+	List	   *qpqual;
+	List	   *sq_quals = best_path->pushed_down_ec_joins;
 
 	/* it should be a subquery base rel... */
 	Assert(scan_relid > 0);
@@ -3198,6 +3201,33 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 	 */
 	subplan = create_plan(rel->subroot, best_path->subpath);
 
+	/*
+	 * If we had pushed down any join clauses to the subquery, we don't need
+	 * to re-check them in the SubqueryScan node.
+	 *
+	 * This only applies to join clauses derived from equivalence classes.
+	 * Non-join quals, and non-EC-derived join clauses are immediately removed
+	 * from 'baserestrictinfo' and 'joininfo' when they're pushed down, so we
+	 * won't need to worry about them here.
+	 */
+	qpqual = NIL;
+	foreach (l, scan_clauses)
+	{
+		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+
+		if (rinfo->pseudoconstant)
+			continue;			/* we may drop pseudoconstants here */
+		if (list_member_ptr(sq_quals, rinfo))
+			continue;			/* simple duplicate */
+		if (is_redundant_derived_clause(rinfo, sq_quals))
+			continue;			/* derived from same EquivalenceClass */
+		if (!contain_mutable_functions((Node *) rinfo->clause) &&
+			predicate_implied_by(list_make1(rinfo->clause), sq_quals, false))
+			continue;			/* provably implied by indexquals */
+		qpqual = lappend(qpqual, rinfo);
+	}
+	scan_clauses = qpqual;
+
 	/* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
 
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 0ab4014be6..8f424031be 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -321,7 +321,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 * soon too, likely.)
 		 */
 		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 NIL, NULL);
+												 NIL, NULL, NIL);
 
 		add_path(rel, path);
 
@@ -339,7 +339,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			partial_subpath = linitial(final_rel->partial_pathlist);
 			partial_path = (Path *)
 				create_subqueryscan_path(root, rel, partial_subpath,
-										 NIL, NULL);
+										 NIL, NULL, NIL);
 			add_partial_path(rel, partial_path);
 		}
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e190ad49d1..e4909c1855 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1869,7 +1869,7 @@ create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
  */
 SubqueryScanPath *
 create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
-						 List *pathkeys, Relids required_outer)
+						 List *pathkeys, Relids required_outer, List *pushed_down_ec_joins)
 {
 	SubqueryScanPath *pathnode = makeNode(SubqueryScanPath);
 
@@ -1884,6 +1884,7 @@ create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->subpath = subpath;
+	pathnode->pushed_down_ec_joins = pushed_down_ec_joins;
 
 	cost_subqueryscan(pathnode, root, rel, pathnode->path.param_info);
 
@@ -3562,7 +3563,8 @@ reparameterize_path(PlannerInfo *root, Path *path,
 														 rel,
 														 spath->subpath,
 														 spath->path.pathkeys,
-														 required_outer);
+														 required_outer,
+														 spath->pushed_down_ec_joins);
 			}
 		case T_Append:
 			{
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index f1f4212b5d..5c38e18eb0 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -1400,6 +1400,7 @@ typedef struct
 	List	   *targetlist;
 	ReplaceVarsNoMatchOption nomatch_option;
 	int			nomatch_varno;
+	int			min_sublevels_up;
 } ReplaceVarsFromTargetList_context;
 
 static Node *
@@ -1484,8 +1485,8 @@ ReplaceVarsFromTargetList_callback(Var *var,
 		Expr	   *newnode = copyObject(tle->expr);
 
 		/* Must adjust varlevelsup if tlist item is from higher query */
-		if (var->varlevelsup > 0)
-			IncrementVarSublevelsUp((Node *) newnode, var->varlevelsup, 0);
+		if (var->varlevelsup + rcon->min_sublevels_up > 0)
+			IncrementVarSublevelsUp((Node *) newnode, var->varlevelsup - rcon->min_sublevels_up, 0);
 
 		/*
 		 * Check to see if the tlist item contains a PARAM_MULTIEXPR Param,
@@ -1521,6 +1522,7 @@ ReplaceVarsFromTargetList(Node *node,
 	context.targetlist = targetlist;
 	context.nomatch_option = nomatch_option;
 	context.nomatch_varno = nomatch_varno;
+	context.min_sublevels_up = sublevels_up;
 
 	return replace_rte_variables(node, target_varno, sublevels_up,
 								 ReplaceVarsFromTargetList_callback,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 3b28d1994f..de942b4e40 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1237,6 +1237,7 @@ typedef struct SubqueryScanPath
 {
 	Path		path;
 	Path	   *subpath;		/* path representing subquery execution */
+	List	   *pushed_down_ec_joins; /* pushed-down quals derived from ECs */
 } SubqueryScanPath;
 
 /*
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index e99ae36bef..848839ed15 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -92,7 +92,7 @@ extern GatherMergePath *create_gather_merge_path(PlannerInfo *root,
 						 double *rows);
 extern SubqueryScanPath *create_subqueryscan_path(PlannerInfo *root,
 						 RelOptInfo *rel, Path *subpath,
-						 List *pathkeys, Relids required_outer);
+						 List *pathkeys, Relids required_outer, List *pushed_down_ec_joins);
 extern Path *create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
 						 List *pathkeys, Relids required_outer);
 extern Path *create_tablexprscan_path(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/test/regress/expected/subselect_pushdown.out b/src/test/regress/expected/subselect_pushdown.out
new file mode 100644
index 0000000000..f160bc84c8
--- /dev/null
+++ b/src/test/regress/expected/subselect_pushdown.out
@@ -0,0 +1,91 @@
+-- Test pushdown of quals into subqueries.
+create table smalltab (i int4, j int4);
+create table bigtab (i int4, j int4);
+insert into smalltab values (1, 1), (100000, 100000);
+insert into bigtab select g,g from generate_series(1, 100000) g;
+analyze smalltab, bigtab;
+create index bigtab_i on bigtab (i);
+-- Push down restriction quals.
+explain (costs off)
+select * from smalltab,
+(
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i = subq.i and smalltab.i = 123;
+                   QUERY PLAN                    
+-------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on smalltab
+         Filter: (i = 123)
+   ->  GroupAggregate
+         Group Key: bigtab.i
+         ->  Index Scan using bigtab_i on bigtab
+               Index Cond: (i = 123)
+(7 rows)
+
+-- Join quals are not currently pushed down
+explain (costs off)
+select * from smalltab,
+(
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i = subq.i;
+                   QUERY PLAN                    
+-------------------------------------------------
+ Merge Join
+   Merge Cond: (smalltab.i = bigtab.i)
+   ->  Sort
+         Sort Key: smalltab.i
+         ->  Seq Scan on smalltab
+   ->  GroupAggregate
+         Group Key: bigtab.i
+         ->  Index Scan using bigtab_i on bigtab
+(8 rows)
+
+-- Except when the subquery is LATERAL, and already references the other relation.
+-- Such join clauses can be pushed down.
+explain (costs off)
+select * from smalltab,
+lateral (
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  where bigtab.j = smalltab.j
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i < subq.i;
+                   QUERY PLAN                    
+-------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on smalltab
+   ->  GroupAggregate
+         Group Key: bigtab.i
+         ->  Index Scan using bigtab_i on bigtab
+               Index Cond: (smalltab.i < i)
+               Filter: (j = smalltab.j)
+(7 rows)
+
+-- Multiple join clauses constructed from equivalence classes
+explain (costs off)
+select * from smalltab,
+lateral (
+  select bigtab.i, bigtab.j, avg(bigtab.j)
+  from bigtab
+  where bigtab.j/2 = smalltab.j / 2
+  group by bigtab.i, bigtab.j
+) as subq(i, j, avg)
+where smalltab.i = subq.i and smalltab.j = subq.j;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on smalltab
+   ->  GroupAggregate
+         Group Key: bigtab.i, bigtab.j
+         ->  Index Scan using bigtab_i on bigtab
+               Index Cond: (smalltab.i = i)
+               Filter: ((smalltab.j = j) AND ((j / 2) = (smalltab.j / 2)))
+(7 rows)
+
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 16f979c8d9..da3c3b9382 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -79,7 +79,7 @@ ignore: random
 # ----------
 # Another group of parallel tests
 # ----------
-test: select_into select_distinct select_distinct_on select_implicit select_having subselect union case join aggregates transactions random portals arrays btree_index hash_index update namespace prepared_xacts delete
+test: select_into select_distinct select_distinct_on select_implicit select_having union case join aggregates transactions random portals arrays btree_index hash_index update namespace prepared_xacts delete
 
 # ----------
 # Another group of parallel tests
@@ -89,7 +89,7 @@ test: brin gin gist spgist privileges init_privs security_label collate matview
 # ----------
 # Another group of parallel tests
 # ----------
-test: alter_generic alter_operator misc psql async dbsize misc_functions sysviews tsrf tidscan stats_ext
+test: alter_generic alter_operator misc psql async dbsize misc_functions sysviews tsrf tidscan stats_ext subselect subselect_pushdown
 
 # rules cannot run concurrently with any test that creates a view
 test: rules psql_crosstab amutils
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 42632be675..f6ab5f5bac 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -90,7 +90,6 @@ test: select_distinct
 test: select_distinct_on
 test: select_implicit
 test: select_having
-test: subselect
 test: union
 test: case
 test: join
@@ -135,6 +134,8 @@ test: sysviews
 test: tsrf
 test: tidscan
 test: stats_ext
+test: subselect
+test: subselect_pushdown
 test: rules
 test: psql_crosstab
 test: select_parallel
diff --git a/src/test/regress/sql/subselect_pushdown.sql b/src/test/regress/sql/subselect_pushdown.sql
new file mode 100644
index 0000000000..c9e7a43794
--- /dev/null
+++ b/src/test/regress/sql/subselect_pushdown.sql
@@ -0,0 +1,54 @@
+-- Test pushdown of quals into subqueries.
+
+create table smalltab (i int4, j int4);
+create table bigtab (i int4, j int4);
+
+insert into smalltab values (1, 1), (100000, 100000);
+insert into bigtab select g,g from generate_series(1, 100000) g;
+
+analyze smalltab, bigtab;
+
+create index bigtab_i on bigtab (i);
+
+-- Push down restriction quals.
+explain (costs off)
+select * from smalltab,
+(
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i = subq.i and smalltab.i = 123;
+
+-- Join quals are not currently pushed down
+explain (costs off)
+select * from smalltab,
+(
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i = subq.i;
+
+-- Except when the subquery is LATERAL, and already references the other relation.
+-- Such join clauses can be pushed down.
+explain (costs off)
+select * from smalltab,
+lateral (
+  select bigtab.i, avg(bigtab.j)
+  from bigtab
+  where bigtab.j = smalltab.j
+  group by bigtab.i
+) as subq(i, avg)
+where smalltab.i < subq.i;
+
+-- Multiple join clauses constructed from equivalence classes
+explain (costs off)
+select * from smalltab,
+lateral (
+  select bigtab.i, bigtab.j, avg(bigtab.j)
+  from bigtab
+  where bigtab.j/2 = smalltab.j / 2
+  group by bigtab.i, bigtab.j
+) as subq(i, j, avg)
+where smalltab.i = subq.i and smalltab.j = subq.j;
-- 
2.11.0

>From f2f5a2fd92ed9cfbabc3bcb67cb1a3c5ae99bc7c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Wed, 6 Jun 2018 20:07:31 +0300
Subject: [PATCH 2/2] WIP: Allow pushing join quals down into subqueries.

Plan subqueries a second time, to create parameterized plans with join
quals.
---
 src/backend/nodes/outfuncs.c                     |   2 -
 src/backend/optimizer/path/allpaths.c            | 143 ++++++++++++++++++++---
 src/backend/optimizer/path/costsize.c            |   3 +-
 src/backend/optimizer/plan/createplan.c          |  30 ++++-
 src/backend/optimizer/plan/setrefs.c             |   8 +-
 src/backend/optimizer/plan/subselect.c           |   4 +-
 src/backend/optimizer/prep/prepunion.c           |  22 ++--
 src/backend/optimizer/util/pathnode.c            |  16 ++-
 src/backend/optimizer/util/relnode.c             |   7 +-
 src/backend/utils/adt/selfuncs.c                 |  25 +++-
 src/include/nodes/relation.h                     |   8 +-
 src/include/optimizer/cost.h                     |   2 +-
 src/include/optimizer/pathnode.h                 |   7 +-
 src/test/regress/expected/subselect_pushdown.out |  17 ++-
 src/test/regress/sql/subselect_pushdown.sql      |   7 +-
 15 files changed, 227 insertions(+), 74 deletions(-)

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 610f9edaf5..0bc7855d0d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2340,8 +2340,6 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_UINT_FIELD(pages);
 	WRITE_FLOAT_FIELD(tuples, "%.0f");
 	WRITE_FLOAT_FIELD(allvisfrac, "%.6f");
-	WRITE_NODE_FIELD(subroot);
-	WRITE_NODE_FIELD(subplan_params);
 	WRITE_INT_FIELD(rel_parallel_workers);
 	WRITE_OID_FIELD(serverid);
 	WRITE_OID_FIELD(userid);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 35537272a3..bdcdc30985 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -106,6 +106,11 @@ static void accumulate_append_subpath(Path *path,
 						  List **subpaths, List **special_subpaths);
 static void set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 					  Index rti, RangeTblEntry *rte);
+static void add_subqueryscan_variant(PlannerInfo *root, RelOptInfo *rel,
+						 Index rti, RangeTblEntry *rte,
+						 Bitmapset *required_outer,
+						 Query *subquery, List *pushed_down_clauses, double tuple_fraction,
+						 bool update_estimates);
 static void set_function_pathlist(PlannerInfo *root, RelOptInfo *rel,
 					  RangeTblEntry *rte);
 static void set_values_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -2080,13 +2085,13 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 					  Index rti, RangeTblEntry *rte)
 {
 	Query	   *parse = root->parse;
+	Query	   *unparameterized_subquery;
 	Query	   *subquery = rte->subquery;
 	Relids		required_outer;
 	pushdown_safety_info safetyInfo;
 	double		tuple_fraction;
-	RelOptInfo *sub_final_rel;
-	ListCell   *lc;
 	List	   *pushed_down_ec_joins = NIL;
+	bool		sq_is_pushdown_safe;
 
 	/*
 	 * Must copy the Query so that planning doesn't mess up the RTE contents
@@ -2147,9 +2152,10 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	 * XXX Are there any cases where we want to make a policy decision not to
 	 * push down a pushable qual, because it'd result in a worse plan?
 	 */
-	if ((rel->baserestrictinfo != NIL ||
-		 (!bms_is_empty(required_outer) && (rel->joininfo || rel->has_eclass_joins))) &&
-		subquery_is_pushdown_safe(subquery, subquery, &safetyInfo))
+	sq_is_pushdown_safe = subquery_is_pushdown_safe(subquery, subquery, &safetyInfo);
+	if (sq_is_pushdown_safe &&
+		(rel->baserestrictinfo != NIL ||
+		 (!bms_is_empty(required_outer) && (rel->joininfo || rel->has_eclass_joins))))
 	{
 		/* OK to consider pushing down individual quals */
 		ListCell   *l;
@@ -2218,6 +2224,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 			if (rel->has_eclass_joins)
 			{
 				List	   *clauses;
+				ListCell *lc;
 
 				clauses = generate_join_implied_equalities(root,
 														   available_relids,
@@ -2243,8 +2250,6 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 		}
 	}
 
-	pfree(safetyInfo.unsafeColumns);
-
 	/*
 	 * The upper query might not use all the subquery's output columns; if
 	 * not, we can simplify.
@@ -2268,16 +2273,112 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	else
 		tuple_fraction = root->tuple_fraction;
 
+	unparameterized_subquery = copyObject(subquery);
+
+	add_subqueryscan_variant(root, rel, rti, rte,
+							 required_outer, subquery, pushed_down_ec_joins, tuple_fraction, true);
+
+	/*
+	 * Also create parameterized join paths, where we push the join condition
+	 * down to the subquery.
+	 *
+	 * To keep the planning time reasonable, this is all-or-nothing. We try to
+	 * push all join conditions down to the subquery, and create paths for that.
+	 * We don't create paths for every combination of join conditions that we
+	 * could push down.
+	 */
+	if ((rel->has_eclass_joins || rel->joininfo) &&
+		sq_is_pushdown_safe)
+	{
+		List	   *clauses;
+		ListCell   *lc;
+		List	   *pushed_down_clauses = list_copy(pushed_down_ec_joins);
+		Bitmapset  *available_relids;
+		Bitmapset  *other_relids;
+
+		subquery = copyObject(unparameterized_subquery);
+
+		required_outer = bms_copy(required_outer);
+
+		foreach(lc, rel->joininfo)
+		{
+			RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+			Node	   *clause = (Node *) rinfo->clause;
+
+			if (!rinfo->pseudoconstant &&
+				qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+			{
+				/* Push it down */
+				required_outer = bms_union(required_outer,
+										   pull_varnos(clause));
+				required_outer = bms_del_member(required_outer, rti);
+
+				subquery_push_qual(subquery, rte, rti, clause, 0);
+
+				pushed_down_clauses = lappend(pushed_down_clauses, rinfo);
+			}
+		}
+
+		/*
+		 * We already pushed down any join quals with LATERAL referenced rels, don't add
+		 * them again.
+		 */
+		available_relids = bms_difference(root->all_baserels, rel->lateral_referencers);
+		other_relids = bms_del_member(bms_copy(available_relids), rti);
+
+		clauses = generate_join_implied_equalities(root,
+												   available_relids,
+												   other_relids,
+												   rel);
+		foreach(lc, clauses)
+		{
+			RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+			Node	   *clause = (Node *) rinfo->clause;
+
+			if (!rinfo->pseudoconstant &&
+				qual_is_pushdown_safe(subquery, rti, clause, &safetyInfo))
+			{
+				/* Push it down */
+				required_outer = bms_union(required_outer,
+										   pull_varnos(clause));
+				required_outer = bms_del_member(required_outer, rti);
+
+				subquery_push_qual(subquery, rte, rti, clause, 0);
+
+				pushed_down_clauses = lappend(pushed_down_clauses, rinfo);
+			}
+		}
+		if (pushed_down_clauses)
+			add_subqueryscan_variant(root, rel, rti, rte,
+									 required_outer,
+									 subquery, pushed_down_clauses, tuple_fraction, false);
+	}
+
+	pfree(safetyInfo.unsafeColumns);
+}
+
+static void
+add_subqueryscan_variant(PlannerInfo *root, RelOptInfo *rel,
+						 Index rti, RangeTblEntry *rte,
+						 Bitmapset *required_outer,
+						 Query *subquery, List *pushed_down_clauses, double tuple_fraction,
+						 bool update_estimates)
+{
+	RelOptInfo *sub_final_rel;
+	ListCell   *lc;
+	PlannerInfo *subroot;
+	List	   *subplan_params;
+
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
 	/* Generate a subroot and Paths for the subquery */
-	rel->subroot = subquery_planner(root->glob, subquery,
-									root,
-									false, tuple_fraction);
+	subroot = subquery_planner(root->glob, subquery,
+							   root,
+							   false, tuple_fraction);
 
 	/* Isolate the params needed by this specific subplan */
-	rel->subplan_params = root->plan_params;
+	subplan_params = root->plan_params;
 	root->plan_params = NIL;
 
 	/*
@@ -2285,7 +2386,7 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	 * so, it's desirable to produce an unadorned dummy path so that we will
 	 * recognize appropriate optimizations at this query level.
 	 */
-	sub_final_rel = fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
+	sub_final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 
 	if (IS_DUMMY_REL(sub_final_rel))
 	{
@@ -2297,8 +2398,13 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	 * Mark rel with estimated output rows, width, etc.  Note that we have to
 	 * do this before generating outer-query paths, else cost_subqueryscan is
 	 * not happy.
+	 *
+	 * Don't overwrite the estimates when we're creating parameterized paths
+	 * for joins. The estimate for a parameterized path includes the effects
+	 * of the join clauses.
 	 */
-	set_subquery_size_estimates(root, rel);
+	if (update_estimates)
+		set_subquery_size_estimates(root, rel, subroot);
 
 	/*
 	 * For each Path that subquery_planner produced, make a SubqueryScanPath
@@ -2317,8 +2423,8 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 
 		/* Generate outer path using this subpath */
 		add_path(rel, (Path *)
-				 create_subqueryscan_path(root, rel, subpath,
-										  pathkeys, required_outer, pushed_down_ec_joins));
+				 create_subqueryscan_path(root, rel, subroot, subplan_params, subpath,
+										  pathkeys, required_outer, pushed_down_clauses));
 	}
 
 	/* If outer rel allows parallelism, do same for partial paths. */
@@ -2342,9 +2448,9 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 
 			/* Generate outer path using this subpath */
 			add_partial_path(rel, (Path *)
-							 create_subqueryscan_path(root, rel, subpath,
+							 create_subqueryscan_path(root, rel, subroot, subplan_params, subpath,
 													  pathkeys,
-													  required_outer, pushed_down_ec_joins));
+													  required_outer, pushed_down_clauses));
 		}
 	}
 }
@@ -3231,8 +3337,7 @@ qual_is_pushdown_safe(Query *subquery, Index rti, Node *qual,
 	Assert(!contain_window_function(qual));
 
 	/*
-	 * Examine all Vars used in clause; since it's a restriction clause, all
-	 * such Vars must refer to subselect output columns.
+	 * Examine all Vars used in clause.
 	 */
 	vars = pull_var_clause(qual, PVC_INCLUDE_PLACEHOLDERS);
 	foreach(vl, vars)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index a2a7e0c520..5f93328e22 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4807,9 +4807,8 @@ get_foreign_key_join_selectivity(PlannerInfo *root,
  * We set the same fields as set_baserel_size_estimates.
  */
 void
-set_subquery_size_estimates(PlannerInfo *root, RelOptInfo *rel)
+set_subquery_size_estimates(PlannerInfo *root, RelOptInfo *rel, PlannerInfo *subroot)
 {
-	PlannerInfo *subroot = rel->subroot;
 	RelOptInfo *sub_final_rel;
 	ListCell   *lc;
 
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index d34c1bf128..8261c3e17c 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -3188,18 +3188,40 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 	Plan	   *subplan;
 	ListCell   *l;
 	List	   *qpqual;
-	List	   *sq_quals = best_path->pushed_down_ec_joins;
+	List	   *sq_quals = best_path->pushed_down_clauses;
 
 	/* it should be a subquery base rel... */
 	Assert(scan_relid > 0);
 	Assert(rel->rtekind == RTE_SUBQUERY);
+	Assert(rel->chosen_plan == NULL);
 
 	/*
 	 * Recursively create Plan from Path for subquery.  Since we are entering
 	 * a different planner context (subroot), recurse to create_plan not
 	 * create_plan_recurse.
 	 */
-	subplan = create_plan(rel->subroot, best_path->subpath);
+	subplan = create_plan(best_path->subroot, best_path->subpath);
+
+	/*
+	 * If this path used join quals that were pushed down to the subquery,
+	 * we don't need to re-check those quals on the SubqueryScan node itself.
+	 */
+	if (best_path->pushed_down_clauses)
+	{
+		List	   *new_clauses = NIL;
+		ListCell   *l;
+
+		foreach(l, scan_clauses)
+		{
+			RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
+
+			if (list_member_ptr(best_path->pushed_down_clauses, rinfo))
+				continue;
+
+			new_clauses = lappend(new_clauses, rinfo);
+		}
+		scan_clauses = new_clauses;
+	}
 
 	/*
 	 * If we had pushed down any join clauses to the subquery, we don't need
@@ -3240,7 +3262,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 		scan_clauses = (List *)
 			replace_nestloop_params(root, (Node *) scan_clauses);
 		process_subquery_nestloop_params(root,
-										 rel->subplan_params);
+										 best_path->subplan_params);
 	}
 
 	scan_plan = make_subqueryscan(tlist,
@@ -3250,6 +3272,8 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
 
+	rel->chosen_plan = best_path->subroot;
+
 	return scan_plan;
 }
 
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c..29b26c086a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -323,12 +323,12 @@ add_rtes_to_flat_rtable(PlannerInfo *root, bool recursing)
 				 * that some upper query level is treating this one as dummy,
 				 * and so we won't scan this level's plan tree at all.
 				 */
-				if (rel->subroot == NULL)
+				if (rel->chosen_plan == NULL)
 					flatten_unplanned_rtes(glob, rte);
 				else if (recursing ||
-						 IS_DUMMY_REL(fetch_upper_rel(rel->subroot,
+						 IS_DUMMY_REL(fetch_upper_rel(rel->chosen_plan,
 													  UPPERREL_FINAL, NULL)))
-					add_rtes_to_flat_rtable(rel->subroot, true);
+					add_rtes_to_flat_rtable(rel->chosen_plan, true);
 			}
 		}
 		rti++;
@@ -1070,7 +1070,7 @@ set_subqueryscan_references(PlannerInfo *root,
 	rel = find_base_rel(root, plan->scan.scanrelid);
 
 	/* Recursively process the subplan */
-	plan->subplan = set_plan_references(rel->subroot, plan->subplan);
+	plan->subplan = set_plan_references(rel->chosen_plan, plan->subplan);
 
 	if (trivial_subqueryscan(plan))
 	{
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 83008d7661..f756febe03 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2418,11 +2418,11 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 
 				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				subquery_params = rel->subroot->outer_params;
+				subquery_params = rel->chosen_plan->outer_params;
 				if (gather_param >= 0)
 					subquery_params = bms_add_member(bms_copy(subquery_params),
 													 gather_param);
-				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+				finalize_plan(rel->chosen_plan, sscan->subplan, gather_param,
 							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 8f424031be..8bf864f768 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -266,10 +266,10 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		Assert(root->plan_params == NIL);
 
 		/* Generate a subroot and Paths for the subquery */
-		subroot = rel->subroot = subquery_planner(root->glob, subquery,
-												  root,
-												  false,
-												  root->tuple_fraction);
+		subroot = subquery_planner(root->glob, subquery,
+								   root,
+								   false,
+								   root->tuple_fraction);
 
 		/*
 		 * It should not be possible for the primitive query to contain any
@@ -295,7 +295,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 * to do this before generating outer-query paths, else
 		 * cost_subqueryscan is not happy.
 		 */
-		set_subquery_size_estimates(root, rel);
+		set_subquery_size_estimates(root, rel, subroot);
 
 		/*
 		 * Since we may want to add a partial path to this relation, we must
@@ -320,8 +320,12 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 * the SubqueryScanPath with nil pathkeys.  (XXX that should change
 		 * soon too, likely.)
 		 */
-		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 NIL, NULL, NIL);
+		path = (Path *) create_subqueryscan_path(root, rel,
+												 subroot,
+												 NIL,
+												 subpath,
+												 NIL,
+												 NULL, NIL);
 
 		add_path(rel, path);
 
@@ -338,8 +342,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 			partial_subpath = linitial(final_rel->partial_pathlist);
 			partial_path = (Path *)
-				create_subqueryscan_path(root, rel, partial_subpath,
-										 NIL, NULL, NIL);
+				create_subqueryscan_path(root, rel, subroot, NIL, partial_subpath, NIL,
+										 NULL, NIL);
 			add_partial_path(rel, partial_path);
 		}
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e4909c1855..748e0120f1 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1868,8 +1868,12 @@ create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
  *	  returning the pathnode.
  */
 SubqueryScanPath *
-create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
-						 List *pathkeys, Relids required_outer, List *pushed_down_ec_joins)
+create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel,
+						 PlannerInfo *subroot,
+						 List *subplan_params,
+						 Path *subpath,
+						 List *pathkeys, Relids required_outer,
+						 List *pushed_down_clauses)
 {
 	SubqueryScanPath *pathnode = makeNode(SubqueryScanPath);
 
@@ -1884,7 +1888,9 @@ create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->subpath = subpath;
-	pathnode->pushed_down_ec_joins = pushed_down_ec_joins;
+	pathnode->subplan_params = subplan_params;
+	pathnode->subroot = subroot;
+	pathnode->pushed_down_clauses = pushed_down_clauses;
 
 	cost_subqueryscan(pathnode, root, rel, pathnode->path.param_info);
 
@@ -3561,10 +3567,12 @@ reparameterize_path(PlannerInfo *root, Path *path,
 
 				return (Path *) create_subqueryscan_path(root,
 														 rel,
+														 spath->subroot,
+														 spath->subplan_params,
 														 spath->subpath,
 														 spath->path.pathkeys,
 														 required_outer,
-														 spath->pushed_down_ec_joins);
+														 spath->pushed_down_clauses);
 			}
 		case T_Append:
 			{
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 82b78420e7..b1b89310cb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -135,8 +135,6 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->pages = 0;
 	rel->tuples = 0;
 	rel->allvisfrac = 0;
-	rel->subroot = NULL;
-	rel->subplan_params = NIL;
 	rel->rel_parallel_workers = -1; /* set up in get_relation_info */
 	rel->serverid = InvalidOid;
 	rel->userid = rte->checkAsUser;
@@ -549,8 +547,6 @@ build_join_rel(PlannerInfo *root,
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
-	joinrel->subroot = NULL;
-	joinrel->subplan_params = NIL;
 	joinrel->rel_parallel_workers = -1;
 	joinrel->serverid = InvalidOid;
 	joinrel->userid = InvalidOid;
@@ -724,8 +720,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->pages = 0;
 	joinrel->tuples = 0;
 	joinrel->allvisfrac = 0;
-	joinrel->subroot = NULL;
-	joinrel->subplan_params = NIL;
+	joinrel->chosen_plan = NULL;
 	joinrel->serverid = InvalidOid;
 	joinrel->userid = InvalidOid;
 	joinrel->useridiscurrent = false;
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4b08cdb721..a813da2b00 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5037,6 +5037,7 @@ examine_simple_variable(PlannerInfo *root, Var *var,
 		Query	   *subquery = rte->subquery;
 		RelOptInfo *rel;
 		TargetEntry *ste;
+		PlannerInfo *subroot;
 
 		/*
 		 * Punt if it's a whole-row var rather than a plain column reference.
@@ -5065,10 +5066,24 @@ examine_simple_variable(PlannerInfo *root, Var *var,
 		 */
 		rel = find_base_rel(root, var->varno);
 
-		/* If the subquery hasn't been planned yet, we have to punt */
-		if (rel->subroot == NULL)
+		if (rel->chosen_plan)
+			subroot = rel->chosen_plan;
+		else if (rel->pathlist && IsA(linitial(rel->pathlist), SubqueryScanPath))
+		{
+			/*
+			 * Use the estimates from the first path. XXX: what if it's a parameterized
+			 * path?
+			 */
+			SubqueryScanPath *sqpath = (SubqueryScanPath *) linitial(rel->pathlist);
+
+			subroot = sqpath->subroot;
+		}
+		else
+		{
+			/* If the subquery hasn't been planned yet, we have to punt */
 			return;
-		Assert(IsA(rel->subroot, PlannerInfo));
+		}
+		Assert(IsA(subroot, PlannerInfo));
 
 		/*
 		 * Switch our attention to the subquery as mangled by the planner. It
@@ -5078,7 +5093,7 @@ examine_simple_variable(PlannerInfo *root, Var *var,
 		 * planning, Vars in the targetlist might have gotten replaced, and we
 		 * need to see the replacement expressions.
 		 */
-		subquery = rel->subroot->parse;
+		subquery = subroot->parse;
 		Assert(IsA(subquery, Query));
 
 		/* Get the subquery output expression referenced by the upper Var */
@@ -5130,7 +5145,7 @@ examine_simple_variable(PlannerInfo *root, Var *var,
 			 * if the underlying column is unique, the subquery may have
 			 * joined to other tables in a way that creates duplicates.
 			 */
-			examine_simple_variable(rel->subroot, var, vardata);
+			examine_simple_variable(subroot, var, vardata);
 		}
 	}
 	else
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index de942b4e40..aca74a4ece 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -651,8 +651,7 @@ typedef struct RelOptInfo
 	BlockNumber pages;			/* size estimates derived from pg_class */
 	double		tuples;
 	double		allvisfrac;
-	PlannerInfo *subroot;		/* if subquery */
-	List	   *subplan_params; /* if subquery */
+	PlannerInfo *chosen_plan;
 	int			rel_parallel_workers;	/* wanted number of parallel workers */
 
 	/* Information about foreign tables and foreign joins */
@@ -1237,7 +1236,10 @@ typedef struct SubqueryScanPath
 {
 	Path		path;
 	Path	   *subpath;		/* path representing subquery execution */
-	List	   *pushed_down_ec_joins; /* pushed-down quals derived from ECs */
+	PlannerInfo *subroot;		/* */
+	List	   *subplan_params; /* */
+
+	List	   *pushed_down_clauses; /* pushed-down join quals */
 } SubqueryScanPath;
 
 /*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 77ca7ff837..82bb0f51a3 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -189,7 +189,7 @@ extern void set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
 						   RelOptInfo *inner_rel,
 						   SpecialJoinInfo *sjinfo,
 						   List *restrictlist);
-extern void set_subquery_size_estimates(PlannerInfo *root, RelOptInfo *rel);
+extern void set_subquery_size_estimates(PlannerInfo *root, RelOptInfo *rel, PlannerInfo *subroot);
 extern void set_function_size_estimates(PlannerInfo *root, RelOptInfo *rel);
 extern void set_values_size_estimates(PlannerInfo *root, RelOptInfo *rel);
 extern void set_cte_size_estimates(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 848839ed15..596815ecfc 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -91,8 +91,11 @@ extern GatherMergePath *create_gather_merge_path(PlannerInfo *root,
 						 Relids required_outer,
 						 double *rows);
 extern SubqueryScanPath *create_subqueryscan_path(PlannerInfo *root,
-						 RelOptInfo *rel, Path *subpath,
-						 List *pathkeys, Relids required_outer, List *pushed_down_ec_joins);
+						 RelOptInfo *rel,
+						 PlannerInfo *subroot,
+						 List *subplan_params,
+						 Path *subpath,
+						 List *pathkeys, Relids required_outer, List *pushed_down_clauses);
 extern Path *create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
 						 List *pathkeys, Relids required_outer);
 extern Path *create_tablexprscan_path(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/test/regress/expected/subselect_pushdown.out b/src/test/regress/expected/subselect_pushdown.out
index f160bc84c8..792f66e7fd 100644
--- a/src/test/regress/expected/subselect_pushdown.out
+++ b/src/test/regress/expected/subselect_pushdown.out
@@ -25,7 +25,7 @@ where smalltab.i = subq.i and smalltab.i = 123;
                Index Cond: (i = 123)
 (7 rows)
 
--- Join quals are not currently pushed down
+-- Push down join quals.
 explain (costs off)
 select * from smalltab,
 (
@@ -36,18 +36,17 @@ select * from smalltab,
 where smalltab.i = subq.i;
                    QUERY PLAN                    
 -------------------------------------------------
- Merge Join
-   Merge Cond: (smalltab.i = bigtab.i)
-   ->  Sort
-         Sort Key: smalltab.i
-         ->  Seq Scan on smalltab
+ Nested Loop
+   ->  Seq Scan on smalltab
    ->  GroupAggregate
          Group Key: bigtab.i
          ->  Index Scan using bigtab_i on bigtab
-(8 rows)
+               Index Cond: (smalltab.i = i)
+(6 rows)
 
--- Except when the subquery is LATERAL, and already references the other relation.
--- Such join clauses can be pushed down.
+-- Subquery is LATERAL, and already references the other relation. The join
+-- qual is always pushed down in that case, as the plan is "parameterized"
+-- in respect to the other relation even if it was not pushed down.
 explain (costs off)
 select * from smalltab,
 lateral (
diff --git a/src/test/regress/sql/subselect_pushdown.sql b/src/test/regress/sql/subselect_pushdown.sql
index c9e7a43794..d6431d314c 100644
--- a/src/test/regress/sql/subselect_pushdown.sql
+++ b/src/test/regress/sql/subselect_pushdown.sql
@@ -20,7 +20,7 @@ select * from smalltab,
 ) as subq(i, avg)
 where smalltab.i = subq.i and smalltab.i = 123;
 
--- Join quals are not currently pushed down
+-- Push down join quals.
 explain (costs off)
 select * from smalltab,
 (
@@ -30,8 +30,9 @@ select * from smalltab,
 ) as subq(i, avg)
 where smalltab.i = subq.i;
 
--- Except when the subquery is LATERAL, and already references the other relation.
--- Such join clauses can be pushed down.
+-- Subquery is LATERAL, and already references the other relation. The join
+-- qual is always pushed down in that case, as the plan is "parameterized"
+-- in respect to the other relation even if it was not pushed down.
 explain (costs off)
 select * from smalltab,
 lateral (
-- 
2.11.0

Reply via email to