On Fri, 21 Feb 2025 at 06:16, Richard Guo <guofengli...@gmail.com> wrote:
>
> Yeah, it's annoying that the two replace_rte_variables callbacks have
> so much code duplication.  I think it's a win to make them share
> common code.  What do you think about making this refactor a separate
> patch, as it doesn't seem directly related to the bug fix here?

OK. Makes sense.

> * In pullup_replace_vars_callback, the varlevelsup of the newnode is
> adjusted before its nullingrels is updated.  This can cause problems.
> If the newnode is not a Var/PHV, we adjust its nullingrels with
> add_nulling_relids, and this function only works for level-zero vars.
> As a result, we may fail to put the varnullingrels into the
> expression.
>
> I think we should insist that ReplaceVarFromTargetList generates the
> replacement expression with varlevelsup = 0, and that the caller is
> responsible for adjusting the varlevelsup if needed.  This may need
> some changes to ReplaceVarsFromTargetList_callback too.

Ah, nice catch. Yes, that makes sense.

> * When expanding whole-tuple references, it is possible that some
> fields are expanded as Consts rather than Vars, considering dropped
> columns.  I think we need to check for this when generating the fields
> for a RowExpr.

Yes, good point.

> * The expansion of virtual generated columns occurs after subquery
> pullup, which can lead to issues.  This was an oversight on my part.
> Initially, I believed it wasn't possible for an RTE_RELATION RTE to
> have 'lateral' set to true, so I assumed it would be safe to expand
> virtual generated columns after subquery pullup.  However, upon closer
> look, this doesn't seem to be the case: if a subquery had a LATERAL
> marker, that would be propagated to any of its child RTEs, even for
> RTE_RELATION child RTE if this child rel has sampling info (see
> pull_up_simple_subquery).

Ah yes. That matches my initial instinct, which was to expand virtual
generated columns early in the planning process, but I didn't properly
understand why that was necessary.

> * Not an issue but I think that maybe we can share some common code
> between expand_virtual_generated_columns and
> expand_generated_columns_internal on how we build the generation
> expressions for a virtual generated column.

Agreed. I had planned to do that, but ran out of steam.

> I've worked on these issues and attached are the updated patches.
> 0001 expands virtual generated columns in the planner.  0002 refactors
> the code to eliminate code duplication in the replace_rte_variables
> callback functions.

LGTM aside from a comment in fireRIRrules() that needed updating and a
minor issue in the callback function: when deciding whether to wrap
newnode in a ReturningExpr, if newnode is a Var, it should now compare
its varlevelsup with 0, not var->varlevelsup, since newnode hasn't had
its varlevelsup adjusted at that point. This is only a minor point,
because I don't think we ever currently need to wrap a newnode Var due
to differing varlevelsup, so all that was happening was that it was
wrapping when it didn't need to, which is actually harmless aside from
a small runtime performance hit.

Given that we're moving this part of expanding virtual generated
columns to the planner, I wonder if we should also move the other bits
(build_generation_expression and expand_generated_columns_in_expr)
too, so that they're all together. That could be a follow-on patch.

Regards,
Dean
From c7b65fa591ea2d7bd7bacf823f20650915fff089 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofengli...@gmail.com>
Date: Fri, 21 Feb 2025 09:46:59 +0900
Subject: [PATCH v6 1/2] Expand virtual generated columns in the planner

Commit 83ea6c540 added support for virtual generated columns, which
were expanded in the rewriter, in a similar way to view columns.
However, this approach has several issues.  If a Var referencing a
virtual generated column has any varnullingrels, there would be no way
to propagate the varnullingrels into the generation expression,
leading to "wrong varnullingrels" errors and improper outer-join
removal.  Additionally, if such a Var comes from the nullable side of
an outer join, we may need to wrap the generation expression in a
PlaceHolderVar to ensure that it is evaluated at the right place and
hence is forced to null when the outer join should do so.  In some
cases, such as when the query uses grouping sets, we also need a
PlaceHolderVar for anything that's not a simple Var to isolate
subexpressions.  All of this cannot be achieved in the rewriter.

To fix this, the patch expands the virtual generated columns in the
planner and leverages the pullup_replace_vars architecture to avoid
code duplication.  This requires handling the OLD/NEW RETURNING list
Vars in pullup_replace_vars_callback.
---
 src/backend/optimizer/plan/planner.c          |   7 +
 src/backend/optimizer/prep/prepjointree.c     | 186 ++++++++++++++++++
 src/backend/rewrite/rewriteHandler.c          |  91 ++++-----
 src/backend/rewrite/rewriteManip.c            |   2 +-
 src/include/optimizer/prep.h                  |   1 +
 src/include/rewrite/rewriteHandler.h          |   1 +
 src/include/rewrite/rewriteManip.h            |   3 +
 .../regress/expected/generated_virtual.out    |  89 +++++++++
 src/test/regress/sql/generated_virtual.sql    |  38 ++++
 9 files changed, 372 insertions(+), 46 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 7b1a8a0a9f1..201487eaf42 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -717,6 +717,13 @@ subquery_planner(PlannerGlobal *glob, Query *parse, PlannerInfo *parent_root,
 	 */
 	replace_empty_jointree(parse);
 
+	/*
+	 * Scan the rangetable for relations with virtual generated columns, and
+	 * replace all Var nodes in the query that reference these columns with
+	 * the generation expressions.
+	 */
+	parse = root->parse = expand_virtual_generated_columns(root);
+
 	/*
 	 * Look for ANY and EXISTS SubLinks in WHERE and JOIN/ON clauses, and try
 	 * to transform them into joins.  Note that this step does not descend
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 5d9225e9909..3125567846f 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -5,6 +5,7 @@
  *
  * NOTE: the intended sequence for invoking these operations is
  *		replace_empty_jointree
+ *		expand_virtual_generated_columns
  *		pull_up_sublinks
  *		preprocess_function_rtes
  *		pull_up_subqueries
@@ -25,6 +26,7 @@
  */
 #include "postgres.h"
 
+#include "access/table.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -39,7 +41,9 @@
 #include "optimizer/tlist.h"
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
+#include "rewrite/rewriteHandler.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/rel.h"
 
 
 typedef struct nullingrel_info
@@ -57,6 +61,7 @@ typedef struct pullup_replace_vars_context
 {
 	PlannerInfo *root;
 	List	   *targetlist;		/* tlist of subquery being pulled up */
+	int			result_relation;
 	RangeTblEntry *target_rte;	/* RTE of subquery */
 	Relids		relids;			/* relids within subquery, as numbered after
 								 * pullup (set only if target_rte->lateral) */
@@ -421,6 +426,128 @@ replace_empty_jointree(Query *parse)
 	parse->jointree->fromlist = list_make1(rtr);
 }
 
+/*
+ * expand_virtual_generated_columns
+ *		Expand all virtual generated column references in a query.
+ *
+ * This scans the rangetable for relations with virtual generated columns, and
+ * replaces all Var nodes in the query that reference these columns with the
+ * appropriate expressions.  Note that we do not recurse into subqueries; that
+ * is taken care of when the subqueries are planned.
+ *
+ * Returns a modified copy of the query tree, if any relations with virtual
+ * generated columns are present.
+ */
+Query *
+expand_virtual_generated_columns(PlannerInfo *root)
+{
+	Query	   *parse = root->parse;
+	int			rt_index;
+	ListCell   *lc;
+
+	rt_index = 0;
+	foreach(lc, parse->rtable)
+	{
+		RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
+		Relation	rel;
+		TupleDesc	tupdesc;
+
+		++rt_index;
+
+		/*
+		 * Only normal relations can have virtual generated columns.
+		 */
+		if (rte->rtekind != RTE_RELATION)
+			continue;
+
+		rel = table_open(rte->relid, NoLock);
+
+		tupdesc = RelationGetDescr(rel);
+		if (tupdesc->constr && tupdesc->constr->has_generated_virtual)
+		{
+			List	   *tlist = NIL;
+			pullup_replace_vars_context rvcontext;
+
+			for (int i = 0; i < tupdesc->natts; i++)
+			{
+				Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
+				TargetEntry *tle;
+
+				if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
+				{
+					Node	   *defexpr = build_generation_expression(rel, i + 1);
+
+					ChangeVarNodes(defexpr, 1, rt_index, 0);
+
+					tle = makeTargetEntry((Expr *) defexpr, i + 1, 0, false);
+					tlist = lappend(tlist, tle);
+				}
+				else
+				{
+					Var		   *var;
+
+					var = makeVar(rt_index,
+								  i + 1,
+								  attr->atttypid,
+								  attr->atttypmod,
+								  attr->attcollation,
+								  0);
+
+					tle = makeTargetEntry((Expr *) var, i + 1, 0, false);
+					tlist = lappend(tlist, tle);
+				}
+			}
+
+			Assert(list_length(tlist) > 0);
+			Assert(!rte->lateral);
+
+			/*
+			 * The relation's targetlist items are now in the appropriate form
+			 * to insert into the query, except that we may need to wrap them
+			 * in PlaceHolderVars.  Set up required context data for
+			 * pullup_replace_vars.
+			 */
+			rvcontext.root = root;
+			rvcontext.targetlist = tlist;
+			rvcontext.result_relation = parse->resultRelation;
+			rvcontext.target_rte = rte;
+			/* won't need these values */
+			rvcontext.relids = NULL;
+			rvcontext.nullinfo = NULL;
+			/* pass NULL for outer_hasSubLinks */
+			rvcontext.outer_hasSubLinks = NULL;
+			rvcontext.varno = rt_index;
+			/* this flag will be set below, if needed */
+			rvcontext.wrap_non_vars = false;
+			/* initialize cache array with indexes 0 .. length(tlist) */
+			rvcontext.rv_cache = palloc0((list_length(tlist) + 1) *
+										 sizeof(Node *));
+
+			/*
+			 * If the query uses grouping sets, we need a PlaceHolderVar for
+			 * anything that's not a simple Var.  This ensures that
+			 * expressions retain their separate identity so that they will
+			 * match grouping set columns when appropriate.  (It'd be
+			 * sufficient to wrap values used in grouping set columns, and do
+			 * so only in non-aggregated portions of the tlist and havingQual,
+			 * but that would require a lot of infrastructure that
+			 * pullup_replace_vars hasn't currently got.)
+			 */
+			if (parse->groupingSets)
+				rvcontext.wrap_non_vars = true;
+
+			/*
+			 * Apply pullup variable replacement throughout the query tree.
+			 */
+			parse = (Query *) pullup_replace_vars((Node *) parse, &rvcontext);
+		}
+
+		table_close(rel, NoLock);
+	}
+
+	return parse;
+}
+
 /*
  * pull_up_sublinks
  *		Attempt to pull up ANY and EXISTS SubLinks to be treated as
@@ -1184,6 +1311,13 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	 */
 	replace_empty_jointree(subquery);
 
+	/*
+	 * Scan the rangetable for relations with virtual generated columns, and
+	 * replace all Var nodes in the query that reference these columns with
+	 * the generation expressions.
+	 */
+	subquery = subroot->parse = expand_virtual_generated_columns(subroot);
+
 	/*
 	 * Pull up any SubLinks within the subquery's quals, so that we don't
 	 * leave unoptimized SubLinks behind.
@@ -1273,6 +1407,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	 */
 	rvcontext.root = root;
 	rvcontext.targetlist = subquery->targetList;
+	rvcontext.result_relation = 0;
 	rvcontext.target_rte = rte;
 	if (rte->lateral)
 	{
@@ -1833,6 +1968,7 @@ pull_up_simple_values(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
 	}
 	rvcontext.root = root;
 	rvcontext.targetlist = tlist;
+	rvcontext.result_relation = 0;
 	rvcontext.target_rte = rte;
 	rvcontext.relids = NULL;	/* can't be any lateral references here */
 	rvcontext.nullinfo = NULL;
@@ -1993,6 +2129,7 @@ pull_up_constant_function(PlannerInfo *root, Node *jtnode,
 													  NULL, /* resname */
 													  false));	/* resjunk */
 	rvcontext.target_rte = rte;
+	rvcontext.result_relation = 0;
 
 	/*
 	 * Since this function was reduced to a Const, it doesn't contain any
@@ -2490,6 +2627,10 @@ pullup_replace_vars_callback(Var *var,
 	bool		need_phv;
 	Node	   *newnode;
 
+	/* System columns are not replaced. */
+	if (varattno < InvalidAttrNumber)
+		return (Node *) copyObject(var);
+
 	/*
 	 * We need a PlaceHolderVar if the Var-to-be-replaced has nonempty
 	 * varnullingrels (unless we find below that the replacement expression is
@@ -2559,6 +2700,18 @@ pullup_replace_vars_callback(Var *var,
 		rowexpr->location = var->location;
 		newnode = (Node *) rowexpr;
 
+		/* Handle any OLD/NEW RETURNING list Vars */
+		if (var->varreturningtype != VAR_RETURNING_DEFAULT)
+		{
+			ReturningExpr *rexpr = makeNode(ReturningExpr);
+
+			rexpr->retlevelsup = 0;
+			rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
+			rexpr->retexpr = (Expr *) newnode;
+
+			newnode = (Node *) rexpr;
+		}
+
 		/*
 		 * Insert PlaceHolderVar if needed.  Notice that we are wrapping one
 		 * PlaceHolderVar around the whole RowExpr, rather than putting one
@@ -2588,6 +2741,39 @@ pullup_replace_vars_callback(Var *var,
 		/* Make a copy of the tlist item to return */
 		newnode = (Node *) copyObject(tle->expr);
 
+		/* Handle any OLD/NEW RETURNING list Vars */
+		if (var->varreturningtype != VAR_RETURNING_DEFAULT)
+		{
+			/*
+			 * Copy varreturningtype onto any Vars in the tlist item that
+			 * refer to result_relation (which had better be non-zero).
+			 */
+			if (rcon->result_relation == 0)
+				elog(ERROR, "variable returning old/new found outside RETURNING list");
+
+			SetVarReturningType((Node *) newnode, rcon->result_relation,
+								0, var->varreturningtype);
+
+			/*
+			 * If the replacement expression in the targetlist is not simply a
+			 * Var referencing result_relation, wrap it in a ReturningExpr
+			 * node, so that the executor returns NULL if the OLD/NEW row does
+			 * not exist.
+			 */
+			if (!IsA(newnode, Var) ||
+				((Var *) newnode)->varno != rcon->result_relation ||
+				((Var *) newnode)->varlevelsup != 0)
+			{
+				ReturningExpr *rexpr = makeNode(ReturningExpr);
+
+				rexpr->retlevelsup = 0;
+				rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
+				rexpr->retexpr = (Expr *) newnode;
+
+				newnode = (Node *) rexpr;
+			}
+		}
+
 		/* Insert PlaceHolderVar if needed */
 		if (need_phv)
 		{
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index e996bdc0d21..1fa9f8cc55f 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -2190,10 +2190,6 @@ fireRIRrules(Query *parsetree, List *activeRIRs)
 	 * requires special recursion detection if the new quals have sublink
 	 * subqueries, and if we did it in the loop above query_tree_walker would
 	 * then recurse into those quals a second time.
-	 *
-	 * Finally, we expand any virtual generated columns.  We do this after
-	 * each table's RLS policies are applied because the RLS policies might
-	 * also refer to the table's virtual generated columns.
 	 */
 	rt_index = 0;
 	foreach(lc, parsetree->rtable)
@@ -2207,11 +2203,10 @@ fireRIRrules(Query *parsetree, List *activeRIRs)
 
 		++rt_index;
 
-		/*
-		 * Only normal relations can have RLS policies or virtual generated
-		 * columns.
-		 */
-		if (rte->rtekind != RTE_RELATION)
+		/* Only normal relations can have RLS policies */
+		if (rte->rtekind != RTE_RELATION ||
+			(rte->relkind != RELKIND_RELATION &&
+			 rte->relkind != RELKIND_PARTITIONED_TABLE))
 			continue;
 
 		rel = table_open(rte->relid, NoLock);
@@ -2300,16 +2295,6 @@ fireRIRrules(Query *parsetree, List *activeRIRs)
 		if (hasSubLinks)
 			parsetree->hasSubLinks = true;
 
-		/*
-		 * Expand any references to virtual generated columns of this table.
-		 * Note that subqueries in virtual generated column expressions are
-		 * not currently supported, so this cannot add any more sublinks.
-		 */
-		parsetree = (Query *)
-			expand_generated_columns_internal((Node *) parsetree,
-											  rel, rt_index, rte,
-											  parsetree->resultRelation);
-
 		table_close(rel, NoLock);
 	}
 
@@ -4456,36 +4441,12 @@ expand_generated_columns_internal(Node *node, Relation rel, int rt_index,
 
 			if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
 			{
-				Node	   *defexpr;
-				int			attnum = i + 1;
-				Oid			attcollid;
 				TargetEntry *te;
-
-				defexpr = build_column_default(rel, attnum);
-				if (defexpr == NULL)
-					elog(ERROR, "no generation expression found for column number %d of table \"%s\"",
-						 attnum, RelationGetRelationName(rel));
-
-				/*
-				 * If the column definition has a collation and it is
-				 * different from the collation of the generation expression,
-				 * put a COLLATE clause around the expression.
-				 */
-				attcollid = attr->attcollation;
-				if (attcollid && attcollid != exprCollation(defexpr))
-				{
-					CollateExpr *ce = makeNode(CollateExpr);
-
-					ce->arg = (Expr *) defexpr;
-					ce->collOid = attcollid;
-					ce->location = -1;
-
-					defexpr = (Node *) ce;
-				}
+				Node	   *defexpr = build_generation_expression(rel, i + 1);
 
 				ChangeVarNodes(defexpr, 1, rt_index, 0);
 
-				te = makeTargetEntry((Expr *) defexpr, attnum, 0, false);
+				te = makeTargetEntry((Expr *) defexpr, i + 1, 0, false);
 				tlist = lappend(tlist, te);
 			}
 		}
@@ -4528,6 +4489,46 @@ expand_generated_columns_in_expr(Node *node, Relation rel, int rt_index)
 	return node;
 }
 
+/*
+ * Build the generation expression for the virtual generated column.
+ *
+ * Error out if there is no generation expression found for the given column.
+ */
+Node *
+build_generation_expression(Relation rel, int attrno)
+{
+	TupleDesc	rd_att = RelationGetDescr(rel);
+	Form_pg_attribute att_tup = TupleDescAttr(rd_att, attrno - 1);
+	Node	   *defexpr;
+	Oid			attcollid;
+
+	Assert(att_tup->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL);
+
+	defexpr = build_column_default(rel, attrno);
+	if (defexpr == NULL)
+		elog(ERROR, "no generation expression found for column number %d of table \"%s\"",
+			 attrno, RelationGetRelationName(rel));
+
+	/*
+	 * If the column definition has a collation and it is different from the
+	 * collation of the generation expression, put a COLLATE clause around the
+	 * expression.
+	 */
+	attcollid = att_tup->attcollation;
+	if (attcollid && attcollid != exprCollation(defexpr))
+	{
+		CollateExpr *ce = makeNode(CollateExpr);
+
+		ce->arg = (Expr *) defexpr;
+		ce->collOid = attcollid;
+		ce->location = -1;
+
+		defexpr = (Node *) ce;
+	}
+
+	return defexpr;
+}
+
 
 /*
  * QueryRewrite -
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 9433548d279..6994b8c5425 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -1010,7 +1010,7 @@ SetVarReturningType_walker(Node *node, SetVarReturningType_context *context)
 	return expression_tree_walker(node, SetVarReturningType_walker, context);
 }
 
-static void
+void
 SetVarReturningType(Node *node, int result_relation, int sublevels_up,
 					VarReturningType returning_type)
 {
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 0ae57ec24a4..0a61cd126b7 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -23,6 +23,7 @@
  */
 extern void transform_MERGE_to_join(Query *parse);
 extern void replace_empty_jointree(Query *parse);
+extern Query *expand_virtual_generated_columns(PlannerInfo *root);
 extern void pull_up_sublinks(PlannerInfo *root);
 extern void preprocess_function_rtes(PlannerInfo *root);
 extern void pull_up_subqueries(PlannerInfo *root);
diff --git a/src/include/rewrite/rewriteHandler.h b/src/include/rewrite/rewriteHandler.h
index 88fe13c5f4f..99cab1a3bfa 100644
--- a/src/include/rewrite/rewriteHandler.h
+++ b/src/include/rewrite/rewriteHandler.h
@@ -39,5 +39,6 @@ extern void error_view_not_updatable(Relation view,
 									 const char *detail);
 
 extern Node *expand_generated_columns_in_expr(Node *node, Relation rel, int rt_index);
+extern Node *build_generation_expression(Relation rel, int attrno);
 
 #endif							/* REWRITEHANDLER_H */
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index 5ec475c63e9..466edd7c1c2 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -55,6 +55,9 @@ extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
 extern void IncrementVarSublevelsUp_rtable(List *rtable,
 										   int delta_sublevels_up, int min_sublevels_up);
 
+extern void SetVarReturningType(Node *node, int result_relation, int sublevels_up,
+								VarReturningType returning_type);
+
 extern bool rangeTableEntry_used(Node *node, int rt_index,
 								 int sublevels_up);
 
diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out
index 35638812be9..cb2383469c6 100644
--- a/src/test/regress/expected/generated_virtual.out
+++ b/src/test/regress/expected/generated_virtual.out
@@ -1398,3 +1398,92 @@ SELECT attrelid, attname, attgenerated FROM pg_attribute WHERE attgenerated NOT
 ----------+---------+--------------
 (0 rows)
 
+--
+-- test the expansion of virtual generated columns
+--
+create table gtest32 (
+  a int,
+  b int generated always as (a * 2),
+  c int generated always as (10 + 10),
+  d int generated always as (coalesce(a, 100))
+);
+insert into gtest32 values (1), (2);
+-- Ensure that nullingrel bits are propagated into the generation expression
+explain (costs off)
+select sum(t2.b) over (partition by t2.a),
+       sum(t2.c) over (partition by t2.a),
+       sum(t2.d) over (partition by t2.a)
+from gtest32 as t1 left join gtest32 as t2 on (t1.a = t2.a)
+order by t1.a;
+                      QUERY PLAN                      
+------------------------------------------------------
+ Sort
+   Sort Key: t1.a
+   ->  WindowAgg
+         ->  Sort
+               Sort Key: t2.a
+               ->  Merge Left Join
+                     Merge Cond: (t1.a = t2.a)
+                     ->  Sort
+                           Sort Key: t1.a
+                           ->  Seq Scan on gtest32 t1
+                     ->  Sort
+                           Sort Key: t2.a
+                           ->  Seq Scan on gtest32 t2
+(13 rows)
+
+select sum(t2.b) over (partition by t2.a),
+       sum(t2.c) over (partition by t2.a),
+       sum(t2.d) over (partition by t2.a)
+from gtest32 as t1 left join gtest32 as t2 on (t1.a = t2.a)
+order by t1.a;
+ sum | sum | sum 
+-----+-----+-----
+   2 |  20 |   1
+   4 |  20 |   2
+(2 rows)
+
+-- Ensure that the generation expressions are wrapped into PHVs if needed
+explain (verbose, costs off)
+select t2.* from gtest32 t1 left join gtest32 t2 on false;
+                      QUERY PLAN                      
+------------------------------------------------------
+ Nested Loop Left Join
+   Output: a, (a * 2), (20), (COALESCE(a, 100))
+   Join Filter: false
+   ->  Seq Scan on generated_virtual_tests.gtest32 t1
+         Output: t1.a, t1.b, t1.c, t1.d
+   ->  Result
+         Output: a, 20, COALESCE(a, 100)
+         One-Time Filter: false
+(8 rows)
+
+select t2.* from gtest32 t1 left join gtest32 t2 on false;
+ a | b | c | d 
+---+---+---+---
+   |   |   |  
+   |   |   |  
+(2 rows)
+
+explain (verbose, costs off)
+select * from gtest32 t group by grouping sets (a, b, c, d) having c = 20;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ HashAggregate
+   Output: a, ((a * 2)), (20), (COALESCE(a, 100))
+   Hash Key: t.a
+   Hash Key: (t.a * 2)
+   Hash Key: 20
+   Hash Key: COALESCE(t.a, 100)
+   Filter: ((20) = 20)
+   ->  Seq Scan on generated_virtual_tests.gtest32 t
+         Output: a, (a * 2), 20, COALESCE(a, 100)
+(9 rows)
+
+select * from gtest32 t group by grouping sets (a, b, c, d) having c = 20;
+ a | b | c  | d 
+---+---+----+---
+   |   | 20 |  
+(1 row)
+
+drop table gtest32;
diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql
index 34870813910..25b28a83b1b 100644
--- a/src/test/regress/sql/generated_virtual.sql
+++ b/src/test/regress/sql/generated_virtual.sql
@@ -732,3 +732,41 @@ CREATE TABLE gtest28b (LIKE gtest28a INCLUDING GENERATED);
 
 -- sanity check of system catalog
 SELECT attrelid, attname, attgenerated FROM pg_attribute WHERE attgenerated NOT IN ('', 's', 'v');
+
+--
+-- test the expansion of virtual generated columns
+--
+
+create table gtest32 (
+  a int,
+  b int generated always as (a * 2),
+  c int generated always as (10 + 10),
+  d int generated always as (coalesce(a, 100))
+);
+
+insert into gtest32 values (1), (2);
+
+-- Ensure that nullingrel bits are propagated into the generation expression
+explain (costs off)
+select sum(t2.b) over (partition by t2.a),
+       sum(t2.c) over (partition by t2.a),
+       sum(t2.d) over (partition by t2.a)
+from gtest32 as t1 left join gtest32 as t2 on (t1.a = t2.a)
+order by t1.a;
+
+select sum(t2.b) over (partition by t2.a),
+       sum(t2.c) over (partition by t2.a),
+       sum(t2.d) over (partition by t2.a)
+from gtest32 as t1 left join gtest32 as t2 on (t1.a = t2.a)
+order by t1.a;
+
+-- Ensure that the generation expressions are wrapped into PHVs if needed
+explain (verbose, costs off)
+select t2.* from gtest32 t1 left join gtest32 t2 on false;
+select t2.* from gtest32 t1 left join gtest32 t2 on false;
+
+explain (verbose, costs off)
+select * from gtest32 t group by grouping sets (a, b, c, d) having c = 20;
+select * from gtest32 t group by grouping sets (a, b, c, d) having c = 20;
+
+drop table gtest32;
-- 
2.43.0

From 4643a1832ba30c5c6ddee4643676045675f65e28 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofengli...@gmail.com>
Date: Fri, 21 Feb 2025 14:01:56 +0900
Subject: [PATCH v6 2/2] Eliminate code duplication in replace_rte_variables
 callbacks

The callback functions ReplaceVarsFromTargetList_callback and
pullup_replace_vars_callback are both used to replace Vars in an
expression tree that reference a particular RTE with items from a
targetlist, and they both need to expand whole-tuple reference and
deal with OLD/NEW RETURNING list Vars.  As a result, currently there
is significant code duplication between these two functions.

This patch introduces a new function, ReplaceVarFromTargetList, to
perform the replacement and calls it from both callback functions,
thereby eliminating code duplication.
---
 src/backend/optimizer/prep/prepjointree.c | 137 ++++------------------
 src/backend/rewrite/rewriteManip.c        |  83 +++++++++----
 src/include/rewrite/rewriteManip.h        |   9 +-
 3 files changed, 91 insertions(+), 138 deletions(-)

diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 3125567846f..367404735b8 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -2660,127 +2660,38 @@ pullup_replace_vars_callback(Var *var,
 		/* Just copy the entry and fall through to adjust phlevelsup etc */
 		newnode = copyObject(rcon->rv_cache[varattno]);
 	}
-	else if (varattno == InvalidAttrNumber)
+	else
 	{
-		/* Must expand whole-tuple reference into RowExpr */
-		RowExpr    *rowexpr;
-		List	   *colnames;
-		List	   *fields;
-		bool		save_wrap_non_vars = rcon->wrap_non_vars;
-		int			save_sublevelsup = context->sublevels_up;
-
-		/*
-		 * If generating an expansion for a var of a named rowtype (ie, this
-		 * is a plain relation RTE), then we must include dummy items for
-		 * dropped columns.  If the var is RECORD (ie, this is a JOIN), then
-		 * omit dropped columns.  In the latter case, attach column names to
-		 * the RowExpr for use of the executor and ruleutils.c.
-		 *
-		 * In order to be able to cache the results, we always generate the
-		 * expansion with varlevelsup = 0, and then adjust below if needed.
-		 */
-		expandRTE(rcon->target_rte,
-				  var->varno, 0 /* not varlevelsup */ ,
-				  var->varreturningtype, var->location,
-				  (var->vartype != RECORDOID),
-				  &colnames, &fields);
-		/* Expand the generated per-field Vars, but don't insert PHVs there */
-		rcon->wrap_non_vars = false;
-		context->sublevels_up = 0;	/* to match the expandRTE output */
-		fields = (List *) replace_rte_variables_mutator((Node *) fields,
-														context);
-		rcon->wrap_non_vars = save_wrap_non_vars;
-		context->sublevels_up = save_sublevelsup;
-
-		rowexpr = makeNode(RowExpr);
-		rowexpr->args = fields;
-		rowexpr->row_typeid = var->vartype;
-		rowexpr->row_format = COERCE_IMPLICIT_CAST;
-		rowexpr->colnames = (var->vartype == RECORDOID) ? colnames : NIL;
-		rowexpr->location = var->location;
-		newnode = (Node *) rowexpr;
-
-		/* Handle any OLD/NEW RETURNING list Vars */
-		if (var->varreturningtype != VAR_RETURNING_DEFAULT)
-		{
-			ReturningExpr *rexpr = makeNode(ReturningExpr);
-
-			rexpr->retlevelsup = 0;
-			rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
-			rexpr->retexpr = (Expr *) newnode;
-
-			newnode = (Node *) rexpr;
-		}
-
 		/*
-		 * Insert PlaceHolderVar if needed.  Notice that we are wrapping one
-		 * PlaceHolderVar around the whole RowExpr, rather than putting one
-		 * around each element of the row.  This is because we need the
-		 * expression to yield NULL, not ROW(NULL,NULL,...) when it is forced
-		 * to null by an outer join.
+		 * Generate the replacement expression.  This takes care of expanding
+		 * wholerow references and dealing with non-default varreturningtype.
 		 */
-		if (need_phv)
-		{
-			newnode = (Node *)
-				make_placeholder_expr(rcon->root,
-									  (Expr *) newnode,
-									  bms_make_singleton(rcon->varno));
-			/* cache it with the PHV, and with phlevelsup etc not set yet */
-			rcon->rv_cache[InvalidAttrNumber] = copyObject(newnode);
-		}
-	}
-	else
-	{
-		/* Normal case referencing one targetlist element */
-		TargetEntry *tle = get_tle_by_resno(rcon->targetlist, varattno);
-
-		if (tle == NULL)		/* shouldn't happen */
-			elog(ERROR, "could not find attribute %d in subquery targetlist",
-				 varattno);
-
-		/* Make a copy of the tlist item to return */
-		newnode = (Node *) copyObject(tle->expr);
-
-		/* Handle any OLD/NEW RETURNING list Vars */
-		if (var->varreturningtype != VAR_RETURNING_DEFAULT)
-		{
-			/*
-			 * Copy varreturningtype onto any Vars in the tlist item that
-			 * refer to result_relation (which had better be non-zero).
-			 */
-			if (rcon->result_relation == 0)
-				elog(ERROR, "variable returning old/new found outside RETURNING list");
-
-			SetVarReturningType((Node *) newnode, rcon->result_relation,
-								0, var->varreturningtype);
-
-			/*
-			 * If the replacement expression in the targetlist is not simply a
-			 * Var referencing result_relation, wrap it in a ReturningExpr
-			 * node, so that the executor returns NULL if the OLD/NEW row does
-			 * not exist.
-			 */
-			if (!IsA(newnode, Var) ||
-				((Var *) newnode)->varno != rcon->result_relation ||
-				((Var *) newnode)->varlevelsup != 0)
-			{
-				ReturningExpr *rexpr = makeNode(ReturningExpr);
-
-				rexpr->retlevelsup = 0;
-				rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
-				rexpr->retexpr = (Expr *) newnode;
-
-				newnode = (Node *) rexpr;
-			}
-		}
+		newnode = ReplaceVarFromTargetList(var,
+										   rcon->target_rte,
+										   rcon->targetlist,
+										   rcon->result_relation,
+										   REPLACEVARS_REPORT_ERROR,
+										   0);
 
 		/* Insert PlaceHolderVar if needed */
 		if (need_phv)
 		{
 			bool		wrap;
 
-			if (newnode && IsA(newnode, Var) &&
-				((Var *) newnode)->varlevelsup == 0)
+			if (varattno == InvalidAttrNumber)
+			{
+				/*
+				 * Insert PlaceHolderVar for whole-tuple reference.  Notice
+				 * that we are wrapping one PlaceHolderVar around the whole
+				 * RowExpr, rather than putting one around each element of the
+				 * row.  This is because we need the expression to yield NULL,
+				 * not ROW(NULL,NULL,...) when it is forced to null by an
+				 * outer join.
+				 */
+				wrap = true;
+			}
+			else if (newnode && IsA(newnode, Var) &&
+					 ((Var *) newnode)->varlevelsup == 0)
 			{
 				/*
 				 * Simple Vars always escape being wrapped, unless they are
@@ -2926,7 +2837,7 @@ pullup_replace_vars_callback(Var *var,
 				 * Cache it if possible (ie, if the attno is in range, which
 				 * it probably always should be).
 				 */
-				if (varattno > InvalidAttrNumber &&
+				if (varattno >= InvalidAttrNumber &&
 					varattno <= list_length(rcon->targetlist))
 					rcon->rv_cache[varattno] = copyObject(newnode);
 			}
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 6994b8c5425..6e6ea76a664 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -1010,7 +1010,7 @@ SetVarReturningType_walker(Node *node, SetVarReturningType_context *context)
 	return expression_tree_walker(node, SetVarReturningType_walker, context);
 }
 
-void
+static void
 SetVarReturningType(Node *node, int result_relation, int sublevels_up,
 					VarReturningType returning_type)
 {
@@ -1814,6 +1814,30 @@ ReplaceVarsFromTargetList_callback(Var *var,
 								   replace_rte_variables_context *context)
 {
 	ReplaceVarsFromTargetList_context *rcon = (ReplaceVarsFromTargetList_context *) context->callback_arg;
+	Node	   *newnode;
+
+	newnode = ReplaceVarFromTargetList(var,
+									   rcon->target_rte,
+									   rcon->targetlist,
+									   rcon->result_relation,
+									   rcon->nomatch_option,
+									   rcon->nomatch_varno);
+
+	/* Must adjust varlevelsup if replaced Var is within a subquery */
+	if (var->varlevelsup > 0)
+		IncrementVarSublevelsUp(newnode, var->varlevelsup, 0);
+
+	return newnode;
+}
+
+Node *
+ReplaceVarFromTargetList(Var *var,
+						 RangeTblEntry *target_rte,
+						 List *targetlist,
+						 int result_relation,
+						 ReplaceVarsNoMatchOption nomatch_option,
+						 int nomatch_varno)
+{
 	TargetEntry *tle;
 
 	if (var->varattno == InvalidAttrNumber)
@@ -1822,6 +1846,7 @@ ReplaceVarsFromTargetList_callback(Var *var,
 		RowExpr    *rowexpr;
 		List	   *colnames;
 		List	   *fields;
+		ListCell   *lc;
 
 		/*
 		 * If generating an expansion for a var of a named rowtype (ie, this
@@ -1830,29 +1855,46 @@ ReplaceVarsFromTargetList_callback(Var *var,
 		 * omit dropped columns.  In the latter case, attach column names to
 		 * the RowExpr for use of the executor and ruleutils.c.
 		 *
+		 * In order to be able to cache the results, we always generate the
+		 * expansion with varlevelsup = 0.  The caller is responsible for
+		 * adjusting it if needed.
+		 *
 		 * The varreturningtype is copied onto each individual field Var, so
 		 * that it is handled correctly when we recurse.
 		 */
-		expandRTE(rcon->target_rte,
-				  var->varno, var->varlevelsup, var->varreturningtype,
-				  var->location, (var->vartype != RECORDOID),
+		expandRTE(target_rte,
+				  var->varno, 0 /* not varlevelsup */ ,
+				  var->varreturningtype, var->location,
+				  (var->vartype != RECORDOID),
 				  &colnames, &fields);
-		/* Adjust the generated per-field Vars... */
-		fields = (List *) replace_rte_variables_mutator((Node *) fields,
-														context);
 		rowexpr = makeNode(RowExpr);
-		rowexpr->args = fields;
+		/* the fields will be set below */
+		rowexpr->args = NIL;
 		rowexpr->row_typeid = var->vartype;
 		rowexpr->row_format = COERCE_IMPLICIT_CAST;
 		rowexpr->colnames = (var->vartype == RECORDOID) ? colnames : NIL;
 		rowexpr->location = var->location;
+		/* Adjust the generated per-field Vars... */
+		foreach(lc, fields)
+		{
+			Node	   *field = lfirst(lc);
+
+			if (field && IsA(field, Var))
+				field = ReplaceVarFromTargetList((Var *) field,
+												 target_rte,
+												 targetlist,
+												 result_relation,
+												 nomatch_option,
+												 nomatch_varno);
+			rowexpr->args = lappend(rowexpr->args, field);
+		}
 
 		/* Wrap it in a ReturningExpr, if needed, per comments above */
 		if (var->varreturningtype != VAR_RETURNING_DEFAULT)
 		{
 			ReturningExpr *rexpr = makeNode(ReturningExpr);
 
-			rexpr->retlevelsup = var->varlevelsup;
+			rexpr->retlevelsup = 0;
 			rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
 			rexpr->retexpr = (Expr *) rowexpr;
 
@@ -1863,12 +1905,12 @@ ReplaceVarsFromTargetList_callback(Var *var,
 	}
 
 	/* Normal case referencing one targetlist element */
-	tle = get_tle_by_resno(rcon->targetlist, var->varattno);
+	tle = get_tle_by_resno(targetlist, var->varattno);
 
 	if (tle == NULL || tle->resjunk)
 	{
 		/* Failed to find column in targetlist */
-		switch (rcon->nomatch_option)
+		switch (nomatch_option)
 		{
 			case REPLACEVARS_REPORT_ERROR:
 				/* fall through, throw error below */
@@ -1876,7 +1918,8 @@ ReplaceVarsFromTargetList_callback(Var *var,
 
 			case REPLACEVARS_CHANGE_VARNO:
 				var = copyObject(var);
-				var->varno = rcon->nomatch_varno;
+				var->varno = nomatch_varno;
+				var->varlevelsup = 0;
 				/* we leave the syntactic referent alone */
 				return (Node *) var;
 
@@ -1906,10 +1949,6 @@ ReplaceVarsFromTargetList_callback(Var *var,
 		/* Make a copy of the tlist item to return */
 		Expr	   *newnode = copyObject(tle->expr);
 
-		/* Must adjust varlevelsup if tlist item is from higher query */
-		if (var->varlevelsup > 0)
-			IncrementVarSublevelsUp((Node *) newnode, var->varlevelsup, 0);
-
 		/*
 		 * Check to see if the tlist item contains a PARAM_MULTIEXPR Param,
 		 * and throw error if so.  This case could only happen when expanding
@@ -1932,20 +1971,20 @@ ReplaceVarsFromTargetList_callback(Var *var,
 			 * Copy varreturningtype onto any Vars in the tlist item that
 			 * refer to result_relation (which had better be non-zero).
 			 */
-			if (rcon->result_relation == 0)
+			if (result_relation == 0)
 				elog(ERROR, "variable returning old/new found outside RETURNING list");
 
-			SetVarReturningType((Node *) newnode, rcon->result_relation,
-								var->varlevelsup, var->varreturningtype);
+			SetVarReturningType((Node *) newnode, result_relation,
+								0, var->varreturningtype);
 
 			/* Wrap it in a ReturningExpr, if needed, per comments above */
 			if (!IsA(newnode, Var) ||
-				((Var *) newnode)->varno != rcon->result_relation ||
-				((Var *) newnode)->varlevelsup != var->varlevelsup)
+				((Var *) newnode)->varno != result_relation ||
+				((Var *) newnode)->varlevelsup != 0)
 			{
 				ReturningExpr *rexpr = makeNode(ReturningExpr);
 
-				rexpr->retlevelsup = var->varlevelsup;
+				rexpr->retlevelsup = 0;
 				rexpr->retold = (var->varreturningtype == VAR_RETURNING_OLD);
 				rexpr->retexpr = newnode;
 
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index 466edd7c1c2..ea3908739c6 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -55,9 +55,6 @@ extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
 extern void IncrementVarSublevelsUp_rtable(List *rtable,
 										   int delta_sublevels_up, int min_sublevels_up);
 
-extern void SetVarReturningType(Node *node, int result_relation, int sublevels_up,
-								VarReturningType returning_type);
-
 extern bool rangeTableEntry_used(Node *node, int rt_index,
 								 int sublevels_up);
 
@@ -92,6 +89,12 @@ extern Node *map_variable_attnos(Node *node,
 								 const struct AttrMap *attno_map,
 								 Oid to_rowtype, bool *found_whole_row);
 
+extern Node *ReplaceVarFromTargetList(Var *var,
+									  RangeTblEntry *target_rte,
+									  List *targetlist,
+									  int result_relation,
+									  ReplaceVarsNoMatchOption nomatch_option,
+									  int nomatch_varno);
 extern Node *ReplaceVarsFromTargetList(Node *node,
 									   int target_varno, int sublevels_up,
 									   RangeTblEntry *target_rte,
-- 
2.43.0

Reply via email to