Re: SQLFunctionCache and generic plans

Alexander Pyhalov Wed, 29 Jan 2025 06:36:14 -0800

Tom Lane писал(а) 2025-01-17 21:27:

Alexander Pyhalov <a.pyha...@postgrespro.ru> writes:

I've rebased patch on master. Tests pass here.


The cfbot still doesn't like it; my guess is that you built without
--with-libxml and so didn't notice the effects on xml.out.


Hi. Thank you for review.

I've updated patch.

I've looked through the patch briefly and have a few thoughts:

* You cannot use plancache.c like this:

plansource = CreateCachedPlan(NULL, fcache->src,CreateCommandTag((Node *)parsetree));


CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
query.  Now aside from the fact that you didn't update the comment
documenting that, this won't work, because RevalidateCachedQuery
also knows that raw_parse_tree == NULL means an empty query.  If an
invalidation happens while the function is running, revalidation
will produce an empty plan list, causing failure.  (It seems like
a regression test case exercising such invalidation is called for.)

Yes, yes... I've spent some time, trying to reproduce it or to provethat

plan revalidation is impossible (as I considered). I was wrong, plan
revaldiation is possible, and I've added some test cases. First of all,
we don't save our plans to CacheMemoryContext, so usual invalidation
callbacks don't touch our cached plans. However,  I've managed
to reproduce it by changing row_security GUC in SQL function. So,
after the first round of creating plan and running them, on second call
we can see that plansource->rewriteRowSecurity mismatches row_security
and invalidate the plan. I've added Query to plansource, and rewrite it
in RevalidateCachedQuery(), as you suggested.

Interesting side effect - earlier if row_security was altered in sqlfunction,this didn't have impact on further function execution (generic plan wasalready built).

Now, if revalidation for generic plan is triggered, query ends up with

"ERROR: query would be affected by row-level security policy fortable".


* I'm not very convinced by the new logic in
StmtPlanRequiresRevalidation.  Is it really the case that "check for
CMD_UTILITY" is sufficient?  Even if the coding is okay, the comment
needs work to provide a better explanation why it's okay.


Updated logic to match stmt_requires_parse_analysis().


* I'd say lose the enable_sql_func_custom_plans GUC.  We don't have
anything comparable for plpgsql and there's been (relatively) little
complaint about that.  Aside from adding clutter to the patch, it
contorts the logic in functions.c because it forces you to support
two code paths.


Removed it.


* The regression output changes like

-CONTEXT:  SQL function "regexp_match" statement 1
+CONTEXT:  SQL function "regexp_match" during startup

seem fairly unfortunate.  Aside from making maintenance of the patch
painful, the changed messages provide strictly less information to the
user.  I would try hard to eliminate that behavioral change.  I think
you could do so, or at least greatly reduce the number of diffs, by
having init_sql_fcache perform only raw parsing (producing a list of
RawStmts, or extracting a list of Queries from prosqlbody) and
delaying creation of a plan for a particular statement until you first
reach execution of that statement.  This would be a better way all
around because it'd also fix the anomalies I mentioned upthread for
cases where a DDL statement earlier in the function should affect
parse analysis of a later statement.


Yes, I've seen this remark. However, was a bit frightened to touch
all guts of functions.c processing. So, instead I've just recorded
a statement number of the currently-planning query and use it in
error messages. If you insist, I can try rewriting current
first-plan-all-then-run approach, but without good tests for
sql functions this looks error-prone.


* release_plans seems kind of wrong: isn't it giving up most of
the benefit of doing this?  plpgsql doesn't seem to release plans
unless it's forced to revalidate.  Why should SQL functions
behave differently?


The issue is that we potentially create custom plan for each set
of sql function arguments. And in queries like
SELECT f(field) FROM T1
this can consume much memory. The good thing here is that we don't care
much about it if execution switches to generic plans -
after some rows are selected and some custom plans are built.

Of course, there are bad cases. For example, we've seen that
some simple functions like

create function test(n int) returns text as $$
select string_agg(i::text, '') FROM generate_series(1,n) i
$$
language sql;

always use custom plan for small n numbers - it's the same, but cheaper

(as we can't predict correct row number without knowing functionarguments):


Aggregate  (cost=0.15..0.16 rows=1 width=32)

-> Function Scan on generate_series i (cost=0.00..0.08 rows=8width=4)


versus

Aggregate  (cost=17.50..17.52 rows=1 width=32)

-> Function Scan on generate_series i (cost=0.00..10.00rows=1000 width=4)

So if it appears somewhere deep in data generation scenarios, planningtime

can sufficiently rise (earlier generic plan was always used).

Turning back to release_plans(), it's needed, because there'spossibilitythat for each set of arguments we create new custom plan. It doesn'taffect

generic plan, as plansource itself holds reference to it.

Also, I'm pretty sure it's wrong that
you have
+               ReleaseCachedPlan(cplan, NULL);
with no ResourceOwner tracking the reference.  That probably
means that the code leaks cached plans on error.  For the
current patch iteration with the function's data structure only
meant to live as long as the query, it should be sufficient to
use CurrentResourceOwner to manage these cplan references.

Cached plans are not stored in long-lived context for now. So we can'tuse CurrentResourceOwner,ReleaseCachedPlan(cplan, CurrentResourceOwner) will die on"Assert(plan->is_saved)".


* The patch is hard to read in functions.c because there's a mix of
code-motion and actual logic changes that are touching the same
places.  Perhaps it'd be worth splitting it into a series of two
patches: the first one just does code motion such as pushing existing
logic into a new subroutine, and then the second one has the logic
changes.  Or maybe that wouldn't help much, but consider giving it a
try.


Moved splitting of check_planned_stmt() into separate patch.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

From 4bfdc5d99610edd5621271e190588581fc25758b Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov <a.pyha...@postgrespro.ru>
Date: Thu, 23 Jan 2025 14:34:17 +0300
Subject: [PATCH 2/2] Use custom plan machinery for SQL function

---
 src/backend/executor/functions.c          | 201 ++++++++++++++++++----
 src/backend/utils/cache/plancache.c       |  89 ++++++++--
 src/backend/utils/misc/guc_tables.c       |   1 +
 src/include/utils/plancache.h             |   5 +
 src/test/regress/expected/rowsecurity.out |  51 ++++++
 src/test/regress/expected/rules.out       |  35 ++++
 src/test/regress/sql/rowsecurity.sql      |  41 +++++
 src/test/regress/sql/rules.sql            |  24 +++
 8 files changed, 398 insertions(+), 49 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 12565d0a7e0..b4794d1a425 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -33,10 +33,10 @@
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/plancache.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 
-
 /*
  * Specialized DestReceiver for collecting query output in a SQL function
  */
@@ -112,6 +112,13 @@ typedef struct
 
 	JunkFilter *junkFilter;		/* will be NULL if function returns VOID */
 
+	/* Cached plans support */
+	List	   *queryTree_list; /* list of query lists */
+	List	   *plansource_list;	/* list of plansource */
+	List	   *cplan_list;		/* list of cached plans */
+	int			planning_stmt_number;	/* the number of statement we are
+										 * currently planning */
+
 	/*
 	 * func_state is a List of execution_state records, each of which is the
 	 * first for its original parsetree, with any additional records chained
@@ -500,45 +507,47 @@ init_execution_state(List *queryTree_list,
 					 bool lazyEvalOK)
 {
 	List	   *eslist = NIL;
+	List	   *cplan_list = NIL;
 	execution_state *lasttages = NULL;
-	ListCell   *lc1;
+
+	/* We use lc1 index, not lc1 itself, so mark it unused */
+	ListCell   *lc1 pg_attribute_unused();
 
 	foreach(lc1, queryTree_list)
 	{
-		List	   *qtlist = lfirst_node(List, lc1);
 		execution_state *firstes = NULL;
 		execution_state *preves = NULL;
 		ListCell   *lc2;
+		CachedPlan *cplan;
+		CachedPlanSource *plansource;
+		int			cur_idx;
+
+		/* Find plan source, corresponding to this query list */
+		cur_idx = foreach_current_index(lc1);
+		plansource = list_nth(fcache->plansource_list, cur_idx);
+		/* Save statement number for error reporting */
+		fcache->planning_stmt_number = cur_idx + 1 /* cur_idx starts with 0 */ ;
+
+		/*
+		 * Get plan for the query. If paramLI is set, we can get custom plan
+		 */
+		cplan = GetCachedPlan(plansource, fcache->paramLI, NULL, NULL);
+
+		/* Record cplan in plan list to be released on replanning */
+		cplan_list = lappend(cplan_list, cplan);
 
-		foreach(lc2, qtlist)
+		/* For each planned statement create execution state */
+		foreach(lc2, cplan->stmt_list)
 		{
-			Query	   *queryTree = lfirst_node(Query, lc2);
-			PlannedStmt *stmt;
+			PlannedStmt *stmt = lfirst(lc2);
 			execution_state *newes;
 
-			/* Plan the query if needed */
-			if (queryTree->commandType == CMD_UTILITY)
-			{
-				/* Utility commands require no planning. */
-				stmt = makeNode(PlannedStmt);
-				stmt->commandType = CMD_UTILITY;
-				stmt->canSetTag = queryTree->canSetTag;
-				stmt->utilityStmt = queryTree->utilityStmt;
-				stmt->stmt_location = queryTree->stmt_location;
-				stmt->stmt_len = queryTree->stmt_len;
-				stmt->queryId = queryTree->queryId;
-			}
-			else
-				stmt = pg_plan_query(queryTree,
-									 fcache->src,
-									 CURSOR_OPT_PARALLEL_OK,
-									 NULL);
-
 			/* Check that stmt is valid for SQL function */
 			check_planned_stmt(stmt, fcache);
 
 			/* OK, build the execution_state for this query */
 			newes = (execution_state *) palloc(sizeof(execution_state));
+
 			if (preves)
 				preves->next = newes;
 			else
@@ -551,7 +560,7 @@ init_execution_state(List *queryTree_list,
 			newes->stmt = stmt;
 			newes->qd = NULL;
 
-			if (queryTree->canSetTag)
+			if (stmt->canSetTag)
 				lasttages = newes;
 
 			preves = newes;
@@ -583,6 +592,9 @@ init_execution_state(List *queryTree_list,
 			fcache->lazyEval = lasttages->lazyEval = true;
 	}
 
+	/* We've finished planning, reset planning statement number */
+	fcache->planning_stmt_number = 0;
+	fcache->cplan_list = cplan_list;
 	return eslist;
 }
 
@@ -606,6 +618,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	ListCell   *lc;
 	Datum		tmp;
 	bool		isNull;
+	List	   *plansource_list;
 
 	/*
 	 * Create memory context that holds all the SQLFunctionCache data.  It
@@ -690,6 +703,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	 * plancache.c.
 	 */
 	queryTree_list = NIL;
+	plansource_list = NIL;
 	if (!isNull)
 	{
 		Node	   *n;
@@ -705,8 +719,13 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 		{
 			Query	   *parsetree = lfirst_node(Query, lc);
 			List	   *queryTree_sublist;
+			CachedPlanSource *plansource;
 
 			AcquireRewriteLocks(parsetree, true, false);
+
+			plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree));
+			plansource_list = lappend(plansource_list, plansource);
+
 			queryTree_sublist = pg_rewrite_query(parsetree);
 			queryTree_list = lappend(queryTree_list, queryTree_sublist);
 		}
@@ -721,6 +740,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 		{
 			RawStmt    *parsetree = lfirst_node(RawStmt, lc);
 			List	   *queryTree_sublist;
+			CachedPlanSource *plansource;
+
+			plansource = CreateCachedPlan(parsetree, fcache->src, CreateCommandTag(parsetree->stmt));
+			plansource_list = lappend(plansource_list, plansource);
 
 			queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree,
 															  fcache->src,
@@ -761,6 +784,34 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 											   false,
 											   &resulttlist);
 
+	/*
+	 * Queries could be rewritten by check_sql_fn_retval(). Now when they have
+	 * their final form, we can complete plan cache entry creation.
+	 */
+	if (plansource_list != NIL)
+	{
+		ListCell   *qlc;
+		ListCell   *plc;
+
+		forboth(qlc, queryTree_list, plc, plansource_list)
+		{
+			List	   *queryTree_sublist = lfirst(qlc);
+			CachedPlanSource *plansource = lfirst(plc);
+
+
+			/* Finish filling in the CachedPlanSource */
+			CompleteCachedPlan(plansource,
+							   queryTree_sublist,
+							   NULL,
+							   NULL,
+							   0,
+							   (ParserSetupHook) sql_fn_parser_setup,
+							   fcache->pinfo,
+							   CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL,
+							   false);
+		}
+	}
+
 	/*
 	 * Construct a JunkFilter we can use to coerce the returned rowtype to the
 	 * desired form, unless the result type is VOID, in which case there's
@@ -805,10 +856,8 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 		lazyEvalOK = true;
 	}
 
-	/* Finally, plan the queries */
-	fcache->func_state = init_execution_state(queryTree_list,
-											  fcache,
-											  lazyEvalOK);
+	fcache->queryTree_list = queryTree_list;
+	fcache->plansource_list = plansource_list;
 
 	/* Mark fcache with time of creation to show it's valid */
 	fcache->lxid = MyProc->vxid.lxid;
@@ -979,7 +1028,12 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 			prm->value = MakeExpandedObjectReadOnly(fcinfo->args[i].value,
 													prm->isnull,
 													get_typlen(argtypes[i]));
-			prm->pflags = 0;
+
+			/*
+			 * PARAM_FLAG_CONST is necessary to build efficient custom plan.
+			 */
+			prm->pflags = PARAM_FLAG_CONST;
+
 			prm->ptype = argtypes[i];
 		}
 	}
@@ -1032,6 +1086,33 @@ postquel_get_single_result(TupleTableSlot *slot,
 	return value;
 }
 
+/*
+ * Release plans. This function is called prior to planning
+ * statements with new parameters. When custom plans are generated
+ * for each function call in a statement, they can consume too much memory, so
+ * release them. Generic plans will survive it as plansource holds
+ * reference to a generic plan.
+ */
+static void
+release_plans(List *cplans)
+{
+	ListCell   *lc;
+
+	/*
+	 * We support separate plan list, so that we visit each plan here only
+	 * once
+	 */
+	foreach(lc, cplans)
+	{
+		CachedPlan *cplan = lfirst(lc);
+
+		ReleaseCachedPlan(cplan, NULL);
+	}
+
+	/* Cleanup the list itself */
+	list_free(cplans);
+}
+
 /*
  * fmgr_sql: function call manager for SQL functions
  */
@@ -1050,6 +1131,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 	Datum		result;
 	List	   *eslist;
 	ListCell   *eslc;
+	bool		build_cached_plans = false;
 
 	/*
 	 * Setup error traceback support for ereport()
@@ -1139,12 +1221,55 @@ fmgr_sql(PG_FUNCTION_ARGS)
 			break;
 	}
 
+	/*
+	 * We skip actual planning for initial run, so in this case we have to
+	 * build cached plans now.
+	 */
+	if (fcache->plansource_list != NIL && eslist == NIL)
+		build_cached_plans = true;
+
 	/*
 	 * Convert params to appropriate format if starting a fresh execution. (If
 	 * continuing execution, we can re-use prior params.)
 	 */
-	if (is_first && es && es->status == F_EXEC_START)
+	if ((is_first && es && es->status == F_EXEC_START) || build_cached_plans)
+	{
 		postquel_sub_params(fcache, fcinfo);
+		if (fcache->plansource_list)
+		{
+			ListCell   *lc;
+
+			/* replan the queries */
+			release_plans(fcache->cplan_list);
+
+			foreach(lc, fcache->func_state)
+			{
+				execution_state *cur_es = lfirst(lc);
+
+				/*
+				 * Execution state can be NULL if statement is completely
+				 * replaced by do instead nothing rule.
+				 */
+				if (cur_es)
+					pfree(cur_es);
+			}
+			list_free(fcache->func_state);
+
+			fcache->func_state = init_execution_state(fcache->queryTree_list,
+													  fcache,
+													  lazyEvalOK);
+			/* restore execution state and eslist-related variables */
+			eslist = fcache->func_state;
+			/* find the first non-NULL execution state */
+			foreach(eslc, eslist)
+			{
+				es = (execution_state *) lfirst(eslc);
+
+				if (es)
+					break;
+			}
+		}
+	}
 
 	/*
 	 * Build tuplestore to hold results, if we don't have one already. Note
@@ -1438,13 +1563,19 @@ sql_exec_error_callback(void *arg)
 	}
 
 	/*
-	 * Try to determine where in the function we failed.  If there is a query
-	 * with non-null QueryDesc, finger it.  (We check this rather than looking
-	 * for F_EXEC_RUN state, so that errors during ExecutorStart or
+	 * Try to determine where in the function we failed.  If failure happens
+	 * while building plans, look at planning_stmt_number.  Else if there is a
+	 * query with non-null QueryDesc, finger it.  (We check this rather than
+	 * looking for F_EXEC_RUN state, so that errors during ExecutorStart or
 	 * ExecutorEnd are blamed on the appropriate query; see postquel_start and
 	 * postquel_end.)
 	 */
-	if (fcache->func_state)
+	if (fcache->planning_stmt_number)
+	{
+		errcontext("SQL function \"%s\" statement %d",
+				   fcache->fname, fcache->planning_stmt_number);
+	}
+	else if (fcache->func_state)
 	{
 		execution_state *es;
 		int			query_num;
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 55db8f53705..8ecba5a741d 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -74,18 +74,6 @@
 #include "utils/syscache.h"
 
 
-/*
- * We must skip "overhead" operations that involve database access when the
- * cached plan's subject statement is a transaction control command or one
- * that requires a snapshot not to be set yet (such as SET or LOCK).  More
- * generally, statements that do not require parse analysis/rewrite/plan
- * activity never need to be revalidated, so we can treat them all like that.
- * For the convenience of postgres.c, treat empty statements that way too.
- */
-#define StmtPlanRequiresRevalidation(plansource)  \
-	((plansource)->raw_parse_tree != NULL && \
-	 stmt_requires_parse_analysis((plansource)->raw_parse_tree))
-
 /*
  * This is the head of the backend's list of "saved" CachedPlanSources (i.e.,
  * those that are in long-lived storage and are examined for sinval events).
@@ -130,6 +118,50 @@ static const ResourceOwnerDesc planref_resowner_desc =
 	.DebugPrint = NULL			/* the default message is fine */
 };
 
+/*
+ * We must skip "overhead" operations that involve database access when the
+ * cached plan's subject statement is a transaction control command or one
+ * that requires a snapshot not to be set yet (such as SET or LOCK).  More
+ * generally, statements that do not require parse analysis/rewrite/plan
+ * activity never need to be revalidated, so we can treat them all like that.
+ * For the convenience of postgres.c, treat empty statements that way too.
+ * If plansource doesn't have raw_parse_tree, look at query_list to find out
+ * if there are any non-utility statements. Also some utility statements
+ * can require revalidation. The logic is the same as in stmt_requires_parse_analysis().
+ */
+static inline bool
+StmtPlanRequiresRevalidation(CachedPlanSource *plansource)
+{
+	if (plansource->raw_parse_tree != NULL)
+	{
+		return stmt_requires_parse_analysis(plansource->raw_parse_tree);
+	}
+	else
+	{
+		ListCell   *lc;
+
+		foreach(lc, plansource->query_list)
+		{
+			Query	   *query = castNode(Query, lfirst(lc));
+
+			if (query->commandType != CMD_UTILITY)
+				return true;
+
+			/* should match stmt_requires_parse_analysis() */
+			switch (nodeTag(query->utilityStmt))
+			{
+				case T_DeclareCursorStmt:
+				case T_ExplainStmt:
+				case T_CreateTableAsStmt:
+				case T_CallStmt:
+					return true;
+				default:
+			}
+		}
+	}
+	return false;
+}
+
 /* Convenience wrappers over ResourceOwnerRemember/Forget */
 static inline void
 ResourceOwnerRememberPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
@@ -184,7 +216,8 @@ InitPlanCache(void)
  * Once constructed, the cached plan can be made longer-lived, if needed,
  * by calling SaveCachedPlan.
  *
- * raw_parse_tree: output of raw_parser(), or NULL if empty query
+ * raw_parse_tree: output of raw_parser(), or NULL if empty query, can
+ *   also be NULL if plansource->analyzed_parse_tree is set instead
  * query_string: original query text
  * commandTag: command tag for query, or UNKNOWN if empty query
  */
@@ -254,6 +287,27 @@ CreateCachedPlan(RawStmt *raw_parse_tree,
 	return plansource;
 }
 
+/*
+ * CreateCachedPlanForQuery: initially create a plan cache entry
+ * for parsed and analyzed query. Unlike CreateCachedPlan(),
+ * it preserves analyzed parse tree, not raw parse tree.
+ */
+CachedPlanSource *
+CreateCachedPlanForQuery(Query *analyzed_parse_tree,
+						 const char *query_string,
+						 CommandTag commandTag)
+{
+	CachedPlanSource *plansource;
+	MemoryContext oldcxt;
+
+	plansource = CreateCachedPlan(NULL, query_string, commandTag);
+	oldcxt = MemoryContextSwitchTo(plansource->context);
+	plansource->analyzed_parse_tree = copyObject(analyzed_parse_tree);
+	MemoryContextSwitchTo(oldcxt);
+
+	return plansource;
+}
+
 /*
  * CreateOneShotCachedPlan: initially create a one-shot plan cache entry.
  *
@@ -708,7 +762,14 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	 */
 	rawtree = copyObject(plansource->raw_parse_tree);
 	if (rawtree == NULL)
-		tlist = NIL;
+	{
+		/* Working on pre-analyzed query */
+		if (plansource->analyzed_parse_tree)
+			tlist = pg_rewrite_query(plansource->analyzed_parse_tree);
+		else
+			/* Utility command */
+			tlist = NIL;
+	}
 	else if (plansource->parserSetup != NULL)
 		tlist = pg_analyze_and_rewrite_withcb(rawtree,
 											  plansource->query_string,
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d5..f5f5abed595 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -46,6 +46,7 @@
 #include "commands/vacuum.h"
 #include "common/file_utils.h"
 #include "common/scram-common.h"
+#include "executor/functions.h"
 #include "jit/jit.h"
 #include "libpq/auth.h"
 #include "libpq/libpq.h"
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 46072d311b1..1493f726649 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -25,6 +25,7 @@
 
 /* Forward declaration, to avoid including parsenodes.h here */
 struct RawStmt;
+struct Query;
 
 /* possible values for plan_cache_mode */
 typedef enum
@@ -97,6 +98,7 @@ typedef struct CachedPlanSource
 {
 	int			magic;			/* should equal CACHEDPLANSOURCE_MAGIC */
 	struct RawStmt *raw_parse_tree; /* output of raw_parser(), or NULL */
+	struct Query *analyzed_parse_tree;	/* analyzed parse tree or NULL */
 	const char *query_string;	/* source text of query */
 	CommandTag	commandTag;		/* 'nuff said */
 	Oid		   *param_types;	/* array of parameter type OIDs, or NULL */
@@ -193,6 +195,9 @@ extern void ReleaseAllPlanCacheRefsInOwner(ResourceOwner owner);
 extern CachedPlanSource *CreateCachedPlan(struct RawStmt *raw_parse_tree,
 										  const char *query_string,
 										  CommandTag commandTag);
+extern CachedPlanSource *CreateCachedPlanForQuery(struct Query *analyzed_parse_tree,
+												  const char *query_string,
+												  CommandTag commandTag);
 extern CachedPlanSource *CreateOneShotCachedPlan(struct RawStmt *raw_parse_tree,
 												 const char *query_string,
 												 CommandTag commandTag);
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index fd5654df35e..9c26ad67fdf 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -4666,6 +4666,57 @@ RESET ROLE;
 DROP FUNCTION rls_f();
 DROP VIEW rls_v;
 DROP TABLE rls_t;
+-- RLS changes invalidate cached function plans
+create table rls_t (c text);
+create table test_t (c text);
+insert into rls_t values ('a'), ('b'), ('c'), ('d');
+insert into test_t values ('a'), ('b');
+alter table rls_t enable row level security;
+grant select on rls_t to regress_rls_alice;
+grant select on test_t to regress_rls_alice;
+create policy p1 on rls_t for select to regress_rls_alice using (c = current_setting('rls_test.blah'));
+-- Function changes row_security setting and so invalidates plan
+create or replace function rls_f(text)
+ RETURNS text
+ LANGUAGE sql
+BEGIN ATOMIC
+ select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' order by c) from rls_t;
+END;
+-- Table owner bypasses RLS
+select rls_f(c) from test_t order by rls_f;
+    rls_f    
+-------------
+ aoffa,b,c,d
+ boffa,b,c,d
+(2 rows)
+
+set role regress_rls_alice;
+-- For casual user changes in row_security setting lead
+-- to error during query rewrite
+select rls_f(c) from test_t order by rls_f;
+ERROR:  query would be affected by row-level security policy for table "rls_t"
+CONTEXT:  SQL function "rls_f" statement 1
+reset role;
+set plan_cache_mode to force_generic_plan;
+-- Table owner bypasses RLS, but cached plan invalidates
+select rls_f(c) from test_t order by rls_f;
+    rls_f    
+-------------
+ aoffa,b,c,d
+ boffa,b,c,d
+(2 rows)
+
+-- For casual user changes in row_security setting lead
+-- to plan invalidation and error during query rewrite
+set role regress_rls_alice;
+select rls_f(c) from test_t order by rls_f;
+ERROR:  query would be affected by row-level security policy for table "rls_t"
+CONTEXT:  SQL function "rls_f" statement 1
+reset role;
+reset plan_cache_mode;
+reset rls_test.blah;
+drop function rls_f;
+drop table rls_t, test_t;
 --
 -- Clean up objects
 --
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 3361f6a69c9..03960d9e696 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -3880,3 +3880,38 @@ DROP TABLE ruletest_t3;
 DROP TABLE ruletest_t2;
 DROP TABLE ruletest_t1;
 DROP USER regress_rule_user1;
+-- Test that SQL functions correctly handle DO NOTHING rule
+CREATE TABLE some_data (i int, data text);
+CREATE TABLE some_data_values (i int, data text);
+CREATE FUNCTION insert_data(i int, data text)
+RETURNS INT
+AS $$
+INSERT INTO some_data VALUES ($1, $2);
+SELECT 1;
+$$ LANGUAGE SQL;
+INSERT INTO some_data_values SELECT i , 'data'|| i FROM generate_series(1, 10) i;
+CREATE RULE some_data_noinsert AS ON INSERT TO some_data DO INSTEAD NOTHING;
+SELECT insert_data(i, data) FROM some_data_values;
+ insert_data 
+-------------
+           1
+           1
+           1
+           1
+           1
+           1
+           1
+           1
+           1
+           1
+(10 rows)
+
+SELECT * FROM some_data ORDER BY i;
+ i | data 
+---+------
+(0 rows)
+
+DROP RULE some_data_noinsert ON some_data;
+DROP TABLE some_data_values;
+DROP TABLE some_data;
+DROP FUNCTION insert_data(int, text);
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index cf09f62eaba..418d8ea95fb 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -2280,6 +2280,47 @@ DROP FUNCTION rls_f();
 DROP VIEW rls_v;
 DROP TABLE rls_t;
 
+-- RLS changes invalidate cached function plans
+create table rls_t (c text);
+create table test_t (c text);
+
+insert into rls_t values ('a'), ('b'), ('c'), ('d');
+insert into test_t values ('a'), ('b');
+alter table rls_t enable row level security;
+grant select on rls_t to regress_rls_alice;
+grant select on test_t to regress_rls_alice;
+create policy p1 on rls_t for select to regress_rls_alice using (c = current_setting('rls_test.blah'));
+
+-- Function changes row_security setting and so invalidates plan
+create or replace function rls_f(text)
+ RETURNS text
+ LANGUAGE sql
+BEGIN ATOMIC
+ select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' order by c) from rls_t;
+END;
+
+-- Table owner bypasses RLS
+select rls_f(c) from test_t order by rls_f;
+set role regress_rls_alice;
+-- For casual user changes in row_security setting lead
+-- to error during query rewrite
+select rls_f(c) from test_t order by rls_f;
+reset role;
+
+set plan_cache_mode to force_generic_plan;
+-- Table owner bypasses RLS, but cached plan invalidates
+select rls_f(c) from test_t order by rls_f;
+-- For casual user changes in row_security setting lead
+-- to plan invalidation and error during query rewrite
+set role regress_rls_alice;
+select rls_f(c) from test_t order by rls_f;
+reset role;
+reset plan_cache_mode;
+reset rls_test.blah;
+
+drop function rls_f;
+drop table rls_t, test_t;
+
 --
 -- Clean up objects
 --
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index fdd3ff1d161..505449452ee 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -1432,3 +1432,27 @@ DROP TABLE ruletest_t2;
 DROP TABLE ruletest_t1;
 
 DROP USER regress_rule_user1;
+
+-- Test that SQL functions correctly handle DO NOTHING rule
+CREATE TABLE some_data (i int, data text);
+CREATE TABLE some_data_values (i int, data text);
+
+CREATE FUNCTION insert_data(i int, data text)
+RETURNS INT
+AS $$
+INSERT INTO some_data VALUES ($1, $2);
+SELECT 1;
+$$ LANGUAGE SQL;
+
+INSERT INTO some_data_values SELECT i , 'data'|| i FROM generate_series(1, 10) i;
+
+CREATE RULE some_data_noinsert AS ON INSERT TO some_data DO INSTEAD NOTHING;
+
+SELECT insert_data(i, data) FROM some_data_values;
+
+SELECT * FROM some_data ORDER BY i;
+
+DROP RULE some_data_noinsert ON some_data;
+DROP TABLE some_data_values;
+DROP TABLE some_data;
+DROP FUNCTION insert_data(int, text);
-- 
2.43.0

From 381e81e4a525296dc11c5e4cc3ffd6bbca91e8c0 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov <a.pyha...@postgrespro.ru>
Date: Wed, 29 Jan 2025 16:23:30 +0300
Subject: [PATCH 1/2] Split out SQL functions checks from
 init_execution_state()

---
 src/backend/executor/functions.c | 62 ++++++++++++++++++--------------
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 757f8068e21..12565d0a7e0 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -454,6 +454,40 @@ sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo,
 	return NULL;
 }
 
+/* Precheck command for validity in a function */
+static void
+check_planned_stmt(PlannedStmt *stmt, SQLFunctionCachePtr fcache)
+{
+
+	/*
+	 * Precheck all commands for validity in a function.  This should
+	 * generally match the restrictions spi.c applies.
+	 */
+	if (stmt->commandType == CMD_UTILITY)
+	{
+		if (IsA(stmt->utilityStmt, CopyStmt) &&
+			((CopyStmt *) stmt->utilityStmt)->filename == NULL)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("cannot COPY to/from client in an SQL function")));
+
+		if (IsA(stmt->utilityStmt, TransactionStmt))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			/* translator: %s is a SQL statement name */
+					 errmsg("%s is not allowed in an SQL function",
+							CreateCommandName(stmt->utilityStmt))));
+	}
+
+	if (fcache->readonly_func && !CommandIsReadOnly(stmt))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		/* translator: %s is a SQL statement name */
+				 errmsg("%s is not allowed in a non-volatile function",
+						CreateCommandName((Node *) stmt))));
+
+}
+
 /*
  * Set up the per-query execution_state records for a SQL function.
  *
@@ -500,32 +534,8 @@ init_execution_state(List *queryTree_list,
 									 CURSOR_OPT_PARALLEL_OK,
 									 NULL);
 
-			/*
-			 * Precheck all commands for validity in a function.  This should
-			 * generally match the restrictions spi.c applies.
-			 */
-			if (stmt->commandType == CMD_UTILITY)
-			{
-				if (IsA(stmt->utilityStmt, CopyStmt) &&
-					((CopyStmt *) stmt->utilityStmt)->filename == NULL)
-					ereport(ERROR,
-							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-							 errmsg("cannot COPY to/from client in an SQL function")));
-
-				if (IsA(stmt->utilityStmt, TransactionStmt))
-					ereport(ERROR,
-							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					/* translator: %s is a SQL statement name */
-							 errmsg("%s is not allowed in an SQL function",
-									CreateCommandName(stmt->utilityStmt))));
-			}
-
-			if (fcache->readonly_func && !CommandIsReadOnly(stmt))
-				ereport(ERROR,
-						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				/* translator: %s is a SQL statement name */
-						 errmsg("%s is not allowed in a non-volatile function",
-								CreateCommandName((Node *) stmt))));
+			/* Check that stmt is valid for SQL function */
+			check_planned_stmt(stmt, fcache);
 
 			/* OK, build the execution_state for this query */
 			newes = (execution_state *) palloc(sizeof(execution_state));
-- 
2.43.0

Re: SQLFunctionCache and generic plans

Reply via email to