Re: Making CASE error handling less surprising

Tom Lane Fri, 24 Jul 2020 08:23:22 -0700

I wrote:
> Ah, I see what you mean.  Yeah, that throws an error today, and it
> still would with the patch I was envisioning (attached), because
> inlining does Param substitution in a different way.  I'm not
> sure that we could realistically fix the inlining case with this
> sort of approach.


Here's another example that we can't possibly fix with Param substitution
hacking, because there are no Params involved in the first place:

select f1, case when f1 = 42 then 1/i else null end
from (select f1, 0 as i from int4_tbl) ss;

Pulling up the subquery results in "1/0", so this fails today,
even though "f1 = 42" is never true.

Attached is a v3 patch that incorporates the leakproofness idea.
As shown in the new case.sql tests, this does fix both the SQL
function and subquery-pullup cases.

To keep the join regression test results the same, I marked int48()
as leakproof, which is surely safe enough.  Probably we should make
a push to mark all unconditionally-safe implicit coercions as
leakproof, but that's a separate matter.

                        regards, tom lane

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index e04b144072..48e31b16d6 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -61,6 +61,16 @@ typedef struct
 	AggClauseCosts *costs;
 } get_agg_clause_costs_context;
 
+typedef enum
+{
+	/* Ordering is important here! */
+	ece_eval_nothing,			/* be unconditionally safe */
+	ece_eval_leakproof,			/* eval leakproof immutable functions */
+	ece_eval_immutable,			/* eval immutable functions too */
+	ece_eval_stable,			/* eval stable functions too */
+	ece_eval_volatile			/* eval volatile functions too */
+} ece_eval_mode;
+
 typedef struct
 {
 	ParamListInfo boundParams;
@@ -68,6 +78,7 @@ typedef struct
 	List	   *active_fns;
 	Node	   *case_val;
 	bool		estimate;
+	ece_eval_mode eval_mode;
 } eval_const_expressions_context;
 
 typedef struct
@@ -119,6 +130,8 @@ static Node *eval_const_expressions_mutator(Node *node,
 static bool contain_non_const_walker(Node *node, void *context);
 static bool ece_function_is_safe(Oid funcid,
 								 eval_const_expressions_context *context);
+static bool ece_function_form_is_safe(Form_pg_proc func_form,
+									  eval_const_expressions_context *context);
 static Node *apply_const_relabel(Node *arg, Oid rtype,
 								 int32 rtypmod, Oid rcollid,
 								 CoercionForm rformat, int rlocation);
@@ -2264,6 +2277,7 @@ eval_const_expressions(PlannerInfo *root, Node *node)
 	context.active_fns = NIL;	/* nothing being recursively simplified */
 	context.case_val = NULL;	/* no CASE being examined */
 	context.estimate = false;	/* safe transformations only */
+	context.eval_mode = ece_eval_immutable; /* eval immutable functions */
 	return eval_const_expressions_mutator(node, &context);
 }
 
@@ -2280,8 +2294,11 @@ eval_const_expressions(PlannerInfo *root, Node *node)
  *	  available by the caller of planner(), even if the Param isn't marked
  *	  constant.  This effectively means that we plan using the first supplied
  *	  value of the Param.
- * 2. Fold stable, as well as immutable, functions to constants.
+ * 2. Fold stable, as well as immutable, functions to constants.  The risk
+ *	  that the result might change from planning time to execution time is
+ *	  worth taking, as we otherwise couldn't get an estimate at all.
  * 3. Reduce PlaceHolderVar nodes to their contained expressions.
+ * 4. Ignore domain constraints, assuming that CoerceToDomain will succeed.
  *--------------------
  */
 Node *
@@ -2295,6 +2312,7 @@ estimate_expression_value(PlannerInfo *root, Node *node)
 	context.active_fns = NIL;	/* nothing being recursively simplified */
 	context.case_val = NULL;	/* no CASE being examined */
 	context.estimate = true;	/* unsafe transformations OK */
+	context.eval_mode = ece_eval_stable;	/* eval stable functions */
 	return eval_const_expressions_mutator(node, &context);
 }
 
@@ -2961,7 +2979,22 @@ eval_const_expressions_mutator(Node *node,
 				 * opportunity to reduce constant test conditions.  For
 				 * example this allows
 				 *		CASE 0 WHEN 0 THEN 1 ELSE 1/0 END
-				 * to reduce to 1 rather than drawing a divide-by-0 error.
+				 * to reduce to 1 rather than drawing a divide-by-0 error,
+				 * since we'll throw away the ELSE without processing it.
+				 *
+				 * Even in not-all-constant cases, the user might be expecting
+				 * the CASE to prevent run-time errors, for example in
+				 *		CASE WHEN j > 0 THEN 1 ELSE 1/0 END
+				 * Since division is immutable, we'd ordinarily simplify the
+				 * division and hence draw the divide-by-zero error at plan
+				 * time, even if j is never zero at run time.  To avoid that,
+				 * reduce eval_mode to ece_eval_leakproof while processing any
+				 * test condition or result value that will not certainly be
+				 * evaluated at run-time.  We expect that leakproof immutable
+				 * functions will not throw any errors (except perhaps in
+				 * corner cases such as OOM, which we need not guard against
+				 * for this purpose).
+				 *
 				 * Note that when the test expression is constant, we don't
 				 * have to include it in the resulting CASE; for example
 				 *		CASE 0 WHEN x THEN y ELSE z END
@@ -2977,6 +3010,7 @@ eval_const_expressions_mutator(Node *node,
 				 */
 				CaseExpr   *caseexpr = (CaseExpr *) node;
 				CaseExpr   *newcase;
+				ece_eval_mode save_eval_mode = context->eval_mode;
 				Node	   *save_case_val;
 				Node	   *newarg;
 				List	   *newargs;
@@ -3027,6 +3061,16 @@ eval_const_expressions_mutator(Node *node,
 						const_true_cond = true;
 					}
 
+					/*
+					 * Unless the test condition is constant TRUE, we can't be
+					 * sure the result value will be evaluated, so become more
+					 * conservative about which functions to execute.  This
+					 * change will also apply to subsequent test conditions
+					 * and result values.
+					 */
+					if (!const_true_cond)
+						context->eval_mode = ece_eval_leakproof;
+
 					/* Simplify this alternative's result value */
 					caseresult = eval_const_expressions_mutator((Node *) oldcasewhen->result,
 																context);
@@ -3058,6 +3102,7 @@ eval_const_expressions_mutator(Node *node,
 															   context);
 
 				context->case_val = save_case_val;
+				context->eval_mode = save_eval_mode;
 
 				/*
 				 * If no non-FALSE alternatives, CASE reduces to the default
@@ -3113,6 +3158,7 @@ eval_const_expressions_mutator(Node *node,
 			{
 				CoalesceExpr *coalesceexpr = (CoalesceExpr *) node;
 				CoalesceExpr *newcoalesce;
+				ece_eval_mode save_eval_mode = context->eval_mode;
 				List	   *newargs;
 				ListCell   *arg;
 
@@ -3137,13 +3183,25 @@ eval_const_expressions_mutator(Node *node,
 						if (((Const *) e)->constisnull)
 							continue;	/* drop null constant */
 						if (newargs == NIL)
+						{
+							context->eval_mode = save_eval_mode;
 							return e;	/* first expr */
+						}
 						newargs = lappend(newargs, e);
 						break;
 					}
 					newargs = lappend(newargs, e);
+
+					/*
+					 * Arguments following a non-constant argument may or may
+					 * not get evaluated at run-time, so as in CASE, become
+					 * more conservative about which functions to execute.
+					 */
+					context->eval_mode = ece_eval_leakproof;
 				}
 
+				context->eval_mode = save_eval_mode;
+
 				/*
 				 * If all the arguments were constant null, the result is just
 				 * null
@@ -3163,13 +3221,12 @@ eval_const_expressions_mutator(Node *node,
 		case T_SQLValueFunction:
 			{
 				/*
-				 * All variants of SQLValueFunction are stable, so if we are
-				 * estimating the expression's value, we should evaluate the
-				 * current function value.  Otherwise just copy.
+				 * All variants of SQLValueFunction are stable, so evaluate if
+				 * we are evaluating stable functions.  Otherwise just copy.
 				 */
 				SQLValueFunction *svf = (SQLValueFunction *) node;
 
-				if (context->estimate)
+				if (context->eval_mode >= ece_eval_stable)
 					return (Node *) evaluate_expr((Expr *) svf,
 												  svf->type,
 												  svf->typmod,
@@ -3565,20 +3622,42 @@ contain_non_const_walker(Node *node, void *context)
 static bool
 ece_function_is_safe(Oid funcid, eval_const_expressions_context *context)
 {
-	char		provolatile = func_volatile(funcid);
+	bool		result;
+	HeapTuple	tp;
 
-	/*
-	 * Ordinarily we are only allowed to simplify immutable functions. But for
-	 * purposes of estimation, we consider it okay to simplify functions that
-	 * are merely stable; the risk that the result might change from planning
-	 * time to execution time is worth taking in preference to not being able
-	 * to estimate the value at all.
-	 */
-	if (provolatile == PROVOLATILE_IMMUTABLE)
-		return true;
-	if (context->estimate && provolatile == PROVOLATILE_STABLE)
-		return true;
-	return false;
+	tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+	if (!HeapTupleIsValid(tp))
+		elog(ERROR, "cache lookup failed for function %u", funcid);
+
+	result = ece_function_form_is_safe((Form_pg_proc) GETSTRUCT(tp), context);
+	ReleaseSysCache(tp);
+	return result;
+}
+
+/*
+ * Same, when we have the pg_proc entry directly at hand
+ */
+static bool
+ece_function_form_is_safe(Form_pg_proc func_form,
+						  eval_const_expressions_context *context)
+{
+	ece_eval_mode f_mode;
+
+	/* Must map the function properties to an ordered enum */
+	if (func_form->provolatile == PROVOLATILE_IMMUTABLE)
+	{
+		if (func_form->proleakproof)
+			f_mode = ece_eval_leakproof;
+		else
+			f_mode = ece_eval_immutable;
+	}
+	else if (func_form->provolatile == PROVOLATILE_STABLE)
+		f_mode = ece_eval_stable;
+	else
+		f_mode = ece_eval_volatile;
+
+	/* Now, does eval_mode allow evaluation of this function? */
+	return (context->eval_mode >= f_mode);
 }
 
 /*
@@ -4238,9 +4317,8 @@ recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple)
  * evaluate_function: try to pre-evaluate a function call
  *
  * We can do this if the function is strict and has any constant-null inputs
- * (just return a null constant), or if the function is immutable and has all
- * constant inputs (call it and return the result as a Const node).  In
- * estimation mode we are willing to pre-evaluate stable functions too.
+ * (just return a null constant), or if the function is safe to evaluate and
+ * has all constant inputs (call it and return the result as a Const node).
  *
  * Returns a simplified expression if successful, or NULL if cannot
  * simplify the function.
@@ -4293,7 +4371,7 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
 	 * If the function is strict and has a constant-NULL input, it will never
 	 * be called at all, so we can replace the call by a NULL constant, even
 	 * if there are other inputs that aren't constant, and even if the
-	 * function is not otherwise immutable.
+	 * function is not otherwise safe to evaluate.
 	 */
 	if (funcform->proisstrict && has_null_input)
 		return (Expr *) makeNullConst(result_type, result_typmod,
@@ -4308,17 +4386,9 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
 		return NULL;
 
 	/*
-	 * Ordinarily we are only allowed to simplify immutable functions. But for
-	 * purposes of estimation, we consider it okay to simplify functions that
-	 * are merely stable; the risk that the result might change from planning
-	 * time to execution time is worth taking in preference to not being able
-	 * to estimate the value at all.
+	 * Are we permitted to evaluate this function in the current context?
 	 */
-	if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
-		 /* okay */ ;
-	else if (context->estimate && funcform->provolatile == PROVOLATILE_STABLE)
-		 /* okay */ ;
-	else
+	if (!ece_function_form_is_safe(funcform, context))
 		return NULL;
 
 	/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4b5af32440..5a3fdcc136 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1338,7 +1338,7 @@
   proname => 'int4', prorettype => 'int4', proargtypes => 'int8',
   prosrc => 'int84' },
 { oid => '481', descr => 'convert int4 to int8',
-  proname => 'int8', prorettype => 'int8', proargtypes => 'int4',
+  proname => 'int8', proleakproof => 't', prorettype => 'int8', proargtypes => 'int4',
   prosrc => 'int48' },
 { oid => '482', descr => 'convert int8 to float8',
   proname => 'float8', prorettype => 'float8', proargtypes => 'int8',
diff --git a/src/test/regress/expected/case.out b/src/test/regress/expected/case.out
index c0c8acf035..0b23057ee4 100644
--- a/src/test/regress/expected/case.out
+++ b/src/test/regress/expected/case.out
@@ -93,10 +93,62 @@ SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END;
     1
 (1 row)
 
--- However we do not currently suppress folding of potentially
--- reachable subexpressions
 SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl;
+ case 
+------
+    0
+    0
+    0
+    0
+(4 rows)
+
+-- However, that guarantee doesn't extend into sub-selects
+SELECT CASE WHEN i > 100 THEN (select 1/0) ELSE 0 END FROM case_tbl;
 ERROR:  division by zero
+-- Inline-ing a SQL function shouldn't cause a failure
+CREATE FUNCTION case_func(double precision, integer)
+RETURNS integer
+LANGUAGE SQL AS $$
+   SELECT CASE WHEN $1 = 42 THEN 1/$2 ELSE NULL END
+$$;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT case_func(f, 0) FROM case_tbl;
+                                      QUERY PLAN                                      
+--------------------------------------------------------------------------------------
+ Seq Scan on public.case_tbl
+   Output: CASE WHEN (f = '42'::double precision) THEN (1 / 0) ELSE NULL::integer END
+(2 rows)
+
+SELECT case_func(f, 0) FROM case_tbl;
+ case_func 
+-----------
+          
+          
+          
+          
+(4 rows)
+
+DROP FUNCTION case_func(double precision, integer);
+-- Subquery flattening shouldn't result in such errors, either
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
+ Seq Scan on public.case_tbl
+   Output: case_tbl.f, CASE WHEN (case_tbl.f = '42'::double precision) THEN (1 / 0) ELSE NULL::integer END
+(2 rows)
+
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+   f   | case 
+-------+------
+  10.1 |     
+  20.2 |     
+ -30.3 |     
+       |     
+(4 rows)
+
 -- Test for cases involving untyped literals in test expression
 SELECT CASE 'a' WHEN 'a' THEN 1 ELSE 2 END;
  case 
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 27056d70d3..7b576b0208 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -634,6 +634,7 @@ int84lt(bigint,integer)
 int84gt(bigint,integer)
 int84le(bigint,integer)
 int84ge(bigint,integer)
+int8(integer)
 oidvectorne(oidvector,oidvector)
 namelt(name,name)
 namele(name,name)
diff --git a/src/test/regress/sql/case.sql b/src/test/regress/sql/case.sql
index 17436c524a..07ee4f8651 100644
--- a/src/test/regress/sql/case.sql
+++ b/src/test/regress/sql/case.sql
@@ -66,11 +66,33 @@ SELECT '7' AS "None",
 -- Constant-expression folding shouldn't evaluate unreachable subexpressions
 SELECT CASE WHEN 1=0 THEN 1/0 WHEN 1=1 THEN 1 ELSE 2/0 END;
 SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END;
-
--- However we do not currently suppress folding of potentially
--- reachable subexpressions
 SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl;
 
+-- However, that guarantee doesn't extend into sub-selects
+SELECT CASE WHEN i > 100 THEN (select 1/0) ELSE 0 END FROM case_tbl;
+
+-- Inline-ing a SQL function shouldn't cause a failure
+CREATE FUNCTION case_func(double precision, integer)
+RETURNS integer
+LANGUAGE SQL AS $$
+   SELECT CASE WHEN $1 = 42 THEN 1/$2 ELSE NULL END
+$$;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT case_func(f, 0) FROM case_tbl;
+
+SELECT case_func(f, 0) FROM case_tbl;
+
+DROP FUNCTION case_func(double precision, integer);
+
+-- Subquery flattening shouldn't result in such errors, either
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+
+SELECT f, CASE WHEN f = 42 THEN 1/j ELSE NULL END
+FROM (SELECT f, 0 AS j FROM case_tbl) ss;
+
 -- Test for cases involving untyped literals in test expression
 SELECT CASE 'a' WHEN 'a' THEN 1 ELSE 2 END;

Re: Making CASE error handling less surprising

Reply via email to