Re: [HACKERS] [RFC] overflow checks optimized away

Michael Paquier Fri, 16 Oct 2015 07:30:55 -0700

On Fri, Oct 9, 2015 at 10:52 PM, Bruce Momjian <br...@momjian.us> wrote:
>
> Any news on this item from 2013, worked on again 2014?
>
> ---------------------------------------------------------------------------
>
> On Wed, Aug  6, 2014 at 12:55:59PM -0400, Bruce Momjian wrote:
>> On Fri, Nov 29, 2013 at 02:04:10AM +0000, Greg Stark wrote:
>> > Attached is what I have so far. I have to say I'm starting to come
>> > around to Tom's point of view. This is a lot of hassle for not much
>> > gain. i've noticed a number of other overflow checks that the llvm
>> > optimizer is not picking up on so even after this patch it's not clear
>> > that all the signed overflow checks that depend on -fwrapv will be
>> > gone.
>> >
>> > This patch still isn't quite finished though.
>> >
>> > a) It triggers a spurious gcc warning about overflows on constant
>> > expressions. These value of these expressions aren't actually being
>> > used as they're used in the other branch of the overflow test. I think
>> > I see how to work around it for gcc using __builtin_choose_expr but it
>> > might be pretty grotty.
>> >
>> > b) I'm concerned these checks depend on INT_MIN/MAX and SHRT_MIN/MAX
>> > which may not be exactly the right length. I'm kind of confused why
>> > c.h assumes long is 32 bits and short is 16 bits though so I don't
>> > think I'm making it any worse. It may be better for us to just define
>> > our own limits since we know exactly how large we expect these data
>> > types to be.
>> >
>> > c) I want to add regression tests that will ensure that the overflow
>> > checks are all working. So far I haven't been able to catch any being
>> > optimized away even with -fno-wrapv and -fstrict-overflow. I think I
>> > just didn't build with the right options though and it should be
>> > possible.
>> >
>> > The goal here imho is to allow building with -fno-wrapv
>> > -fstrict-overflow safely. Whether we actually do or not is another
>> > question but it means we would be able to use new compilers like clang
>> > without worrying about finding the equivalent of -fwrapv for them.
>>
>> Where are we on this?


Well, I have played a bit with this patch and rebased it as attached.
One major change is the use of the variables PG_INT* that have been
added in 62e2a8d. Some places were not updated with those new checks,
in majority a couple of routines in int.c (I haven't finished
monitoring the whole code though). Also, I haven't played yet with my
compilers to optimize away some of the checks and break them, but I'll
give it a try with clang and gcc. For now, I guess that this patch is
a good thing to begin with though, I have checked that code compiles
and regression tests pass.
Regards,
-- 
Michael

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 29f058c..9b4b890 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -3185,7 +3185,8 @@ ExecEvalArray(ArrayExprState *astate, ExprContext *econtext,
 				/* Get sub-array details from first member */
 				elem_ndims = this_ndims;
 				ndims = elem_ndims + 1;
-				if (ndims <= 0 || ndims > MAXDIM)
+
+				if (PG_INT32_ADD_OVERFLOWS(elem_ndims,1) || ndims > MAXDIM)
 					ereport(ERROR,
 							(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 						  errmsg("number of array dimensions (%d) exceeds " \
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index c14ea23..a6b0d1b 100644
--- a/src/backend/utils/adt/array_userfuncs.c
+++ b/src/backend/utils/adt/array_userfuncs.c
@@ -113,8 +113,9 @@ array_append(PG_FUNCTION_ARGS)
 		ub = dimv[0] + lb[0] - 1;
 		indx = ub + 1;
 
-		/* overflow? */
-		if (indx < ub)
+		/* check for overflow in upper bound (indx+1) */
+		if (PG_INT32_ADD_OVERFLOWS(dimv[0],lb[0]) ||
+			PG_INT32_ADD_OVERFLOWS(dimv[0]+lb[0], 1))
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("integer out of range")));
@@ -168,7 +169,7 @@ array_prepend(PG_FUNCTION_ARGS)
 		lb0 = lb[0];
 
 		/* overflow? */
-		if (indx > lb[0])
+		if (PG_INT32_ADD_OVERFLOWS(lb[0], -1))
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("integer out of range")));
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index 4e927d8..cc4c570 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2763,12 +2763,12 @@ width_bucket_float8(PG_FUNCTION_ARGS)
 			result = 0;
 		else if (operand >= bound2)
 		{
-			result = count + 1;
 			/* check for overflow */
-			if (result < count)
+			if (PG_INT32_ADD_OVERFLOWS(count, 1))
 				ereport(ERROR,
 						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 						 errmsg("integer out of range")));
+			result = count + 1;
 		}
 		else
 			result = ((float8) count * (operand - bound1) / (bound2 - bound1)) + 1;
@@ -2779,12 +2779,12 @@ width_bucket_float8(PG_FUNCTION_ARGS)
 			result = 0;
 		else if (operand <= bound2)
 		{
-			result = count + 1;
 			/* check for overflow */
-			if (result < count)
+			if (PG_INT32_ADD_OVERFLOWS(count, 1))
 				ereport(ERROR,
 						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 						 errmsg("integer out of range")));
+			result = count + 1;
 		}
 		else
 			result = ((float8) count * (bound1 - operand) / (bound1 - bound2)) + 1;
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 1a91b29..8790169 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -641,12 +641,8 @@ int4pl(PG_FUNCTION_ARGS)
 
 	result = arg1 + arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT32_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
@@ -662,12 +658,8 @@ int4mi(PG_FUNCTION_ARGS)
 
 	result = arg1 - arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of the same sign then their
-	 * difference cannot overflow.  If they are of different signs then the
-	 * result should be of the same sign as the first input.
-	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT32_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
@@ -683,22 +675,8 @@ int4mul(PG_FUNCTION_ARGS)
 
 	result = arg1 * arg2;
 
-	/*
-	 * Overflow check.  We basically check to see if result / arg2 gives arg1
-	 * again.  There are two cases where this fails: arg2 = 0 (which cannot
-	 * overflow) and arg1 = INT_MIN, arg2 = -1 (where the division itself will
-	 * overflow and thus incorrectly match).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int16
-	 * range; if so, no overflow is possible.
-	 */
-	if (!(arg1 >= (int32) SHRT_MIN && arg1 <= (int32) SHRT_MAX &&
-		  arg2 >= (int32) SHRT_MIN && arg2 <= (int32) SHRT_MAX) &&
-		arg2 != 0 &&
-		((arg2 == -1 && arg1 < 0 && result < 0) ||
-		 result / arg2 != arg1))
+	/* Overflow check */
+	if (PG_INT32_MUL_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
@@ -729,12 +707,12 @@ int4div(PG_FUNCTION_ARGS)
 	 */
 	if (arg2 == -1)
 	{
-		result = -arg1;
 		/* overflow check (needed for INT_MIN) */
-		if (arg1 != 0 && SAMESIGN(result, arg1))
+		if (arg1 == PG_INT32_MIN)
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("integer out of range")));
+		result = -arg1;
 		PG_RETURN_INT32(result);
 	}
 
@@ -749,31 +727,25 @@ Datum
 int4inc(PG_FUNCTION_ARGS)
 {
 	int32		arg = PG_GETARG_INT32(0);
-	int32		result;
 
-	result = arg + 1;
-	/* Overflow check */
-	if (arg > 0 && result < 0)
+	if (PG_INT32_ADD_OVERFLOWS(arg, 1))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
 
-	PG_RETURN_INT32(result);
+	PG_RETURN_INT32(arg + 1);
 }
 
 Datum
 int2um(PG_FUNCTION_ARGS)
 {
 	int16		arg = PG_GETARG_INT16(0);
-	int16		result;
 
-	result = -arg;
-	/* overflow check (needed for SHRT_MIN) */
-	if (arg != 0 && SAMESIGN(result, arg))
+	if (arg == SHRT_MIN)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("smallint out of range")));
-	PG_RETURN_INT16(result);
+	PG_RETURN_INT16(-arg);
 }
 
 Datum
@@ -1151,12 +1123,12 @@ int4abs(PG_FUNCTION_ARGS)
 	int32		arg1 = PG_GETARG_INT32(0);
 	int32		result;
 
-	result = (arg1 < 0) ? -arg1 : arg1;
 	/* overflow check (needed for INT_MIN) */
-	if (result < 0)
+	if (arg1 == PG_INT32_MIN)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
+	result = (arg1 < 0) ? -arg1 : arg1;
 	PG_RETURN_INT32(result);
 }
 
@@ -1166,12 +1138,13 @@ int2abs(PG_FUNCTION_ARGS)
 	int16		arg1 = PG_GETARG_INT16(0);
 	int16		result;
 
-	result = (arg1 < 0) ? -arg1 : arg1;
 	/* overflow check (needed for SHRT_MIN) */
-	if (result < 0)
+	if (arg1 == PG_INT16_MIN)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("smallint out of range")));
+	result = (arg1 < 0) ? -arg1 : arg1;
+
 	PG_RETURN_INT16(result);
 }
 
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 63a4fbb..8a0a05a 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -102,19 +102,19 @@ scanint8(const char *str, bool errorOK, int64 *result)
 	/* process digits */
 	while (*ptr && isdigit((unsigned char) *ptr))
 	{
-		int64		newtmp = tmp * 10 + (*ptr++ - '0');
-
-		if ((newtmp / 10) != tmp)		/* overflow? */
+		if (PG_INT64_MUL_OVERFLOWS(tmp, 10) ||
+			PG_INT64_ADD_OVERFLOWS(tmp * 10, *ptr - '0'))
 		{
 			if (errorOK)
 				return false;
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-					   errmsg("value \"%s\" is out of range for type bigint",
-							  str)));
+						 errmsg("value \"%s\" is out of range for type bigint",
+								str)));
 		}
-		tmp = newtmp;
+
+		tmp = tmp * 10 + (*ptr++ - '0');
 	}
 
 gotdigits:
@@ -490,15 +490,12 @@ Datum
 int8um(PG_FUNCTION_ARGS)
 {
 	int64		arg = PG_GETARG_INT64(0);
-	int64		result;
 
-	result = -arg;
-	/* overflow check (needed for INT64_MIN) */
-	if (arg != 0 && SAMESIGN(result, arg))
+	if (arg == PG_INT64_MIN)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
-	PG_RETURN_INT64(result);
+	PG_RETURN_INT64(-arg);
 }
 
 Datum
@@ -516,17 +513,13 @@ int8pl(PG_FUNCTION_ARGS)
 	int64		arg2 = PG_GETARG_INT64(1);
 	int64		result;
 
-	result = arg1 + arg2;
-
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = arg1 + arg2;
 	PG_RETURN_INT64(result);
 }
 
@@ -537,17 +530,13 @@ int8mi(PG_FUNCTION_ARGS)
 	int64		arg2 = PG_GETARG_INT64(1);
 	int64		result;
 
-	result = arg1 - arg2;
-
-	/*
-	 * Overflow check.  If the inputs are of the same sign then their
-	 * difference cannot overflow.  If they are of different signs then the
-	 * result should be of the same sign as the first input.
-	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = arg1 - arg2;
 	PG_RETURN_INT64(result);
 }
 
@@ -560,18 +549,8 @@ int8mul(PG_FUNCTION_ARGS)
 
 	result = arg1 * arg2;
 
-	/*
-	 * Overflow check.  We basically check to see if result / arg2 gives arg1
-	 * again.  There are two cases where this fails: arg2 = 0 (which cannot
-	 * overflow) and arg1 = INT64_MIN, arg2 = -1 (where the division itself
-	 * will overflow and thus incorrectly match).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int32
-	 * range; if so, no overflow is possible.
-	 */
-	if (arg1 != (int64) ((int32) arg1) || arg2 != (int64) ((int32) arg2))
+	/* Overflow check */
+	if (PG_INT64_MUL_OVERFLOWS(arg1, arg2))
 	{
 		if (arg2 != 0 &&
 			((arg2 == -1 && arg1 < 0 && result < 0) ||
@@ -607,12 +586,12 @@ int8div(PG_FUNCTION_ARGS)
 	 */
 	if (arg2 == -1)
 	{
-		result = -arg1;
-		/* overflow check (needed for INT64_MIN) */
-		if (arg1 != 0 && SAMESIGN(result, arg1))
+		if (arg1 == PG_INT64_MIN)
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("bigint out of range")));
+
+		result = -arg1;
 		PG_RETURN_INT64(result);
 	}
 
@@ -632,12 +611,12 @@ int8abs(PG_FUNCTION_ARGS)
 	int64		arg1 = PG_GETARG_INT64(0);
 	int64		result;
 
-	result = (arg1 < 0) ? -arg1 : arg1;
-	/* overflow check (needed for INT64_MIN) */
-	if (result < 0)
+	if (arg1 == PG_INT64_MIN)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = (arg1 < 0) ? -arg1 : arg1;
 	PG_RETURN_INT64(result);
 }
 
@@ -687,15 +666,13 @@ int8inc(PG_FUNCTION_ARGS)
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		int64	   *arg = (int64 *) PG_GETARG_POINTER(0);
-		int64		result;
+		int64		result = *arg;
 
-		result = *arg + 1;
-		/* Overflow check */
-		if (result < 0 && *arg > 0)
+		if (PG_INT64_ADD_OVERFLOWS(result, 1))
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("bigint out of range")));
-
+		result += 1;
 		*arg = result;
 		PG_RETURN_POINTER(arg);
 	}
@@ -704,16 +681,12 @@ int8inc(PG_FUNCTION_ARGS)
 	{
 		/* Not called as an aggregate, so just do it the dumb way */
 		int64		arg = PG_GETARG_INT64(0);
-		int64		result;
 
-		result = arg + 1;
-		/* Overflow check */
-		if (result < 0 && arg > 0)
+		if (PG_INT64_ADD_OVERFLOWS(arg, 1))
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("bigint out of range")));
-
-		PG_RETURN_INT64(result);
+		PG_RETURN_INT64(arg + 1);
 	}
 }
 
@@ -823,12 +796,8 @@ int84pl(PG_FUNCTION_ARGS)
 
 	result = arg1 + arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -844,12 +813,8 @@ int84mi(PG_FUNCTION_ARGS)
 
 	result = arg1 - arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of the same sign then their
-	 * difference cannot overflow.  If they are of different signs then the
-	 * result should be of the same sign as the first input.
-	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -865,18 +830,8 @@ int84mul(PG_FUNCTION_ARGS)
 
 	result = arg1 * arg2;
 
-	/*
-	 * Overflow check.  We basically check to see if result / arg1 gives arg2
-	 * again.  There is one case where this fails: arg1 = 0 (which cannot
-	 * overflow).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int32
-	 * range; if so, no overflow is possible.
-	 */
-	if (arg1 != (int64) ((int32) arg1) &&
-		result / arg1 != arg2)
+	/* overflow check */
+	if (PG_INT64_MUL_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -907,13 +862,11 @@ int84div(PG_FUNCTION_ARGS)
 	 */
 	if (arg2 == -1)
 	{
-		result = -arg1;
-		/* overflow check (needed for INT64_MIN) */
-		if (arg1 != 0 && SAMESIGN(result, arg1))
+		if (arg1 == PG_INT64_MIN)
 			ereport(ERROR,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("bigint out of range")));
-		PG_RETURN_INT64(result);
+		PG_RETURN_INT64(-arg1);
 	}
 
 	/* No overflow is possible */
@@ -932,12 +885,8 @@ int48pl(PG_FUNCTION_ARGS)
 
 	result = arg1 + arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -953,12 +902,8 @@ int48mi(PG_FUNCTION_ARGS)
 
 	result = arg1 - arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of the same sign then their
-	 * difference cannot overflow.  If they are of different signs then the
-	 * result should be of the same sign as the first input.
-	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -974,18 +919,8 @@ int48mul(PG_FUNCTION_ARGS)
 
 	result = arg1 * arg2;
 
-	/*
-	 * Overflow check.  We basically check to see if result / arg2 gives arg1
-	 * again.  There is one case where this fails: arg2 = 0 (which cannot
-	 * overflow).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int32
-	 * range; if so, no overflow is possible.
-	 */
-	if (arg2 != (int64) ((int32) arg2) &&
-		result / arg2 != arg1)
+	/* overflow check */
+	if (PG_INT64_MUL_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -1018,17 +953,13 @@ int82pl(PG_FUNCTION_ARGS)
 	int16		arg2 = PG_GETARG_INT16(1);
 	int64		result;
 
-	result = arg1 + arg2;
-
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = arg1 + arg2;
 	PG_RETURN_INT64(result);
 }
 
@@ -1039,17 +970,13 @@ int82mi(PG_FUNCTION_ARGS)
 	int16		arg2 = PG_GETARG_INT16(1);
 	int64		result;
 
-	result = arg1 - arg2;
-
-	/*
-	 * Overflow check.  If the inputs are of the same sign then their
-	 * difference cannot overflow.  If they are of different signs then the
-	 * result should be of the same sign as the first input.
-	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = arg1 - arg2;
 	PG_RETURN_INT64(result);
 }
 
@@ -1060,23 +987,13 @@ int82mul(PG_FUNCTION_ARGS)
 	int16		arg2 = PG_GETARG_INT16(1);
 	int64		result;
 
-	result = arg1 * arg2;
-
-	/*
-	 * Overflow check.  We basically check to see if result / arg1 gives arg2
-	 * again.  There is one case where this fails: arg1 = 0 (which cannot
-	 * overflow).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int32
-	 * range; if so, no overflow is possible.
-	 */
-	if (arg1 != (int64) ((int32) arg1) &&
-		result / arg1 != arg2)
+	/* Overflow check */
+	if (PG_INT64_MUL_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
+
+	result = arg1 * arg2;
 	PG_RETURN_INT64(result);
 }
 
@@ -1129,12 +1046,8 @@ int28pl(PG_FUNCTION_ARGS)
 
 	result = arg1 + arg2;
 
-	/*
-	 * Overflow check.  If the inputs are of different signs then their sum
-	 * cannot overflow.  If the inputs are of the same sign, their sum had
-	 * better be that sign too.
-	 */
-	if (SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	/* Overflow check */
+	if (PG_INT64_ADD_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -1155,7 +1068,7 @@ int28mi(PG_FUNCTION_ARGS)
 	 * difference cannot overflow.  If they are of different signs then the
 	 * result should be of the same sign as the first input.
 	 */
-	if (!SAMESIGN(arg1, arg2) && !SAMESIGN(result, arg1))
+	if (PG_INT64_SUB_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
@@ -1171,18 +1084,8 @@ int28mul(PG_FUNCTION_ARGS)
 
 	result = arg1 * arg2;
 
-	/*
-	 * Overflow check.  We basically check to see if result / arg2 gives arg1
-	 * again.  There is one case where this fails: arg2 = 0 (which cannot
-	 * overflow).
-	 *
-	 * Since the division is likely much more expensive than the actual
-	 * multiplication, we'd like to skip it where possible.  The best bang for
-	 * the buck seems to be to check whether both inputs are in the int32
-	 * range; if so, no overflow is possible.
-	 */
-	if (arg2 != (int64) ((int32) arg2) &&
-		result / arg2 != arg1)
+	/* Overflow check */
+	if (PG_INT64_MUL_OVERFLOWS(arg1, arg2))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("bigint out of range")));
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 1667d80..df866b4 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -5197,8 +5197,7 @@ numericvar_to_int64(NumericVar *var, int64 *result)
 	int			ndigits;
 	int			weight;
 	int			i;
-	int64		val,
-				oldval;
+	int64		val;
 	bool		neg;
 	NumericVar	rounded;
 
@@ -5224,37 +5223,34 @@ numericvar_to_int64(NumericVar *var, int64 *result)
 	weight = rounded.weight;
 	Assert(weight >= 0 && ndigits <= weight + 1);
 
-	/* Construct the result */
+	/*
+	 * Construct the result by accumulating the absolute value in "val" as
+	 * a negative value to avoid overflow with PG_INT64_MIN.
+	 */
 	digits = rounded.digits;
 	neg = (rounded.sign == NUMERIC_NEG);
-	val = digits[0];
+	val = -digits[0];
 	for (i = 1; i <= weight; i++)
 	{
-		oldval = val;
-		val *= NBASE;
-		if (i < ndigits)
-			val += digits[i];
-
-		/*
-		 * The overflow check is a bit tricky because we want to accept
-		 * INT64_MIN, which will overflow the positive accumulator.  We can
-		 * detect this case easily though because INT64_MIN is the only
-		 * nonzero value for which -val == val (on a two's complement machine,
-		 * anyway).
-		 */
-		if ((val / NBASE) != oldval)	/* possible overflow? */
-		{
-			if (!neg || (-val) != val || val == 0 || oldval < 0)
+		NumericDigit digit = (i < ndigits) ? digits[i] : 0;
+		if (PG_INT64_MUL_OVERFLOWS(val, NBASE) ||
+			PG_INT64_SUB_OVERFLOWS(val * NBASE, digit))
 			{
 				free_var(&rounded);
 				return false;
 			}
-		}
+		val = val * NBASE - digit;
 	}
 
 	free_var(&rounded);
 
-	*result = neg ? -val : val;
+	if (!neg && val == PG_INT64_MIN)
+		/* overflows signed int64 */
+		return false;
+	else if (!neg)
+		*result = -val;
+	else
+		*result = val;
 	return true;
 }
 
diff --git a/src/backend/utils/adt/oracle_compat.c b/src/backend/utils/adt/oracle_compat.c
index 8e896eb..afdbd44 100644
--- a/src/backend/utils/adt/oracle_compat.c
+++ b/src/backend/utils/adt/oracle_compat.c
@@ -175,14 +175,14 @@ lpad(PG_FUNCTION_ARGS)
 	if (s2len <= 0)
 		len = s1len;			/* nothing to pad with, so don't pad */
 
-	bytelen = pg_database_encoding_max_length() * len;
-
 	/* check for integer overflow */
-	if (len != 0 && bytelen / pg_database_encoding_max_length() != len)
+	if (len > PG_INT32_MAX / pg_database_encoding_max_length() - VARHDRSZ)
 		ereport(ERROR,
 				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 				 errmsg("requested length too large")));
 
+	bytelen = pg_database_encoding_max_length() * len;
+
 	ret = (text *) palloc(VARHDRSZ + bytelen);
 
 	m = len - s1len;
@@ -1041,24 +1041,25 @@ repeat(PG_FUNCTION_ARGS)
 	char	   *cp,
 			   *sp;
 
+	slen = VARSIZE_ANY_EXHDR(string);
+
 	if (count < 0)
 		count = 0;
 
-	slen = VARSIZE_ANY_EXHDR(string);
-	tlen = VARHDRSZ + (count * slen);
-
-	/* Check for integer overflow */
-	if (slen != 0 && count != 0)
+	else if (slen != 0 &&
+			 count > (PG_INT32_MAX - VARHDRSZ) / slen)
 	{
-		int			check = count * slen;
-		int			check2 = check + VARHDRSZ;
-
-		if ((check / slen) != count || check2 <= check)
-			ereport(ERROR,
-					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-					 errmsg("requested length too large")));
+		/*
+		 * The palloc will actually fail at a lower value but we must protect
+		 * against signed integer overflow separately
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+				 errmsg("requested length too large")));
 	}
 
+	tlen = VARHDRSZ + (count * slen);
+
 	result = (text *) palloc(tlen);
 
 	SET_VARSIZE(result, tlen);
diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 77b05c8..2583ab4 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -1054,16 +1054,12 @@ bitsubstring(VarBit *arg, int32 s, int32 l, bool length_not_specified)
 	}
 	else
 	{
-		e = s + l;
-
-		/*
-		 * A negative value for L is the only way for the end position to be
-		 * before the start. SQL99 says to throw an error.
-		 */
-		if (e < s)
+		/* SQL99 says to throw an error. */
+		if (l < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_SUBSTRING_ERROR),
 					 errmsg("negative substring length not allowed")));
+		e = s + l;
 		e1 = Min(e, bitlen + 1);
 	}
 	if (s1 > bitlen || e1 <= s1)
@@ -1166,12 +1162,14 @@ bit_overlay(VarBit *t1, VarBit *t2, int sp, int sl)
 		ereport(ERROR,
 				(errcode(ERRCODE_SUBSTRING_ERROR),
 				 errmsg("negative substring length not allowed")));
-	sp_pl_sl = sp + sl;
-	if (sp_pl_sl <= sl)
+
+	if (PG_INT32_ADD_OVERFLOWS(sp, sl))
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
 
+	sp_pl_sl = sp + sl;
+
 	s1 = bitsubstring(t1, 1, sp - 1, false);
 	s2 = bitsubstring(t1, sp_pl_sl, -1, true);
 	result = bit_catenate(s1, t2);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d545c34..7d8bf99 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -818,32 +818,36 @@ text_substring(Datum str, int32 start, int32 length, bool length_not_specified)
 	{
 		S1 = Max(S, 1);
 
-		if (length_not_specified)		/* special case - get length to end of
-										 * string */
+		/* special case - get length to end of string */
+		if (length_not_specified)
 			L1 = -1;
+		else if (length < 0)
+		{
+			/* SQL99 says to throw an error. */
+			ereport(ERROR,
+					(errcode(ERRCODE_SUBSTRING_ERROR),
+					 errmsg("negative substring length not allowed")));
+		}
+		else if (PG_INT32_MAX - length < start)
+		{
+			/*
+			 * overflow (but the string can't be that large so just get length
+			 * to end of string) */
+			L1 = -1;
+		}
 		else
 		{
-			/* end position */
-			int			E = S + length;
-
 			/*
-			 * A negative value for L is the only way for the end position to
-			 * be before the start. SQL99 says to throw an error.
+			 * Calculate length adjusted to actual start of string (input
+			 * start could have been negative) and note that according to
+			 * SQL99 we should return an empty string if the entire string is
+			 * left of 1.
 			 */
-			if (E < S)
-				ereport(ERROR,
-						(errcode(ERRCODE_SUBSTRING_ERROR),
-						 errmsg("negative substring length not allowed")));
 
-			/*
-			 * A zero or negative value for the end position can happen if the
-			 * start was negative or one. SQL99 says to return a zero-length
-			 * string.
-			 */
-			if (E < 1)
-				return cstring_to_text("");
+			L1 = S + length - S1;
 
-			L1 = E - S1;
+			if (L1 <= 0)
+				return cstring_to_text("");
 		}
 
 		/*
@@ -883,21 +887,29 @@ text_substring(Datum str, int32 start, int32 length, bool length_not_specified)
 		 */
 		slice_start = 0;
 
-		if (length_not_specified)		/* special case - get length to end of
-										 * string */
+		if (length_not_specified)
+		{
+			/* special case - get length to end of string */
 			slice_size = L1 = -1;
-		else
+		}
+		else if (length < 0)
+		{
+			/* SQL99 says to throw an error. */
+			ereport(ERROR,
+					(errcode(ERRCODE_SUBSTRING_ERROR),
+					 errmsg("negative substring length not allowed")));
+		}
+		else if (PG_INT32_MAX - length < start)
 		{
-			int			E = S + length;
-
 			/*
-			 * A negative value for L is the only way for the end position to
-			 * be before the start. SQL99 says to throw an error.
+			 * Overflow but the string can't be that large so just get length
+			 * to end of string.
 			 */
-			if (E < S)
-				ereport(ERROR,
-						(errcode(ERRCODE_SUBSTRING_ERROR),
-						 errmsg("negative substring length not allowed")));
+			slice_size = L1 = -1;
+		}
+		else
+		{
+			int			E = S + length;
 
 			/*
 			 * A zero or negative value for the end position can happen if the
@@ -1042,12 +1054,14 @@ text_overlay(text *t1, text *t2, int sp, int sl)
 		ereport(ERROR,
 				(errcode(ERRCODE_SUBSTRING_ERROR),
 				 errmsg("negative substring length not allowed")));
-	sp_pl_sl = sp + sl;
-	if (sp_pl_sl <= sl)
+
+	if (PG_INT32_MAX - sp < sl)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
 
+	sp_pl_sl = sp + sl;
+
 	s1 = text_substring(PointerGetDatum(t1), 1, sp - 1, false);
 	s2 = text_substring(PointerGetDatum(t1), sp_pl_sl, -1, true);
 	result = text_catenate(s1, t2);
@@ -2573,21 +2587,27 @@ bytea_substring(Datum str,
 		 */
 		L1 = -1;
 	}
+	else if (L < 0)
+	{
+		/* SQL99 says to throw an error. */
+		ereport(ERROR,
+				(errcode(ERRCODE_SUBSTRING_ERROR),
+				 errmsg("negative substring length not allowed")));
+	}
+	else if (PG_INT32_MAX - L < S)
+	{
+		/*
+		 * Overflow, but the string can't be so large so just fetch to end of
+		 * the string.
+		 */
+		L1 = -1;
+	}
 	else
 	{
 		/* end position */
 		int			E = S + L;
 
 		/*
-		 * A negative value for L is the only way for the end position to be
-		 * before the start. SQL99 says to throw an error.
-		 */
-		if (E < S)
-			ereport(ERROR,
-					(errcode(ERRCODE_SUBSTRING_ERROR),
-					 errmsg("negative substring length not allowed")));
-
-		/*
 		 * A zero or negative value for the end position can happen if the
 		 * start was negative or one. SQL99 says to return a zero-length
 		 * string.
@@ -2653,12 +2673,14 @@ bytea_overlay(bytea *t1, bytea *t2, int sp, int sl)
 		ereport(ERROR,
 				(errcode(ERRCODE_SUBSTRING_ERROR),
 				 errmsg("negative substring length not allowed")));
-	sp_pl_sl = sp + sl;
-	if (sp_pl_sl <= sl)
+
+	if (PG_INT32_MAX - sp < sl)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("integer out of range")));
 
+	sp_pl_sl = sp + sl;
+
 	s1 = bytea_substring(PointerGetDatum(t1), 1, sp - 1, false);
 	s2 = bytea_substring(PointerGetDatum(t1), sp_pl_sl, -1, true);
 	result = bytea_catenate(s1, t2);
diff --git a/src/include/c.h b/src/include/c.h
index 8163b00..bb01121 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -793,6 +793,68 @@ typedef NameData *Name;
 #define Abs(x)			((x) >= 0 ? (x) : -(x))
 
 /*
+ * Detect overflow for signed INT32 and INT64
+ *
+ * Note that this has to be done before doing the suspect arithmetic rather
+ * than afterwards by examining the signs because signed overflow is not well
+ * defined and compilers take liberties to optimize away the checks.
+ *
+ * Also note that SUB_OVERFLOWS is not just the same as doing ADD_OVERFLOWS
+ * with -b because if b = INT_MIN then that would itself cause an overflow...
+ */
+
+#define PG_INT32_ADD_OVERFLOWS(a,b) (					\
+		((a)>0 && (b)>0 && (a) > PG_INT32_MAX - (b)) ||		\
+		((a)<0 && (b)<0 && (a) < PG_INT32_MIN - (b)))
+
+#define PG_INT32_SUB_OVERFLOWS(a,b) (					\
+		((a)<0 && (b)>0 && (a) < PG_INT32_MIN + (b)) ||		\
+		((a)>0 && (b)<0 && (a) > PG_INT32_MAX + (b)))
+
+#define PG_INT64_ADD_OVERFLOWS(a,b) (					\
+		((a)>0 && (b)>0 && (a) > PG_INT64_MAX - (b)) ||	\
+		((a)<0 && (b)<0 && (a) < PG_INT64_MIN - (b)))
+
+#define PG_INT64_SUB_OVERFLOWS(a,b) (					\
+		((a)<0 && (b)>0 && (a) < PG_INT64_MIN + (b)) ||	\
+		((a)>0 && (b)<0 && (a) > PG_INT64_MAX + (b)))
+
+/*
+ * Overflow can only happen if at least one value is outside the range
+ * sqrt(min)..sqrt(max) so check that first as the division can be quite a bit
+ * more expensive than the multiplication.
+ *
+ * Multiplying by 0 or 1 can't overflow of course and checking for 0
+ * separately avoids any risk of dividing by 0.  Be careful about dividing
+ * PG_INT32_MIN by -1 also, note reversing the a and b to ensure we're always
+ * dividing it by a positive value.
+ */
+
+#define PG_INT32_MUL_OVERFLOWS(a,b) (								\
+		((a) > SHRT_MAX || (a) < SHRT_MIN  ||						\
+		 (b) > SHRT_MAX || (b) < SHRT_MIN) &&						\
+		(a) != 0 && (a) != 1 && (b) != 0 && (b) != 1 &&				\
+		(															\
+			((a) > 0 && (b) > 0 && (a) > PG_INT32_MAX / (b)) ||		\
+			((a) > 0 && (b) < 0 && (b) < PG_INT32_MIN / (a)) ||		\
+			((a) < 0 && (b) > 0 && (a) < PG_INT32_MIN / (b)) ||		\
+			((a) < 0 && (b) < 0 && (a) < PG_INT32_MAX / (b))		\
+			)														\
+		)
+
+#define PG_INT64_MUL_OVERFLOWS(a,b) (								\
+		((a) > PG_INT32_MAX || (a) < PG_INT32_MIN  ||				\
+		 (b) > PG_INT32_MAX || (b) < PG_INT32_MIN) &&				\
+		(a) != 0 && (a) != 1 && (b) != 0 && (b) != 1 &&				\
+		(															\
+			((a) > 0 && (b) > 0 && (a) > PG_INT64_MAX / (b)) ||		\
+			((a) > 0 && (b) < 0 && (b) < PG_INT64_MIN / (a)) ||		\
+			((a) < 0 && (b) > 0 && (a) < PG_INT64_MIN / (b)) ||		\
+			((a) < 0 && (b) < 0 && (a) < PG_INT64_MAX / (b))		\
+			)														\
+		)
+
+/*
  * StrNCpy
  *	Like standard library function strncpy(), except that result string
  *	is guaranteed to be null-terminated --- that is, at most N-1 bytes
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index c73f20b..77ad1a4 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -2068,13 +2068,13 @@ exec_stmt_fori(PLpgSQL_execstate *estate, PLpgSQL_stmt_fori *stmt)
 		 */
 		if (stmt->reverse)
 		{
-			if ((int32) (loop_value - step_value) > loop_value)
+			if (PG_INT32_SUB_OVERFLOWS(loop_value, step_value))
 				break;
 			loop_value -= step_value;
 		}
 		else
 		{
-			if ((int32) (loop_value + step_value) < loop_value)
+			if (PG_INT32_ADD_OVERFLOWS(loop_value, step_value))
 				break;
 			loop_value += step_value;
 		}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] overflow checks optimized away

Reply via email to