Hi,

I wanted to take this opportunity to provide a brief summary of
outstanding work.

> Also there are several trap-producing cases with date types:
> SELECT to_date('100000000', 'CC');
> SELECT to_timestamp('1000000000,999', 'Y,YYY');
> SELECT make_date(-2147483648, 1, 1);

This is resolved with Matthew's patches, which I've rebased, squashed
and attached to this email. They still require a review.

----

> SET temp_buffers TO 1000000000;
>
> CREATE TEMP TABLE t(i int PRIMARY KEY);
> INSERT INTO t VALUES(1);
>
> #4  0x00007f385cdd37f3 in __GI_abort () at ./stdlib/abort.c:79
> #5  0x00005620071c4f51 in __addvsi3 ()
> #6  0x0000562007143f3c in init_htab (hashp=0x562008facb20,
nelem=610070812) at dynahash.c:720
>
> (gdb) f 6
> #6  0x0000560915207f3c in init_htab (hashp=0x560916039930,
nelem=1000000000) at dynahash.c:720
> 720             hctl->high_mask = (nbuckets << 1) - 1;
> (gdb) p nbuckets
> $1 = 1073741824

I've taken a look at this and my current proposal is to convert
`nbuckets` to 64 bit integer which would prevent the overflow. I'm
hoping to look into if this is feasible soon.

----

> CREATE FUNCTION check_foreign_key () RETURNS trigger AS .../refint.so'
LANGUAGE C;
> CREATE TABLE t (i int4 NOT NULL);
> CREATE TRIGGER check_fkey BEFORE DELETE ON t FOR EACH ROW EXECUTE
PROCEDURE
>    check_foreign_key (2147483647, 'cascade', 'i', "ft", "i");
> INSERT INTO t VALUES (1);
> DELETE FROM t;
>
> #4  0x00007f57f0bef7f3 in __GI_abort () at ./stdlib/abort.c:79
> #5  0x00007f57f1671351 in __addvsi3 () from .../src/test/regress/refint.so
> #6  0x00007f57f1670234 in check_foreign_key (fcinfo=0x7ffebf523650) at
refint.c:321
>
> (gdb) f 6
> #6  0x00007f3400ef9234 in check_foreign_key (fcinfo=0x7ffd6e16a600) at
refint.c:321
> 321             nkeys = (nargs - nrefs) / (nrefs + 1);
> (gdb) p nargs
> $1 = 3
> (gdb) p nrefs
> $2 = 2147483647

I have not looked into this yet, though I was unable to reproduce it
immediately.

    test=# CREATE FUNCTION check_foreign_key () RETURNS trigger AS
'.../refint.so' LANGUAGE C;
    ERROR:  could not access file ".../refint.so": No such file or directory

I think I just have to play around with the path.

----

>> Moreover, I tried to use "-ftrapv" on 32-bit Debian and came across
>> another failure:
>> select '9223372036854775807'::int8 * 2147483648::int8;
>> server closed the connection unexpectedly
>> ...
>> #4  0xb722226a in __GI_abort () at ./stdlib/abort.c:79
>> #5  0x004cb2e1 in __mulvdi3.cold ()
>> #6  0x00abe7ab in pg_mul_s64_overflow (a=9223372036854775807,
b=2147483648, result=0xbff1da68)
>>     at ../../../../src/include/common/int.h:264
>> #7  0x00abfbff in int8mul (fcinfo=0x14d9d04) at int8.c:496
>> #8  0x00782675 in ExecInterpExpr (state=0x14d9c4c, econtext=0x14da15c,
isnull=0xbff1dc3f) at execExprInterp.c:765
>
> Hm.  It looks like that is pointing to __builtin_mul_overflow(), which
> seems strange.

Agreed that this looks strange. The docs [0] seem to indicate that this
shouldn't happen.

> These built-in functions promote the first two operands into infinite
> precision signed type and perform addition on those promoted
> operands.
...
> As the addition is performed in infinite signed precision, these
> built-in functions have fully defined behavior for all argument
> values.
...
> The first built-in function allows arbitrary integral types for
> operands and the result type must be pointer to some integral type
> other than enumerated or boolean type

The docs for the mul functions say that they behave the same as
addition. Alexander, is it possible that you're compiling with
something other than GCC?

----

>>> #6  0x00005576cf627c68 in bms_singleton_member (a=0x5576d09f7fb0) at
bitmapset.c:691
>>> 691                             if (result >= 0 || HAS_MULTIPLE_ONES(w))
>>
>> At a glance, this appears to be caused by the RIGHTMOST_ONE macro:
>>
>>       #define RIGHTMOST_ONE(x) ((signedbitmapword) (x) &
-((signedbitmapword) (x)))
>
> I believe hand-rolling the two's complement calculation should be
> sufficient to avoid depending on -fwrapv here.  godbolt.org indicates that
> it produces roughly the same code, too.
>
> diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
> index cd05c642b0..d37a997c0e 100644
> --- a/src/backend/nodes/bitmapset.c
> +++ b/src/backend/nodes/bitmapset.c
> @@ -67,7 +67,7 @@
>   * we get zero.
>   *----------
>   */
> -#define RIGHTMOST_ONE(x) ((signedbitmapword) (x) & -((signedbitmapword)
(x)))
> +#define RIGHTMOST_ONE(x) ((bitmapword) (x) & (~((bitmapword) (x)) + 1))

This approach seems to resolve the issue locally for me, and I think it
falls out cleanly from the comment in the code above.

Thanks,
Joseph Koshakow

[0] https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
From 15e5a79f198a029fc6436c409a00a1592e23a46d Mon Sep 17 00:00:00 2001
From: Matthew Kim <38759997+friendlymatthew@users.noreply.github.com>
Date: Tue, 9 Jul 2024 18:25:10 -0400
Subject: [PATCH v24] Remove dependence on -fwrapv semantics in some date/time
 functions.

This commit updates a few date/time functions, such as
to_timestamp() and make_date(), to no longer rely on signed integer
wrapping for correctness. This is intended to move us closer towards
removing -fwrapv, which may enable some compiler optimizations.
However, there is presently no plan to actually remove that compiler
option in the near future.
---
 src/backend/utils/adt/date.c           |  5 +++-
 src/backend/utils/adt/formatting.c     | 35 +++++++++++++++++++++++---
 src/test/regress/expected/date.out     |  2 ++
 src/test/regress/expected/horology.out |  4 +++
 src/test/regress/sql/date.sql          |  1 +
 src/test/regress/sql/horology.sql      |  2 ++
 6 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 9c854e0e5c..baa677fb96 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -257,7 +257,10 @@ make_date(PG_FUNCTION_ARGS)
 	if (tm.tm_year < 0)
 	{
 		bc = true;
-		tm.tm_year = -tm.tm_year;
+		if (pg_mul_s32_overflow(tm.tm_year, -1, &tm.tm_year))
+			ereport(ERROR,
+					(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+					 errmsg("date field value out of range")));
 	}
 
 	dterr = ValidateDate(DTK_DATE_M, false, false, bc, &tm);
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 68069fcfd3..e2cb57a9e1 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -77,6 +77,7 @@
 
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/unicode_case.h"
 #include "common/unicode_category.h"
 #include "mb/pg_wchar.h"
@@ -3809,7 +3810,15 @@ DCH_from_char(FormatNode *node, const char *in, TmFromChar *out,
 						ereturn(escontext,,
 								(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
 								 errmsg("invalid input string for \"Y,YYY\"")));
-					years += (millennia * 1000);
+					if (pg_mul_s32_overflow(millennia, 1000, &millennia))
+						ereturn(escontext,,
+								(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+								 errmsg("invalid input string for \"Y,YYY\"")));
+					if (pg_add_s32_overflow(years, millennia, &years))
+						ereturn(escontext,,
+								(errcode(ERRCODE_INVALID_DATETIME_FORMAT),
+								 errmsg("invalid input string for \"Y,YYY\"")));
+
 					if (!from_char_set_int(&out->year, years, n, escontext))
 						return;
 					out->yysz = 4;
@@ -4797,11 +4806,31 @@ do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
 		if (tmfc.bc)
 			tmfc.cc = -tmfc.cc;
 		if (tmfc.cc >= 0)
+		{
 			/* +1 because 21st century started in 2001 */
-			tm->tm_year = (tmfc.cc - 1) * 100 + 1;
+			/* tm->tm_year = (tmfc.cc - 1) * 100 + 1; */
+			if (pg_mul_s32_overflow((tmfc.cc - 1), 100, &tm->tm_year) ||
+				pg_add_s32_overflow(tm->tm_year, 1, &tm->tm_year))
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+						 errmsg("date out of range: \"%s\"",
+								text_to_cstring(date_txt))));
+			}
+		}
 		else
+		{
 			/* +1 because year == 599 is 600 BC */
-			tm->tm_year = tmfc.cc * 100 + 1;
+			/* tm->tm_year = tmfc.cc * 100 + 1; */
+			if (pg_mul_s32_overflow(tmfc.cc, 100, &tm->tm_year) ||
+				pg_add_s32_overflow(tm->tm_year, 1, &tm->tm_year))
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+						 errmsg("date out of range: \"%s\"",
+								text_to_cstring(date_txt))));
+			}
+		}
 		fmask |= DTK_M(YEAR);
 	}
 
diff --git a/src/test/regress/expected/date.out b/src/test/regress/expected/date.out
index f5949f3d17..9293e045b0 100644
--- a/src/test/regress/expected/date.out
+++ b/src/test/regress/expected/date.out
@@ -1532,3 +1532,5 @@ select make_time(10, 55, 100.1);
 ERROR:  time field value out of range: 10:55:100.1
 select make_time(24, 0, 2.1);
 ERROR:  time field value out of range: 24:00:2.1
+SELECT make_date(-2147483648, 1, 1);
+ERROR:  date field value out of range
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index 241713cc51..df02d268c0 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -3448,6 +3448,8 @@ SELECT i, to_timestamp('2018-11-02 12:34:56.123456', 'YYYY-MM-DD HH24:MI:SS.FF'
 
 SELECT i, to_timestamp('2018-11-02 12:34:56.123456789', 'YYYY-MM-DD HH24:MI:SS.FF' || i) FROM generate_series(1, 6) i;
 ERROR:  date/time field value out of range: "2018-11-02 12:34:56.123456789"
+SELECT to_timestamp('1000000000,999', 'Y,YYY');
+ERROR:  invalid input string for "Y,YYY"
 SELECT to_date('1 4 1902', 'Q MM YYYY');  -- Q is ignored
   to_date   
 ------------
@@ -3778,6 +3780,8 @@ SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be
  02-01-0001 BC
 (1 row)
 
+SELECT to_date('100000000', 'CC');
+ERROR:  date out of range: "100000000"
 -- to_char's TZ format code produces zone abbrev if known
 SELECT to_char('2012-12-12 12:00'::timestamptz, 'YYYY-MM-DD HH:MI:SS TZ');
          to_char         
diff --git a/src/test/regress/sql/date.sql b/src/test/regress/sql/date.sql
index 1c58ff6966..9a4e5832b9 100644
--- a/src/test/regress/sql/date.sql
+++ b/src/test/regress/sql/date.sql
@@ -373,3 +373,4 @@ select make_date(2013, 13, 1);
 select make_date(2013, 11, -1);
 select make_time(10, 55, 100.1);
 select make_time(24, 0, 2.1);
+SELECT make_date(-2147483648, 1, 1);
diff --git a/src/test/regress/sql/horology.sql b/src/test/regress/sql/horology.sql
index e5cf12ff63..db532ee3c0 100644
--- a/src/test/regress/sql/horology.sql
+++ b/src/test/regress/sql/horology.sql
@@ -558,6 +558,7 @@ SELECT i, to_timestamp('2018-11-02 12:34:56.1234', 'YYYY-MM-DD HH24:MI:SS.FF' ||
 SELECT i, to_timestamp('2018-11-02 12:34:56.12345', 'YYYY-MM-DD HH24:MI:SS.FF' || i) FROM generate_series(1, 6) i;
 SELECT i, to_timestamp('2018-11-02 12:34:56.123456', 'YYYY-MM-DD HH24:MI:SS.FF' || i) FROM generate_series(1, 6) i;
 SELECT i, to_timestamp('2018-11-02 12:34:56.123456789', 'YYYY-MM-DD HH24:MI:SS.FF' || i) FROM generate_series(1, 6) i;
+SELECT to_timestamp('1000000000,999', 'Y,YYY');
 
 SELECT to_date('1 4 1902', 'Q MM YYYY');  -- Q is ignored
 SELECT to_date('3 4 21 01', 'W MM CC YY');
@@ -660,6 +661,7 @@ SELECT to_date('2016 365', 'YYYY DDD');  -- ok
 SELECT to_date('2016 366', 'YYYY DDD');  -- ok
 SELECT to_date('2016 367', 'YYYY DDD');
 SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be
+SELECT to_date('100000000', 'CC');
 
 -- to_char's TZ format code produces zone abbrev if known
 SELECT to_char('2012-12-12 12:00'::timestamptz, 'YYYY-MM-DD HH:MI:SS TZ');
-- 
2.34.1

Reply via email to