Evgeniy Gorbanev reported to the security list that he'd found
a case where timestamp_in triggered an undefined-behavior
sanitizer warning, due to trying to store a float value larger
than INT_MAX into an integer variable.  We concluded that there's
no real security issue there, it's just that the C standard
doesn't say what the result should be.  So I'm moving this
discussion to the public list, because it's also not entirely
clear what the fix should be.

Evgeniy's example case was

SELECT 'j 05 t a6.424e5-'::timestamp;

which of course is garbage and draws an "invalid input syntax"
error as-expected, though not till after DecodeNumberField
has tried to stuff ".424e5 * 1000000" into an integer.  However,
I found that there are closely related cases that don't draw
any syntax error, such as

regression=# SELECT timestamp with time zone 'J2452271 T X03456-08';
      timestamptz       
------------------------
 2001-12-27 00:34:56-08
(1 row)

regression=# SELECT timestamp with time zone 'J2452271 T X03456.001e6-08';
      timestamptz       
------------------------
 2001-12-27 00:51:36-08
(1 row)

The fundamental problem here, IMO, is that DecodeNumberField assumes
without checking that its input contains only digits and perhaps a
decimal point.  In this example though, it's given whatever remains
after stripping the run-on timezone spec, that is "X03456.001e6".
That triggers both the overflow problem with the bogus ".001e6"
fraction, and a totally inappropriate reading of "X0" as "hour zero".

What I think we ought to do about this is reject strings containing
non-digits, as in the attached patch.  However, there's no doubt that
that'd make timestamp_in reject some inputs it used to accept, and
we've generally tried to avoid that --- commit f0d0394e8 for instance
is a recent effort in the direction of not breaking backward
compatibility, and it adopted a position that we should not break
cases that have a sensible interpretation, even if they're surprising.
The difference here is that I don't think there's a sensible
interpretation; or if there is, it's certainly not what the code
does today.

So what I propose we do about this is to apply the attached to HEAD
and leave the back branches alone.  Right now, inputs like this are
"garbage in, garbage out" but don't cause any real problem (upsetting
a sanitizer check isn't of concern for production use).  So I'm not
seeing that there's a good argument for back-patching.

We could alternatively apply some more-limited fix; Evgeniy's
original recommendation was just to reject if the result of
strtod() was outside the expected 0..1 range.  But I'm finding
it hard to believe that that's a great idea.

Comments?

                        regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 793d8a9adcc..ef841a1a454 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -702,9 +702,18 @@ ParseFraction(char *cp, double *frac)
 	}
 	else
 	{
+		/*
+		 * On the other hand, let's reject anything that's not digits after
+		 * the ".".  strtod is happy with input like ".123e9", but that'd
+		 * break callers' expectation that the result is in 0..1.  (It's quite
+		 * difficult to get here with such input, but not impossible.)
+		 */
+		if (strspn(cp + 1, "0123456789") != strlen(cp + 1))
+			return DTERR_BAD_FORMAT;
+
 		errno = 0;
 		*frac = strtod(cp, &cp);
-		/* check for parse failure */
+		/* check for parse failure (probably redundant given prior check) */
 		if (*cp != '\0' || errno != 0)
 			return DTERR_BAD_FORMAT;
 	}
@@ -2964,31 +2973,23 @@ DecodeNumberField(int len, char *str, int fmask,
 	 */
 	if ((cp = strchr(str, '.')) != NULL)
 	{
-		/*
-		 * Can we use ParseFractionalSecond here?  Not clear whether trailing
-		 * junk should be rejected ...
-		 */
-		if (cp[1] == '\0')
-		{
-			/* avoid assuming that strtod will accept "." */
-			*fsec = 0;
-		}
-		else
-		{
-			double		frac;
+		int			dterr;
 
-			errno = 0;
-			frac = strtod(cp, NULL);
-			if (errno != 0)
-				return DTERR_BAD_FORMAT;
-			*fsec = rint(frac * 1000000);
-		}
+		/* Convert the fraction into *fsec */
+		dterr = ParseFractionalSecond(cp, fsec);
+		if (dterr)
+			return dterr;
 		/* Now truncate off the fraction for further processing */
 		*cp = '\0';
 		len = strlen(str);
 	}
+
+	/* What's left had better be all digits */
+	if (strspn(str, "0123456789") != len)
+		return DTERR_BAD_FORMAT;
+
 	/* No decimal point and no complete date yet? */
-	else if ((fmask & DTK_DATE_M) != DTK_DATE_M)
+	if (cp == NULL && (fmask & DTK_DATE_M) != DTK_DATE_M)
 	{
 		if (len >= 6)
 		{
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index b90bfcd794f..5ae93d8e8a5 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -467,6 +467,15 @@ SELECT timestamp with time zone 'Y2001M12D27H04MM05S06.789-08';
 ERROR:  invalid input syntax for type timestamp with time zone: "Y2001M12D27H04MM05S06.789-08"
 LINE 1: SELECT timestamp with time zone 'Y2001M12D27H04MM05S06.789-0...
                                         ^
+-- More examples we used to accept and should not
+SELECT timestamp with time zone 'J2452271 T X03456-08';
+ERROR:  invalid input syntax for type timestamp with time zone: "J2452271 T X03456-08"
+LINE 1: SELECT timestamp with time zone 'J2452271 T X03456-08';
+                                        ^
+SELECT timestamp with time zone 'J2452271 T X03456.001e6-08';
+ERROR:  invalid input syntax for type timestamp with time zone: "J2452271 T X03456.001e6-08"
+LINE 1: SELECT timestamp with time zone 'J2452271 T X03456.001e6-08'...
+                                        ^
 -- conflicting fields should throw errors
 SELECT date '1995-08-06 epoch';
 ERROR:  invalid input syntax for type date: "1995-08-06 epoch"
diff --git a/src/test/regress/sql/horology.sql b/src/test/regress/sql/horology.sql
index 1310b432773..8978249a5dc 100644
--- a/src/test/regress/sql/horology.sql
+++ b/src/test/regress/sql/horology.sql
@@ -102,6 +102,10 @@ SELECT date 'J J 1520447';
 SELECT timestamp with time zone 'Y2001M12D27H04M05S06.789+08';
 SELECT timestamp with time zone 'Y2001M12D27H04MM05S06.789-08';
 
+-- More examples we used to accept and should not
+SELECT timestamp with time zone 'J2452271 T X03456-08';
+SELECT timestamp with time zone 'J2452271 T X03456.001e6-08';
+
 -- conflicting fields should throw errors
 SELECT date '1995-08-06 epoch';
 SELECT date '1995-08-06 infinity';

Reply via email to