Yeah you are right about the integer overflow.

Consider this query: select date_bin('15 minutes'::interval, timestamp
'294276-12-30 10:24:00', timestamp '4000-12-20 23:00:00 BC');

It will return 294276-12-30 10:31:49.551616 when it should be 294276-12-30
10:15:00, which happens because source timestamp is close to INT64_MAX and
origin timestamp is negative, causing an integer overflow.
So, the subsequent calculations are wrong.

I fixed the integer overflow and original bug and added test cases in my 3
patches in this reply below:

v2-0001:
- Fixed both the original bug discussed and the integer overflow issues
(and used your suggestion to store the modulo result).
- Any timestamp used will output the correct date_bin() result.
- I have used INT64 -> UINT64 mapping in order to ensure no integer
overflows are possible.
- The only additional cost is 3 subtractions/addition to do the INT64 ->
UINT64 mapping for timestamp, origin and final result.
- It is probably possible to tackle the integer overflow issue without
casting but, from my attempts, it seemed much more convoluted and complex.
- Implemented the fix for both timestamp_bin() and timestamptz_bin().

v2-0002:
- Added multiple test cases in timestamp.sql for integer overflow by
testing with timestamps around INT64_MIN and INT64_MAX.
- Added test case for the original bug where source timestamp is a valid
binned date already and does not require the additional stride interval
subtraction.

v2-0003:
- Exactly the same as v2-0002 but for timestamptz.sql


Also, I would like to create a new patch on the 2024-03 commitfest, but
since I just created my account yesterday I get this error:
"The site you are trying to log in to (commitfest.postgresql.org) requires
a cool-off period between account creation and logging in. You have not
passed the cool off period yet."

How long is the cool off period, so that I can create a new patch in the
commitfest before submissions close after tomorrow.
Alternatively, is it possible for someone to open a new patch on my behalf
linking this email thread, so it can be added to the 2024-03 commitfest?


Best regards,
Moaaz Assali


On Tue, Feb 27, 2024 at 11:13 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> I wrote:
> > Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
> > attempt to account for this that is probably redundant given the
> > additional condition.  Also, can we avoid computing tm_diff %
> > stride_usecs twice?  Maybe the compiler is smart enough to remove the
> > common subexpression, but it's a mighty expensive computation if not.
>
> I think we could do it like this:
>
>     tm_diff = timestamp - origin;
>     tm_modulo = tm_diff % stride_usecs;
>     tm_delta = tm_diff - tm_modulo;
>     /* We want to round towards -infinity when tm_diff is negative */
>     if (tm_modulo < 0)
>         tm_delta -= stride_usecs;
>
> Excluding tm_modulo == 0 from the correction is what fixes the
> problem.
>
> > I'm also itching a bit over whether there are integer-overflow
> > hazards here.  Maybe the range of timestamp is constrained enough
> > that there aren't, but I didn't look hard.
>
> Hmm, it's not.  For instance this triggers the overflow check in
> timestamp_mi:
>
> regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
> ERROR:  interval out of range
> regression=# \errverbose
> ERROR:  22008: interval out of range
> LOCATION:  timestamp_mi, timestamp.c:2832
>
> So we ought to guard the subtraction that produces tm_diff similarly.
> I suspect it's also possible to overflow int64 while computing
> stride_usecs.
>
>                         regards, tom lane
>

Attachment: v2-0002-Regression-test-in-timestamp.sql-for-date_bin.patch
Description: Binary data

Attachment: v2-0001-Fix-for-edge-cases-and-integer-overflow-in-date_bin.patch
Description: Binary data

Attachment: v2-0003-Regression-test-in-timestamptz.sql-for-date_bin.patch
Description: Binary data

Reply via email to