Yeah you are right about the integer overflow. Consider this query: select date_bin('15 minutes'::interval, timestamp '294276-12-30 10:24:00', timestamp '4000-12-20 23:00:00 BC');
It will return 294276-12-30 10:31:49.551616 when it should be 294276-12-30 10:15:00, which happens because source timestamp is close to INT64_MAX and origin timestamp is negative, causing an integer overflow. So, the subsequent calculations are wrong. I fixed the integer overflow and original bug and added test cases in my 3 patches in this reply below: v2-0001: - Fixed both the original bug discussed and the integer overflow issues (and used your suggestion to store the modulo result). - Any timestamp used will output the correct date_bin() result. - I have used INT64 -> UINT64 mapping in order to ensure no integer overflows are possible. - The only additional cost is 3 subtractions/addition to do the INT64 -> UINT64 mapping for timestamp, origin and final result. - It is probably possible to tackle the integer overflow issue without casting but, from my attempts, it seemed much more convoluted and complex. - Implemented the fix for both timestamp_bin() and timestamptz_bin(). v2-0002: - Added multiple test cases in timestamp.sql for integer overflow by testing with timestamps around INT64_MIN and INT64_MAX. - Added test case for the original bug where source timestamp is a valid binned date already and does not require the additional stride interval subtraction. v2-0003: - Exactly the same as v2-0002 but for timestamptz.sql Also, I would like to create a new patch on the 2024-03 commitfest, but since I just created my account yesterday I get this error: "The site you are trying to log in to (commitfest.postgresql.org) requires a cool-off period between account creation and logging in. You have not passed the cool off period yet." How long is the cool off period, so that I can create a new patch in the commitfest before submissions close after tomorrow. Alternatively, is it possible for someone to open a new patch on my behalf linking this email thread, so it can be added to the 2024-03 commitfest? Best regards, Moaaz Assali On Tue, Feb 27, 2024 at 11:13 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > I wrote: > > Hmm, yeah. The "stride_usecs > 1" test seems like it's a partial > > attempt to account for this that is probably redundant given the > > additional condition. Also, can we avoid computing tm_diff % > > stride_usecs twice? Maybe the compiler is smart enough to remove the > > common subexpression, but it's a mighty expensive computation if not. > > I think we could do it like this: > > tm_diff = timestamp - origin; > tm_modulo = tm_diff % stride_usecs; > tm_delta = tm_diff - tm_modulo; > /* We want to round towards -infinity when tm_diff is negative */ > if (tm_modulo < 0) > tm_delta -= stride_usecs; > > Excluding tm_modulo == 0 from the correction is what fixes the > problem. > > > I'm also itching a bit over whether there are integer-overflow > > hazards here. Maybe the range of timestamp is constrained enough > > that there aren't, but I didn't look hard. > > Hmm, it's not. For instance this triggers the overflow check in > timestamp_mi: > > regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC'; > ERROR: interval out of range > regression=# \errverbose > ERROR: 22008: interval out of range > LOCATION: timestamp_mi, timestamp.c:2832 > > So we ought to guard the subtraction that produces tm_diff similarly. > I suspect it's also possible to overflow int64 while computing > stride_usecs. > > regards, tom lane >
v2-0002-Regression-test-in-timestamp.sql-for-date_bin.patch
Description: Binary data
v2-0001-Fix-for-edge-cases-and-integer-overflow-in-date_bin.patch
Description: Binary data
v2-0003-Regression-test-in-timestamptz.sql-for-date_bin.patch
Description: Binary data