Hello Michaël,
> You are right as well that having symmetry with the signed methods is
> much better. In order to see the difference, you can just do that with
> the extension attached, after of course hijacking int.h with some
> undefs and recompiling the backend and the module:
>
>   select pg_overflow_check(10000, 10000, 2000000000, 'uint32', 'mul');

Ok.

> still it is possible to trick things with signed integer arguments.

Is it?

> If you pass -1 then you can fall back to the maximum of each of the
> 16, 32 or 64 bit types for the unsigned (see the regression tests I
> added with the module).
>
> Attached is also an updated version of the module I used to validate
> this stuff. Fabien, any thoughts?

The patch applies cleanly and compiles, and "make check" is ok
(although the changes themselves are untested).

I would put back unlikely() on the overflow tests: they are indeed
unlikely to occur, it may help some compilers, and it cannot be harmful.
It also helps the code reader to know that these paths are not expected
to be taken often.
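
As an illustration, here is a minimal sketch of what I mean, assuming
the GCC-style __builtin_expect that underlies the unlikely() macro in
c.h; the function name and layout are mine, not the patch's:

  #include <stdbool.h>
  #include <stdint.h>

  /* simplified stand-in for the unlikely() macro from c.h */
  #define unlikely(x) __builtin_expect((x) != 0, 0)

  static inline bool
  add_u32_overflow(uint32_t a, uint32_t b, uint32_t *result)
  {
      uint32_t res;

      /* the overflow branch is cold, so tell the compiler so */
      if (unlikely(__builtin_add_overflow(a, b, &res)))
          return true;
      *result = res;
      return false;
  }
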
On reflection, I'm not sure that the uint128-based overflow checks for
add_u64 and sub_u64 are useful. The "res < a" and "b > a" tricks should
suffice, just as in the u16 and u32 cases, and they may cost a little
less anyway.
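
For reference, the comparison tricks I mean look like the following;
this is a sketch with stdint names, not the patch's actual code:

  #include <stdbool.h>
  #include <stdint.h>

  /* addition: an unsigned sum wraps around, so it overflowed iff the
   * result is smaller than either operand */
  static inline bool
  add_u64_overflow(uint64_t a, uint64_t b, uint64_t *result)
  {
      uint64_t res = a + b;

      if (res < a)
          return true;
      *result = res;
      return false;
  }

  /* subtraction: it underflows iff the subtrahend exceeds the minuend */
  static inline bool
  sub_u64_overflow(uint64_t a, uint64_t b, uint64_t *result)
  {
      if (b > a)
          return true;
      *result = a - b;
      return false;
  }
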
I would suggest keeping the overflow extension as "contrib/overflow_test".

For the mul tests, I'd suggest not only trying min/max values as for
add/sub, but also "standard" multiplications that do or do not overflow.
It would also be good if "make check" could be made to work; for some
reason it currently requires "installcheck".

I could not test performance directly because the loops were optimized
out by the compiler; I added "volatile" on the input value declarations
to work around that (a sketch of the kind of loop involved appears after
the first set of figures below). On 2B iterations I got the following on
my laptop:
int16: mul = 2770 ms, add = 1830 ms, sub = 1826 ms
int32: mul = 1838 ms, add = 1835 ms, sub = 1840 ms
int64: mul = 1836 ms, add = 1834 ms, sub = 1833 ms
uint16: mul = 3670 ms, add = 1830 ms, sub = 2148 ms
uint32: mul = 2438 ms, add = 1834 ms, sub = 1831 ms
uint64: mul = 2139 ms, add = 1841 ms, sub = 1882 ms
Why int16 mul, uint* mul and uint16 sub perform badly is unclear.
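
The kind of harness these figures come from looks roughly like this; it
is a hypothetical reconstruction rather than the module's actual code,
with the 10000/10000 inputs taken from your example:

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  int
  main(void)
  {
      /* "volatile" prevents the compiler from constant-folding the
       * checks and optimizing the whole loop away */
      volatile int32_t val1 = 10000;
      volatile int32_t val2 = 10000;
      int64_t sink = 0;

      for (int64_t i = 0; i < 2000000000LL; i++)
      {
          int32_t res;

          if (!__builtin_mul_overflow(val1, val2, &res))
              sink += res;
      }
      /* print the accumulator so the work stays observable */
      printf("%lld\n", (long long) sink);
      return 0;
  }
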
With the fallback code triggered by:

  #undef HAVE__BUILTIN_OP_OVERFLOW

int16: mul = 1433 ms, add = 1424 ms, sub = 1254 ms
int32: mul = 1433 ms, add = 1425 ms, sub = 1443 ms
int64: mul = 1430 ms, add = 1429 ms, sub = 1441 ms
uint16: mul = 1445 ms, add = 1291 ms, sub = 1265 ms
uint32: mul = 1419 ms, add = 1434 ms, sub = 1493 ms
uint64: mul = 1266 ms, add = 1430 ms, sub = 1440 ms
For some unclear reason, four of the fallback tests are significantly
faster than their builtin counterparts.
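
The fallback code is essentially of the following shape, assuming the
usual compute-in-a-wider-type approach; this is a simplified sketch, not
the literal int.h code:

  #include <stdbool.h>
  #include <stdint.h>

  /* without the builtin, do the arithmetic in a wider type and
   * range-check the result */
  static inline bool
  add_s32_overflow(int32_t a, int32_t b, int32_t *result)
  {
      int64_t res = (int64_t) a + (int64_t) b;

      if (res > INT32_MAX || res < INT32_MIN)
          return true;
      *result = (int32_t) res;
      return false;
  }
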
Forcing the fallback code further down with:

  #undef HAVE_INT128

int64: mul = 1424 ms, add = 1429 ms, sub = 1440 ms
uint64: mul = 24145 ms, add = 1434 ms, sub = 1435 ms
There is no doubt that dividing 64-bit integers is a very bad idea, at
least on my architecture!
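
The culprit is the division hidden in the check: without uint128, a
64-bit multiplication overflow test has to look something like this
(again a sketch, not the literal int.h code):

  #include <stdbool.h>
  #include <stdint.h>

  static inline bool
  mul_u64_overflow(uint64_t a, uint64_t b, uint64_t *result)
  {
      /* the 64-bit division here is what dominates the cost */
      if (a != 0 && b > UINT64_MAX / a)
          return true;
      *result = a * b;
      return false;
  }
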
Note that the checks depend on the values, so actual performance may
vary with the val1 and val2 passed. I used 10000 and 10000, as in your
example.

These results are definitely depressing: the fallback code is nearly
twice as fast as the builtin overflow-detection version. For the record:
gcc 7.4.0 on Ubuntu 18.04 LTS. I'm not sure what to advise; relying on
the builtin should be the better idea…

--
Fabien.