Hello Michaël,

You are right as well that having symmetry with the signed methods is much better. To see the difference, you can run the following with the attached extension, after of course hijacking int.h with some undefs and recompiling the backend and the module:

 select pg_overflow_check(10000, 10000, 2000000000, 'uint32', 'mul');

Ok.

Still, it is possible to trick things with signed integer arguments.

Is it?

If you pass -1, you fall back to the maximum value of the 16, 32 or
64 bit unsigned type (see the regression tests I added with the
module).
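
As a rough illustration of the -1 trick described above, here is a minimal C sketch. It assumes the module simply casts its signed argument to the unsigned C type, which the thread does not confirm; the function names are hypothetical and not taken from the module.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not from the attached module): convert a signed
 * argument to an unsigned type.  In C such a conversion is defined as
 * reduction modulo 2^N, so -1 becomes the maximum value of the type.
 */
static inline uint16_t
sketch_to_u16(int64_t v)
{
    return (uint16_t) v;
}

static inline uint32_t
sketch_to_u32(int64_t v)
{
    return (uint32_t) v;
}

static inline uint64_t
sketch_to_u64(int64_t v)
{
    return (uint64_t) v;
}
```

With these, sketch_to_u16(-1) yields UINT16_MAX, and likewise for the 32 and 64 bit variants.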

Attached is also an updated version of the module I used to validate
this stuff.  Fabien, any thoughts?

The patch applies cleanly, compiles, and "make check" is ok (although the changes are untested).

I would put back unlikely() on overflow tests, as they are indeed unlikely to occur and it may help some compilers, and cannot be harmful. It also helps the code reader to know that these paths are not expected to be taken often.
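
To make the unlikely() suggestion concrete, here is a minimal sketch. The macro my_unlikely and the function name are hypothetical stand-ins, and the definition assumes a GCC-compatible compiler providing __builtin_expect; it is not meant as the patch's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for unlikely(); a plain test on other compilers. */
#if defined(__GNUC__)
#define my_unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define my_unlikely(x) ((x) != 0)
#endif

/*
 * Subtract b from a.  Returns true on overflow (result untouched),
 * false otherwise with the difference stored in *result.
 */
static inline bool
sketch_sub_u16_overflow(uint16_t a, uint16_t b, uint16_t *result)
{
    /* Hint to the compiler and the reader: this path is rarely taken. */
    if (my_unlikely(b > a))
        return true;
    *result = (uint16_t) (a - b);
    return false;
}
```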

On reflection, I'm not sure that add_u64 and sub_u64 overflow with uint128 are useful. The res < a or b > a tricks should suffice, just like for u16 and u32 cases, and it may cost a little less anyway.
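
For reference, the res < a and b > a tricks applied to 64 bit unsigned values could look like the sketch below. The function names are hypothetical and this is only an illustration of the idea, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a + b does not fit in 64 bits, without using uint128. */
static inline bool
sketch_add_u64_overflow(uint64_t a, uint64_t b, uint64_t *result)
{
    uint64_t res = a + b;   /* unsigned addition wraps modulo 2^64 */

    if (res < a)            /* a wrapped sum is smaller than either operand */
        return true;
    *result = res;
    return false;
}

/* Returns true if a - b would be negative, without using uint128. */
static inline bool
sketch_sub_u64_overflow(uint64_t a, uint64_t b, uint64_t *result)
{
    if (b > a)
        return true;
    *result = a - b;
    return false;
}
```

Both checks cost a single comparison, which is why they may be a little cheaper than going through a 128 bit intermediate.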

I would suggest keeping the overflow extension as "contrib/overflow_test". For mul tests, I'd suggest not to try only min/max values like add/sub, but also "standard" multiplications that overflow or not. It would be good if "make check" could be made to work; for some reason it requires "installcheck".

I could not test performance directly, as the loops are optimized out by the compiler. I added "volatile" on the input value declarations to work around that. With 2B iterations on my laptop I got:

 int16: mul = 2770 ms, add = 1830 ms, sub = 1826 ms
 int32: mul = 1838 ms, add = 1835 ms, sub = 1840 ms
 int64: mul = 1836 ms, add = 1834 ms, sub = 1833 ms

 uint16: mul = 3670 ms, add = 1830 ms, sub = 2148 ms
 uint32: mul = 2438 ms, add = 1834 ms, sub = 1831 ms
 uint64: mul = 2139 ms, add = 1841 ms, sub = 1882 ms

It is unclear why int16 mul, uint* mul and uint16 sub are slower than the rest.
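
For readers wanting to reproduce this kind of measurement, the "volatile" workaround could be sketched as follows. This is not the actual test module: the function name is hypothetical, the overflow check is a simple widening test on int32, and timing uses clock() from the C library.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical timing helper: run `iterations` overflow-checked int32
 * multiplications and return the elapsed CPU time in milliseconds.
 * The number of detected overflows is stored in *overflows.
 */
static double
sketch_time_mul_s32(int32_t val1, int32_t val2,
                    uint64_t iterations, uint64_t *overflows)
{
    /* volatile forces a reload on each iteration, so the loop is kept */
    volatile int32_t a = val1;
    volatile int32_t b = val2;
    uint64_t count = 0;
    clock_t start = clock();

    for (uint64_t i = 0; i < iterations; i++)
    {
        /* widen to 64 bits, then check that the product fits in int32 */
        int64_t wide = (int64_t) a * (int64_t) b;

        if (wide > INT32_MAX || wide < INT32_MIN)
            count++;
    }

    *overflows = count;
    return 1000.0 * (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Calling it with 10000 and 10000 never overflows, since the product 100000000 fits in an int32, so the timing reflects only the non-overflow path.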

With fallback code triggered with:

 #undef HAVE__BUILTIN_OP_OVERFLOW

 int16: mul = 1433 ms, add = 1424 ms, sub = 1254 ms
 int32: mul = 1433 ms, add = 1425 ms, sub = 1443 ms
 int64: mul = 1430 ms, add = 1429 ms, sub = 1441 ms

 uint16: mul = 1445 ms, add = 1291 ms, sub = 1265 ms
 uint32: mul = 1419 ms, add = 1434 ms, sub = 1493 ms
 uint64: mul = 1266 ms, add = 1430 ms, sub = 1440 ms

For some unclear reason, 4 tests (int16 sub, uint16 add, uint16 sub and uint64 mul) are significantly faster.

Forcing further down fallback code with:

  #undef HAVE_INT128

 int64: mul = 1424 ms, add = 1429 ms, sub = 1440 ms
 uint64: mul = 24145 ms, add = 1434 ms, sub = 1435 ms

There is no doubt that dividing 64-bit integers is a very bad idea, at least on my architecture!
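
To show where the division comes from, a division-based check for 64 bit unsigned multiplication without a 128 bit type could look like this. It is a sketch with a hypothetical function name, not necessarily the exact fallback in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Multiply a by b.  Returns true on overflow, false otherwise with the
 * product stored in *result.  The product wraps modulo 2^64, so it
 * overflowed exactly when dividing it back by a does not give b.
 */
static inline bool
sketch_mul_u64_overflow_div(uint64_t a, uint64_t b, uint64_t *result)
{
    uint64_t res = a * b;

    if (a != 0 && b != res / a)   /* one 64 bit division per call */
        return true;
    *result = res;
    return false;
}
```

The division is executed on every call with a non-zero first operand, including when nothing overflows, which would be consistent with the uint64 mul figure above.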

Note that the checks depend on the values, so actual performance may vary depending on the actual val1 and val2 passed. I used 10000 and 10000, as in your example.

These results are definitely depressing because the fallback code is consistently faster than the builtin overflow detection version: it takes roughly 20% less time on most tests, and about half the time or less on the int16 and uint16 multiplications. For the record: gcc 7.4.0 on Ubuntu 18.04 LTS. Not sure what to advise, relying on the builtin should be the better idea…

--
Fabien.
