On Thu, Oct 6, 2011 at 11:31 PM, Florian Weimer <f...@deneb.enyo.de> wrote: > * Ulf Magnusson: > >> I've been experimenting with different methods for emulating the >> signed overflow of an 8-bit CPU. The method I've found that seems to >> generate the most efficient code on both ARM and x86 is >> >> bool overflow(unsigned int a, unsigned int b) { >> const unsigned int sum = (int8_t)a + (int8_t)b; >> return (int8_t)sum != sum; >> } > > There's a GCC extension which is relevant here: > > | For conversion to a type of width N, the value is reduced modulo 2^N > | to be within range of the type; no signal is raised. > > <http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation> > > Using that, you can replace the final "& 0x80" with a signed > comparison to zero, which should be give you the best possible code > (for the generic RISC). You only need to hunt down a copy of Hacker's > Delight or find the right bit twiddling by other means. 8-) >
Are you thinking of something like this? bool overflow_bit2(unsigned int a, unsigned int b) { const unsigned int ashift = a << 24; const unsigned int bshift = b << 24; const unsigned int sum = a + b; return (int)(~(a ^ b) & (a ^ sum)) < 0; } That version generates 80: 180b adds r3, r1, r0 82: 4041 eors r1, r0 84: ea83 0200 eor.w r2, r3, r0 88: ea22 0001 bic.w r0, r2, r1 8c: 0fc0 lsrs r0, r0, #31 8e: 4770 bx lr Whereas the unshifted version generates 40: 180b adds r3, r1, r0 42: 4041 eors r1, r0 44: ea83 0200 eor.w r2, r3, r0 48: ea22 0001 bic.w r0, r2, r1 4c: f3c0 10c0 ubfx r0, r0, #7, #1 50: 4770 bx lr So maybe a bit better. (I'm no ARM pro, but the compiler does seem to take advantage of the fact that it's testing the real sign bit at least.) Btw, & 0x80000000 generates the same code. /Ulf