Quoting "Joseph S. Myers" <jos...@codesourcery.com>:
> That diff does not appear to relate to undefined behavior. GCC considers these out-of-range conversions to yield an unspecified value, possibly raising an exception, as per Annex F, and does not take the liberty of optimizing on the basis of them being undefined when not in an IEEE mode.
Well, still, the test is wrong in possibly raising an exception there, with no provisions to ignore the exception or catch any signal raised.

For the ARCompact, in order to test the floating point emulation better, I had (they are still there in #if 0 /*DEBUG */ blocks) small wrappers for each function to evaluate it once with the hand-optimized version, and once with fp-bit.c, and abort on getting different values.

Now, fp-bit generally tries to yield some value that the programmer thought might mean something, whereas the hand-optimized version treats computations of unspecified values as irrelevant.

Consider:

GLOBAL(fixunsdfsi):
        mov.w   LOCAL(x413),r1  ! bias + 20
        mov     DBL0H,r0
        shll    DBL0H
        mov.l   LOCAL(mask),r3
        mov     #-21,r2
        shld    r2,DBL0H        ! SH4-200 will start this insn in a new cycle
        bt/s    LOCAL(ret0)
        sub     r1,DBL0H
        cmp/pl  DBL0H           ! SH4-200 will start this insn in a new cycle
        and     r3,r0
        bf/s    LOCAL(ignore_low)
        addc    r3,r0           ! uses T == 1; sets implicit 1
        mov     #11,r2
        shld    DBL0H,r0        ! SH4-200 will start this insn in a new cycle
        cmp/gt  r2,DBL0H
        add     #-32,DBL0H
        bt      LOCAL(retmax)
        shld    DBL0H,DBL0L
        rts
        or      DBL0L,r0

and:

__fixunsdfsi:
        bbit0   DBL0H,30,.Lret0or1
        lsr     r2,DBL0H,20
        bmsk_s  DBL0H,DBL0H,19
        sub_s   r2,r2,19; 0x3ff+20-0x400
        neg_s   r3,r2
        btst_s  r3,10
        bset_s  DBL0H,DBL0H,20
#ifdef __LITTLE_ENDIAN__
        mov.ne  DBL0L,DBL0H
        asl     DBL0H,DBL0H,r2
#else
        asl.eq  DBL0H,DBL0H,r2
        lsr.ne  DBL0H,DBL0H,r3
#endif
        lsr     DBL0L,DBL0L,r3
        j_s.d   [blink]
        add.eq  r0,r0,r1
.Lret0:
        j_s.d   [blink]
        mov_l   r0,0
.Lret0or1:
        add_s   DBL0H,DBL0H,0x100000
        lsr_s   DBL0H,DBL0H,30
        j_s.d   [blink]
        bmsk_l  r0,DBL0H,0

You can see that an SH4-300 can perform software floating point fixunsdfsi in ten cycles, and the SH4-400 (SH4-200 sans FPU) and ARC700 in twelve. Adding any code in order to compute nice, fluffy values for unspecified results would cause a significant performance degradation.
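
To illustrate the wrapper idea mentioned above, here is a rough C sketch. It is not the actual ARCompact debug code: the names ref_fixunsdfsi, fast_fixunsdfsi, results_agree and checked_fixunsdfsi are hypothetical, the "reference" routine only imitates the fp-bit habit of returning tidy values (0 or the maximum) for out-of-range inputs, and the "fast" routine only imitates a version that skips those checks. It assumes IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an fp-bit style routine: it picks "meaningful"
   results for inputs whose conversion is unspecified. */
uint32_t ref_fixunsdfsi(double x)
{
    if (!(x >= 0.0))              /* NaN or negative */
        return 0;
    if (x >= 4294967296.0)        /* 2^32 and above, including +Inf */
        return UINT32_MAX;
    return (uint32_t)x;           /* in range, so the cast is well defined */
}

/* Hypothetical stand-in for a hand-optimized routine: no sign, NaN or
   overflow checks, so out-of-range inputs yield whatever falls out. */
uint32_t fast_fixunsdfsi(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);                 /* raw binary64 image */
    int exp = (int)((bits >> 52) & 0x7ff) - 1023;   /* unbiased exponent */
    if (exp < 0)
        return 0;                                   /* magnitude below 1 */
    uint64_t mant = (bits & ((1ULL << 52) - 1)) | (1ULL << 52);
    /* Masking the shift count keeps the C code well defined even when the
       exponent is out of range; the result is then simply not meaningful. */
    return (uint32_t)(mant >> ((52 - exp) & 63));
}

/* Returns 1 if both routines produce the same value for x. */
int results_agree(double x)
{
    return fast_fixunsdfsi(x) == ref_fixunsdfsi(x);
}

/* Debug wrapper: evaluate both ways and abort on any difference. */
uint32_t checked_fixunsdfsi(double x)
{
    if (!results_agree(x))
        abort();
    return fast_fixunsdfsi(x);
}
```

With this sketch, in-range inputs such as 3.75 go through the wrapper without trouble, while an input like -3.0 makes the two routines disagree, so a test that feeds out-of-range values to such a wrapper would abort even though neither result is wrong.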