http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52258
Bug #: 52258
Summary: __builtin_isgreaterequal is sometimes signaling on ARM
Classification: Unclassified
Product: gcc
Version: 4.6.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
Host: armv7l-unknown-linux-gnueabihf
Target: armv7l-unknown-linux-gnueabihf
Build: armv7l-unknown-linux-gnueabihf
__builtin_isgreaterequal is supposed to be non-signaling when one of its
inputs is a qNaN. It behaves correctly when used on its own, but when combined
with another test it sometimes raises an invalid FP exception at -O1 and higher
optimization levels.
For example:
#include <math.h>

int sel_fmax (double x, double y)
{
  return __builtin_isgreaterequal(x, y) || isnan(y);
}
At -O0, the corresponding assembly code is:
00000000 <sel_fmax>:
0: b580 push {r7, lr}
2: b084 sub sp, #16
4: af00 add r7, sp, #0
6: ed87 0b02 vstr d0, [r7, #8]
a: ed87 1b00 vstr d1, [r7]
e: ed97 6b02 vldr d6, [r7, #8]
12: ed97 7b00 vldr d7, [r7]
16: eeb4 6b47 vcmp.f64 d6, d7
1a: eef1 fa10 vmrs APSR_nzcv, fpscr
1e: bfac ite ge
20: 2300 movge r3, #0
22: 2301 movlt r3, #1
24: b2db uxtb r3, r3
26: f083 0301 eor.w r3, r3, #1
2a: b2db uxtb r3, r3
2c: 2b00 cmp r3, #0
2e: d106 bne.n 3e <sel_fmax+0x3e>
30: ed97 0b00 vldr d0, [r7]
34: f7ff fffe bl 0 <__isnan>
34: R_ARM_THM_CALL __isnan
38: 4603 mov r3, r0
3a: 2b00 cmp r3, #0
3c: d002 beq.n 44 <sel_fmax+0x44>
3e: f04f 0301 mov.w r3, #1
42: e001 b.n 48 <sel_fmax+0x48>
44: f04f 0300 mov.w r3, #0
48: 4618 mov r0, r3
4a: f107 0710 add.w r7, r7, #16
4e: 46bd mov sp, r7
50: bd80 pop {r7, pc}
52: bf00 nop
At -O1, the corresponding assembly code is:
00000000 <sel_fmax>:
0: b508 push {r3, lr}
2: eeb4 0bc1 vcmpe.f64 d0, d1
6: eef1 fa10 vmrs APSR_nzcv, fpscr
a: da07 bge.n 1c <sel_fmax+0x1c>
c: eeb0 0b41 vmov.f64 d0, d1
10: f7ff fffe bl 0 <__isnan>
10: R_ARM_THM_CALL __isnan
14: 3000 adds r0, #0
16: bf18 it ne
18: 2001 movne r0, #1
1a: bd08 pop {r3, pc}
1c: f04f 0001 mov.w r0, #1
20: bd08 pop {r3, pc}
22: bf00 nop
Note how the vcmp.f64 is changed into a vcmpe.f64. vcmpe raises the invalid
operation exception for any NaN operand, including a quiet NaN, whereas vcmp
raises it only for a signaling NaN. As a result, many of the FP functions in
the GNU libc raise an invalid exception where they should not, which makes FP
exceptions unusable on ARM.