Hi,

This patch improves code generation for the builtin arithmetic overflow operations on the aarch64 backend. As an example, for a simple test case such as:
int
f (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}
current trunk at -O2 generates:

f:
        mov     w3, w0
        mov     w4, 0
        add     w0, w0, w1
        tbnz    w1, #31, .L4
        cmp     w0, w3
        blt     .L3
.L2:
        str     w4, [x2]
        ret
        .p2align 3
.L4:
        cmp     w0, w3
        ble     .L2
.L3:
        mov     w4, 1
        b       .L2
With the patch this now generates:

f:
        adds    w0, w0, w1
        cset    w1, vs
        str     w1, [x2]
        ret
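
The patch also adds uaddv/usubv expanders, so the unsigned builtins are handled the same way. As a quick illustration (this function is not from the patch or its testsuite, just a sketch of the unsigned counterpart):

int
g (unsigned x, unsigned y, unsigned *res)
{
  /* Returns nonzero on unsigned overflow; with the new patterns this
     should become an adds plus a cset testing the carry flag, rather
     than a branchy sequence.  */
  return __builtin_uadd_overflow (x, y, res);
}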
Tested on aarch64-linux-gnu with no regressions. Okay for trunk?
2016-11-30 Michael Collison <[email protected]>
Richard Henderson <[email protected]>
* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
* config/aarch64/aarch64.md (addv<GPI>4, uaddv<GPI>4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add<GPI>3_compareC_cconly_imm): New.
(*add<GPI>3_compareC_cconly): New.
(*add<GPI>3_compareC_imm): New.
(*add<GPI>3_compareC): Rename from add<GPI>3_compare1; do not
handle constants within this pattern.
(*add<GPI>3_compareV_cconly_imm): New.
(*add<GPI>3_compareV_cconly): New.
(*add<GPI>3_compareV_imm): New.
(add<GPI>3_compareV): New.
(add<GPI>3_carryinC, add<GPI>3_carryinV): New.
(*add<GPI>3_carryinC_zero, *add<GPI>3_carryinV_zero): New.
(*add<GPI>3_carryinC, *add<GPI>3_carryinV): New.
(subv<GPI>4, usubv<GPI>4): New.
(subti3): Handle op1 zero.
(subvti4, usubvti4): New.
(*sub<GPI>3_compare1_imm): New.
(sub<GPI>3_carryinCV): New.
(*sub<GPI>3_carryinCV_z1_z2, *sub<GPI>3_carryinCV_z1): New.
(*sub<GPI>3_carryinCV_z2, *sub<GPI>3_carryinCV): New.
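
For reference, a sketch of the kind of code the new subv<GPI>4 and addvti4/uaddvti4 expanders are aimed at (again not taken from the patch or its testsuite, just an illustration):

int
h (long a, long b, long *res)
{
  /* Signed subtraction overflow check; should go through the new
     subv expanders.  */
  return __builtin_ssubl_overflow (a, b, res);
}

int
k (__int128 a, __int128 b, __int128 *res)
{
  /* 128-bit signed addition overflow check; should go through the
     new addvti4 expander.  */
  return __builtin_add_overflow (a, b, res);
}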
rth_overflow_ipreview1.patch