On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:
Hi Michael,
On 24/02/16 23:02, Michael Collison wrote:
This patch adds support for builtin overflow of add, subtract and
negate. This patch is targeted for gcc 7 stage 1. It was tested with
no regressions in arm and thumb modes on the following targets:
arm-none-linux-gnueabi
arm-none-linux-gnueabihf
armeb-none-linux-gnueabihf
arm-none-eabi
I'll have a deeper look once we're closer to GCC 7 development.
I've got a few comments in the meantime.
2016-02-24 Michael Collison <michael.colli...@arm.com>
* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv<mode>4, add<mode>3_compareC, addsi3_compareC_upper):
New patterns to support unsigned builtin add overflow operations.
(subv<mode>4, sub<mode>3_compare1): New patterns to support signed
builtin overflow subtract operations.
(usubv<mode>4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New
patterns to support builtin overflow negate operations.
Can you please summarise what sequences are generated for these
operations, and how they are better than the default fallback
sequences?
Sure. For a simple test case such as:

int
fn3 (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}
Current trunk at -O2 generates
fn3:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	cmp	r1, #0
	mov	r3, #0
	add	r1, r0, r1
	blt	.L4
	cmp	r1, r0
	blt	.L3
.L2:
	str	r3, [r2]
	mov	r0, r1
	bx	lr
.L4:
	cmp	r1, r0
	ble	.L2
.L3:
	mov	r3, #1
	b	.L2
With the overflow patch this now generates:
	adds	r0, r0, r1
	movvs	r3, #1
	movvc	r3, #0
	str	r3, [r2]
	bx	lr
Also, we'd need tests for each of these overflow operations, since
these are pretty complex
patterns that are being added.
The patterns are tested now most notably by tests in:
c-c++-common/torture/builtin-arith-overflow*.c
I had a few failures that I resolved, so the builtin overflow
arithmetic functions are definitely being exercised.
Also, you may want to consider splitting this into a patch series,
each adding a single
overflow operation, together with its tests. That way it will be
easier to keep track of
which pattern applies to which use case and they can go in
independently of each other.
Let me know if you still feel the same way given the existing test cases.
+(define_expand "uaddv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_ARM"
+{
+  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1],
+                                      operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+                            gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+                            pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
I notice this and many other patterns in this patch are guarded on
TARGET_ARM. Is there any reason why they should be restricted to ARM
state and not be TARGET_32BIT?
I thought about this as well. I will test with TARGET_32BIT and get back
to you.
Thanks,
Kyrill
--
Michael Collison
Linaro Toolchain Working Group
michael.colli...@linaro.org