Hi! The following testcase ICEs on ARM, because the backend creates CONST_INTs that aren't valid for SImode, in which they are used (0x80000000 rather than the canonical -0x80000000). This is fixed by the 3 gen_int_mode calls instead of just GEN_INT. Once that is fixed, we ICE because if both the CONST_INTs are -0x80000000, then INTVAL (operands[2]) == -INTVAL (operands[3]) is no longer true and thus cmpsi2_addneg doesn't match and no other pattern does. Fixing that is pretty obvious thing to do as well, similarly to the recent *subsi3_carryin_compare_const fix.
The next problem is that the result doesn't bootstrap. The problem is that the instruction emitted for that -0x80000000 immediate - adds reg, reg, #-0x80000000 actually doesn't do what the RTL pattern for it says. The pattern is like subsi3_compare, except that the MINUS with constant second argument is canonicalized as RTL normally does into PLUS of the negated constant. The mode on the condition code register is CCmode, so all Z, N, C and V bits should be valid, and the pattern is like that of a cmp instruction, so the behavior vs. condition codes should be like cmp instruction, which is what the subs instruction does. The adds r1, r2, #N and subs r1, r2, #-N instructions actually behave the same most of the time. The result is the same, Z and N flags are always the same as well. And, it seems that for all N except for 0 and 0x80000000 also the C and V bits are the same (I've proved that by using the valgrind subs condition code algorithm (which is the same as cmp) vs. valgrind adds condition code algorithm (which is the same as cmn) and even emulated it using 8-bit x 8-bit exhaustive check). For 0 and 0x80000000 it is different and as can be seen by the bootstrap failure, it is significant. Previously before the above change we got by by the pattern never triggering by the combiner for 0x80000000 (because the combiner would always try canonical CONST_INTs for both and those don't have INTVAL == -INTVAL), but it worked for the *compare_scc splitter/*negscc/*thumb2_negscc patterns (which created invalid RTL and used adds rather than subs, but didn't care, as it only tested the Z bit using ne condition). My next attempt was just to switch the two alternatives, so that if operands[2] matches both "I" and "L" constraints and we can choose, we'd use subs. That cured the bootstrap failure, but introduced +FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54 +FAIL: gcc.target/arm/thumb2-cmpneg2add-2.c scan-assembler adds regressions, in those 2 testcases neither the problematic 0 nor INT_MIN is used, so both adds and subs are equivalent, but - adds r2, r4, #1 + subs r2, r4, #-1 results in larger code. Note, I've seen the patch creating smaller code in some cases too, when the combiner matched the cmpsi2_addneg pattern with INT_MIN and previously emitted loading the constant into some register and performing subs with 3 registers instead of 2 and immediate. So, this is another alternative I'm proposing, just make sure we use subs for #0 and #-2147483648 (where it is essential for correctness) and for others keep previous behavior. Ok for trunk? Or is there an easy way to estimate if a constant satisfies both "I" and "L" constraints at the same time and is not one of 0 and INT_MIN, which of the two (positive or negated) would have smaller encoding and decide based on that? 2019-02-28 Jakub Jelinek <ja...@redhat.com> PR target/89506 * config/arm/arm.md (cmpsi2_addneg): Use trunc_int_for_mode (-INTVAL (...), SImode) instead of -INTVAL (...). If operands[2] is 0 or INT_MIN, force use of subs. (*compare_scc splitter): Use gen_int_mode. (*negscc): Likewise. * config/arm/thumb2.md (*thumb2_negscc): Likewise. * gcc.dg/pr89506.c: New test. --- gcc/config/arm/arm.md.jj 2019-02-28 14:13:08.368267536 +0100 +++ gcc/config/arm/arm.md 2019-02-28 22:09:03.089588596 +0100 @@ -867,10 +867,21 @@ (define_insn "cmpsi2_addneg" (set (match_operand:SI 0 "s_register_operand" "=r,r") (plus:SI (match_dup 1) (match_operand:SI 3 "arm_addimm_operand" "I,L")))] - "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])" - "@ - adds%?\\t%0, %1, %3 - subs%?\\t%0, %1, #%n3" + "TARGET_32BIT + && (INTVAL (operands[2]) + == trunc_int_for_mode (-INTVAL (operands[3]), SImode))" +{ + /* For 0 and INT_MIN it is essential that we use subs, as adds + will result in different condition codes (like cmn rather than + like cmp). For other immediates, we should choose whatever + will have smaller encoding. */ + if (operands[2] == const0_rtx + || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000) + || which_alternative == 1) + return "subs%?\\t%0, %1, #%n3"; + else + return "adds%?\\t%0, %1, %3"; +} [(set_attr "conds" "set") (set_attr "type" "alus_sreg")] ) @@ -9302,7 +9313,7 @@ (define_split (cond_exec (ne:CC (reg:CC CC_REGNUM) (const_int 0)) (set (match_dup 0) (const_int 1)))] { - operands[3] = GEN_INT (-INTVAL (operands[2])); + operands[3] = gen_int_mode (-INTVAL (operands[2]), SImode); }) (define_split @@ -10082,7 +10093,8 @@ (define_insn_and_split "*negscc" /* Emit subs\\t%0, %1, %2\;mvnne\\t%0, #0 */ if (CONST_INT_P (operands[2])) emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2], - GEN_INT (- INTVAL (operands[2])))); + gen_int_mode (-INTVAL (operands[2]), + SImode))); else emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2])); --- gcc/config/arm/thumb2.md.jj 2019-02-28 08:14:55.970600405 +0100 +++ gcc/config/arm/thumb2.md 2019-02-28 21:51:20.387902673 +0100 @@ -913,7 +913,8 @@ (define_insn_and_split "*thumb2_negscc" /* Emit subs\\t%0, %1, %2\;it\\tne\;mvnne\\t%0, #0 */ if (CONST_INT_P (operands[2])) emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2], - GEN_INT (- INTVAL (operands[2])))); + gen_int_mode (-INTVAL (operands[2]), + SImode))); else emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2])); --- gcc/testsuite/gcc.dg/pr89506.c.jj 2019-02-28 21:51:20.387902673 +0100 +++ gcc/testsuite/gcc.dg/pr89506.c 2019-02-28 21:51:20.387902673 +0100 @@ -0,0 +1,14 @@ +/* PR target/89506 */ +/* { dg-do compile } */ +/* { dg-options "-Og -g -w" } */ + +long long a; +int c; + +int +foo (long long d, short e) +{ + __builtin_sub_overflow (0xffffffff, c, &a); + e >>= ~2147483647 != (int) a; + return d + e; +} Jakub