Hi Jeff,
Many thanks for the review/approval of my fix for PR rtl-optimization/91865.
Based on your and Richard Biener's feedback, I'd like to propose a revision
that calls simplify_unary_operation instead of simplify_const_unary_operation
(i.e. Richi's recommendation).  I was originally concerned that this might
result in unbounded recursion, and that explicitly testing for ZERO_EXTEND
was safer, if "uglier", but testing hasn't shown any issues.  If we do see
issues in the future, it's easy to fall back to the previous version of
this patch.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-10-25  Roger Sayle  <ro...@nextmovesoftware.com>
            Richard Biener  <rguent...@suse.de>

gcc/ChangeLog
        PR rtl-optimization/91865
        * combine.cc (make_compound_operation): Avoid creating a
        ZERO_EXTEND of a ZERO_EXTEND.

gcc/testsuite/ChangeLog
        PR rtl-optimization/91865
        * gcc.target/msp430/pr91865.c: New test case.


Thanks again,
Roger
--

> -----Original Message-----
> From: Jeff Law <jeffreya...@gmail.com>
> Sent: 19 October 2023 16:20
>
> On 10/14/23 16:14, Roger Sayle wrote:
> >
> > This patch is my proposed solution to PR rtl-optimization/91865.
> > Normally RTX simplification canonicalizes a ZERO_EXTEND of a
> > ZERO_EXTEND to a single ZERO_EXTEND, but as shown in this PR it is
> > possible for combine's make_compound_operation to unintentionally
> > generate a non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is
> > unlikely to be matched by the backend.
> >
> > For the new test case:
> >
> > const int table[2] = {1, 2};
> > int foo (char i) { return table[i]; }
> >
> > compiling with -O2 -mlarge on msp430 we currently see:
> >
> > Trying 2 -> 7:
> >     2: r25:HI=zero_extend(R12:QI)
> >       REG_DEAD R12:QI
> >     7: r28:PSI=sign_extend(r25:HI)#0
> >       REG_DEAD r25:HI
> > Failed to match this instruction:
> > (set (reg:PSI 28 [ iD.1772 ])
> >     (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
> >
> > which results in the following code:
> >
> > foo:    AND     #0xff, R12
> >         RLAM.A #4, R12 { RRAM.A #4, R12
> >         RLAM.A  #1, R12
> >         MOVX.W  table(R12), R12
> >         RETA
> >
> > With this patch, we now see:
> >
> > Trying 2 -> 7:
> >     2: r25:HI=zero_extend(R12:QI)
> >       REG_DEAD R12:QI
> >     7: r28:PSI=sign_extend(r25:HI)#0
> >       REG_DEAD r25:HI
> > Successfully matched this instruction:
> > (set (reg:PSI 28 [ iD.1772 ])
> >     (zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
> > allowing combination of insns 2 and 7
> > original costs 4 + 8 = 12
> > replacement cost 8
> >
> > foo:    MOV.B   R12, R12
> >         RLAM.A  #1, R12
> >         MOVX.W  table(R12), R12
> >         RETA
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> > 2023-10-14  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR rtl-optimization/91865
> >         * combine.cc (make_compound_operation): Avoid creating a
> >         ZERO_EXTEND of a ZERO_EXTEND.
>
> Final question.  Is there a reasonable expectation that we could get a
> similar situation with sign extensions?  If so we probably ought to try
> and handle both.
>
> OK with the obvious change to handle nested sign extensions if you think
> it's useful to do so.  And OK as-is if you don't think handling nested
> sign extensions is useful.
>
> jeff

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 360aa2f25e6..b1b16ac7bb2 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -8449,8 +8449,8 @@ make_compound_operation (rtx x, enum rtx_code in_code)
   if (code == ZERO_EXTEND)
     {
       new_rtx = make_compound_operation (XEXP (x, 0), next_code);
-      tem = simplify_const_unary_operation (ZERO_EXTEND, GET_MODE (x),
-                                            new_rtx, GET_MODE (XEXP (x, 0)));
+      tem = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x),
+                                      new_rtx, GET_MODE (XEXP (x, 0)));
       if (tem)
         return tem;
       SUBST (XEXP (x, 0), new_rtx);
diff --git a/gcc/testsuite/gcc.target/msp430/pr91865.c b/gcc/testsuite/gcc.target/msp430/pr91865.c
new file mode 100644
index 00000000000..8cc21c8b9e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/pr91865.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlarge" } */
+
+const int table[2] = {1, 2};
+int foo (char i) { return table[i]; }
+
+/* { dg-final { scan-assembler-not "AND" } } */
+/* { dg-final { scan-assembler-not "RRAM" } } */
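
For reference, the value-level identity behind the canonicalization discussed above (a ZERO_EXTEND of a ZERO_EXTEND equals a single ZERO_EXTEND), and its sign-extension counterpart from Jeff's question, can be checked exhaustively for the QI -> HI -> PSI case.  What follows is only a toy sketch and not GCC code: it assumes QImode, HImode and PSImode can be modelled as the low 8, 16 and 20 bits of a uint32_t, and every function name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not GCC code: QImode, HImode and PSImode values are held in
   the low 8, 16 and 20 bits of a uint32_t.  All names are hypothetical.  */

/* Zero extension keeps the source bits and clears everything above.  */
uint32_t zext_qi_hi (uint32_t x) { return x & 0xFFu; }
uint32_t zext_hi_psi (uint32_t x) { return x & 0xFFFFu; }
uint32_t zext_qi_psi (uint32_t x) { return x & 0xFFu; }

/* Sign extension copies the source sign bit into the new upper bits.  */
uint32_t sext_qi_hi (uint32_t x)
{
  x &= 0xFFu;
  return (x & 0x80u) ? (x | 0xFF00u) : x;
}

uint32_t sext_hi_psi (uint32_t x)
{
  x &= 0xFFFFu;
  return (x & 0x8000u) ? (x | 0xF0000u) : x;
}

uint32_t sext_qi_psi (uint32_t x)
{
  x &= 0xFFu;
  return (x & 0x80u) ? (x | 0xFFF00u) : x;
}

/* Check every 8-bit input: the nested extension must equal the direct
   one, both for zero extension and for sign extension.  */
void check_nested_extensions (void)
{
  for (uint32_t x = 0; x < 256; x++)
    {
      assert (zext_hi_psi (zext_qi_hi (x)) == zext_qi_psi (x));
      assert (sext_hi_psi (sext_qi_hi (x)) == sext_qi_psi (x));
    }
}
```

Calling check_nested_extensions () from a small main is enough to exercise all 256 inputs.  This says nothing about whether combine actually produces nested sign extensions in practice; it only illustrates that collapsing them would preserve the value.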