While looking into PR 112454, I found the cost for `(if_then_else (cmp) (const_int 1) (reg))` was being recorded as 8 (or `COSTS_N_INSNS (2)`) but it should have been 4 (or `COSTS_N_INSNS (1)`), since a single CSINC can produce the 1 from the zero register. This patch fixes the cost by not adding the cost of `(const_int 1)` (or `(const_int -1)`, for CSINV) to the total cost.
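
The arithmetic behind the 8 versus 4 can be sketched with a toy model. This is only an illustration, not the aarch64 backend code: `struct operand`, `toy_operand_cost`, `toy_select_cost_old` and `toy_select_cost_new` are hypothetical names, and the operand costs (a constant needs one extra instruction, a register is free) are assumptions chosen to reproduce the numbers quoted above. Only the `COSTS_N_INSNS` scaling is taken from GCC.

```c
/* Same scaling as GCC's COSTS_N_INSNS: one instruction is 4 cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for an RTL operand: a register or a constant.  */
enum operand_kind { OPERAND_REG, OPERAND_CONST };

struct operand
{
  enum operand_kind kind;
  long value; /* Only meaningful for OPERAND_CONST.  */
};

/* Assumption for this sketch: materializing a constant takes one
   instruction, while a value already in a register costs nothing.  */
static int
toy_operand_cost (struct operand op)
{
  return op.kind == OPERAND_CONST ? COSTS_N_INSNS (1) : 0;
}

static int
is_one_or_minus_one (struct operand op)
{
  return op.kind == OPERAND_CONST && (op.value == 1 || op.value == -1);
}

/* Old behaviour: always add the cost of both arms to the select itself.  */
static int
toy_select_cost_old (struct operand op1, struct operand op2)
{
  return COSTS_N_INSNS (1) + toy_operand_cost (op1) + toy_operand_cost (op2);
}

/* New behaviour: a 1 or -1 arm is folded into CSINC/CSINV, so only the
   other arm contributes to the cost.  */
static int
toy_select_cost_new (struct operand op1, struct operand op2)
{
  int cost = COSTS_N_INSNS (1);

  if (is_one_or_minus_one (op1))
    return cost + toy_operand_cost (op2);
  if (is_one_or_minus_one (op2))
    return cost + toy_operand_cost (op1);
  return cost + toy_operand_cost (op1) + toy_operand_cost (op2);
}
```

Under these assumptions, calling the two functions with a constant 1 as `op1` and a register as `op2` gives 8 for the old scheme and 4 for the new one, matching the figures above. Any other constant arm still pays for its own materialization.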
This patch does not fully fix PR 112454, as that requires other changes to forwprop the `(const_int 1)` earlier than combine. It does, however, fix the loop case where the constant was only used once.

Committed as approved after bootstrapping and testing on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_if_then_else_costs): Handle
	csinv/csinc case of 1/-1.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/csinc-3.c: New test.

Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
---
 gcc/config/aarch64/aarch64.cc              | 12 ++++++++++++
 gcc/testsuite/gcc.target/aarch64/csinc-3.c | 10 ++++++++++
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/csinc-3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index b2093430937..4fd8c2de43a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -11607,6 +11607,18 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, int *cost, bool speed)
           /* CSINV/NEG with zero extend + const 0 (*csinv3_uxtw_insn3). */
           op1 = XEXP (inner, 0);
         }
+      else if (op1 == constm1_rtx || op1 == const1_rtx)
+        {
+          /* Use CSINV or CSINC. */
+          *cost += rtx_cost (op2, VOIDmode, IF_THEN_ELSE, 2, speed);
+          return true;
+        }
+      else if (op2 == constm1_rtx || op2 == const1_rtx)
+        {
+          /* Use CSINV or CSINC. */
+          *cost += rtx_cost (op1, VOIDmode, IF_THEN_ELSE, 1, speed);
+          return true;
+        }
 
       *cost += rtx_cost (op1, VOIDmode, IF_THEN_ELSE, 1, speed);
       *cost += rtx_cost (op2, VOIDmode, IF_THEN_ELSE, 2, speed);
diff --git a/gcc/testsuite/gcc.target/aarch64/csinc-3.c b/gcc/testsuite/gcc.target/aarch64/csinc-3.c
new file mode 100644
index 00000000000..bde131a584e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csinc-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vectorize" } */
+
+int f(int *a, int n, int *b, int d)
+{
+  for(int i = 0; i < n; i++)
+    b[i] = a[i] == 100 ? 1 : d;
+  /* { dg-final { scan-assembler "csinc\tw\[0-9\].*wzr" } } */
+  return 0;
+}
-- 
2.34.1