Hello, like analyzed in the PR, combine is able to remove outer subregs that don't do anything interesting in the context they are used (simplify_comparison). But that currently happens outside of the loop that retries simplifications if changes occurred.
When we do that inside the loop as well we get secondary simplifications that currently only happen when calling the simplifiers multiple time, like when we start from three rather than from two instructions. So right now we're in the curious position that more complicated code is optimized better than simpler code and the patch fixes this. (FWIW: this replicates parts of rather than moves the responsible code, because between the loop and the original place of simplification other things happen that might itself generate subregs). Regstrapping on x86-64, all languages in process. Okay if that passes? Ciao, Michael. PR missed-optimization/77881 * combine.c (simplify_comparison): Remove useless subregs also inside the loop, not just after it. testsuite/ * gcc.target/i386/pr77881.c: New test. diff --git a/gcc/combine.c b/gcc/combine.c index 2727683..58351ff 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -11925,6 +11925,28 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1) if (subreg_lowpart_p (op0) && GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) < mode_width) ; + else if (subreg_lowpart_p (op0) + && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT + && GET_MODE_CLASS (GET_MODE (SUBREG_REG (op0))) == MODE_INT + && (code == NE || code == EQ) + && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) + <= HOST_BITS_PER_WIDE_INT) + && !paradoxical_subreg_p (op0) + && (nonzero_bits (SUBREG_REG (op0), + GET_MODE (SUBREG_REG (op0))) + & ~GET_MODE_MASK (GET_MODE (op0))) == 0) + { + /* Remove outer subregs that don't do anything. */ + tem = gen_lowpart (GET_MODE (SUBREG_REG (op0)), op1); + + if ((nonzero_bits (tem, GET_MODE (SUBREG_REG (op0))) + & ~GET_MODE_MASK (GET_MODE (op0))) == 0) + { + op0 = SUBREG_REG (op0), op1 = tem; + continue; + } + break; + } else break; diff --git a/gcc/testsuite/gcc.target/i386/pr77881.c b/gcc/testsuite/gcc.target/i386/pr77881.c new file mode 100644 index 0000000..80d143f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr77881.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target pie } */ +/* { dg-options "-O2" } */ +extern void baz(void); +int +foo (long long int a, long long int a2, int b) +{ + if (a < 0 || b) + baz (); +} +/* { dg-final { scan-assembler "js\[ \t\]\.L" } } */ +/* { dg-final { scan-assembler "jne\[ \t\]\.L" } } */