[Bug tree-optimization/96703] Failure to optimize combined comparison of variables and of variable with 0 to two comparisons with 0

pinskia at gcc dot gnu.org via Gcc-bugs Sun, 03 Sep 2023 21:31:40 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96703


--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm for
```
#define cst 0x1234

bool f(int x, int y)
{
    return x > y && y == cst;
}

bool f0(int x, int y)
{
    return x > cst && y == cst;
}
```

currently for GCC on aarch64:
```
f:
        cmp     w0, w1
        mov     w2, 4660
        ccmp    w1, w2, 0, gt
        cset    w0, eq
        ret
f0:
        mov     w2, 4660
        cmp     w0, w2
        ccmp    w1, w2, 0, gt
        cset    w0, eq
        ret
```
f is actually better here because the first cmp is independent of the mov.
So on a dual-issue CPU, f would almost always be better, even if the mov does
not occupy an issue slot.
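
As a sanity check that the two forms compute the same result, so that the choice between them is purely a code generation question, here is a minimal C sketch. The names `cmp_var_first`, `cmp_cst_only` and `count_mismatches` are made up for illustration and are not part of GCC or the bug report; the grid only samples values near the constant, it is not a proof.

```c
#include <stdbool.h>

#define CST 0x1234

/* Same shape as f above: compare the two variables, then y against the constant. */
bool cmp_var_first(int x, int y)
{
    return x > y && y == CST;
}

/* Same shape as f0 above: both comparisons are against the constant. */
bool cmp_cst_only(int x, int y)
{
    return x > CST && y == CST;
}

/* Hypothetical helper: count disagreements on a small grid around CST. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int x = CST - 3; x <= CST + 3; x++)
        for (int y = CST - 3; y <= CST + 3; y++)
            if (cmp_var_first(x, y) != cmp_cst_only(x, y))
                mismatches++;
    return mismatches;
}
```

Whenever `y == CST` holds, `x > y` and `x > CST` are the same test, and when it does not hold both functions return false, which is why the forms agree.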

For RISC-V, not doing the transformation is actually better:
```
        li      a5,4096
        addi    a5,a5,564
        sub     a5,a1,a5
        sgt     a0,a0,a1
        seqz    a5,a5
        and     a0,a5,a0
        ret
```

vs
```
        li      a5,4096
        addi    a5,a5,564
        sub     a1,a1,a5
        seqz    a1,a1
        sgt     a0,a0,a5
        and     a0,a1,a0
        ret
```

Without the transformation, the sgt is independent of forming the constant.

Now 0 could be handled as a special case because most targets handle 0 nicely.
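
For reference, the zero special case from the bug title can be sketched in C as below. The function names are hypothetical and chosen only for this illustration. The sketch shows only that the two source forms are equivalent; it makes no claim about what any particular target emits for them.

```c
#include <stdbool.h>

/* Original form: one comparison between the variables, one against 0. */
bool gt_var_and_zero(int x, int y)
{
    return x > y && y == 0;
}

/* Rewritten form: two comparisons against 0, so no constant has to be
   materialized in a register and x no longer needs y for its compare. */
bool gt_zero_and_zero(int x, int y)
{
    return x > 0 && y == 0;
}
```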

Doing it looks better for Power, but I don't know whether that holds in general
or only for the constants I tried.
