[Bug target/96846] [x86] Prefer xor/test/setcc over test/setcc/movzx sequence

andysem at mail dot ru Sat, 29 Aug 2020 14:35:37 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96846


--- Comment #4 from andysem at mail dot ru ---
(In reply to Jakub Jelinek from comment #3)
>         mov     edx, DWORD PTR [rdi]
>         cmp     edx, esi
>         sete    al
>         cmp     edx, r9d
>         sete    dl
>         or      eax, edx
>         movzx   eax, al
> This isn't what the peepholes are looking for, there are several other insns
> in between, and peephole2s only work on exact insn sequences, doing anything
> more complex would require doing it in some machine specific pass.

Yes, I think this optimization needs to happen at an earlier stage. Rewriting
fixed instruction sequences doesn't allow for further optimizations, such as
hoisting the xor out of the loop body.
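
To illustrate the hoisting point, here is a minimal sketch. The function
count_equal and the register choices in the comment are my own assumptions,
and the assembly is hand-written for illustration, not actual compiler output.
Since setcc writes only the low byte of its destination, a register that is
zeroed once before the loop stays zero-extended on every iteration.

```c
#include <stddef.h>

/* Counts the elements of a[0..n) equal to x. Each iteration turns a
 * comparison flag into a full-width 0/1 value, which is where setcc
 * plus a zero-extension (or a pre-zeroed register) is needed. */
size_t count_equal(const int *a, size_t n, int x)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (a[i] == x);
    return count;
}

/*
 * Hand-written illustration of two possible loop bodies
 * (a in rdi, x in edx, i in rcx, count in rax):
 *
 *   setcc/movzx, zero-extension repeated on every iteration:
 *       cmp     DWORD PTR [rdi+rcx*4], edx
 *       sete    r8b
 *       movzx   r8d, r8b
 *       add     rax, r8
 *
 *   xor hoisted, r8 zeroed once before the loop:
 *       xor     r8d, r8d            ; outside the loop
 *     loop body:
 *       cmp     DWORD PTR [rdi+rcx*4], edx
 *       sete    r8b                 ; writes only the low byte
 *       add     rax, r8
 */
```

The second form is only possible if r8 is reserved for the result across the
whole loop, which a peephole over a fixed instruction sequence cannot arrange.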

> Note, while in theory it could add xor eax, eax before the cmp edx, esi
> insn, it can't add xor edx, edx because the second comparison uses that
> register.

I don't think it should generate "xor edx, edx". I think the logic has to be
roughly something like this:

1. Check whether there is a spare register that we can use for the test
result. If there is, allocate it.
2. If we have a register, clear it with a xor before the test. Ideally, move
that xor out of the loop.
3. If there is no spare register, decide whether to reuse one of the source
registers or spill some other register.
4. In the former case, keep the test/setcc/movzx sequence. In the latter, we
can still use xor/test/setcc after spilling the victim register.

I.e., the main point is that it shouldn't be so eager to reuse the source
register; it should reuse it only when it has to. Maybe this requires some
help from the register allocator.
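
The steps above can be sketched as a small decision function. All names here
are hypothetical and none of this is GCC code. In particular, the
spill_is_cheaper input is my assumption standing in for whatever cost model
would make the choice in step 3.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes of the steps above. */
enum setcc_strategy {
    XOR_TEST_SETCC_SPARE_REG,   /* steps 1-2: spare register, zeroed first */
    TEST_SETCC_MOVZX_REUSE_SRC, /* step 4, former case: reuse a source reg */
    XOR_TEST_SETCC_AFTER_SPILL  /* step 4, latter case: spill, then zero */
};

/* have_spare_reg:   a register is free to hold the test result.
 * spill_is_cheaper: assumed cost-model verdict that spilling a victim
 *                   register beats reusing a source register. */
enum setcc_strategy choose_setcc_strategy(bool have_spare_reg,
                                          bool spill_is_cheaper)
{
    if (have_spare_reg)
        return XOR_TEST_SETCC_SPARE_REG;
    if (spill_is_cheaper)
        return XOR_TEST_SETCC_AFTER_SPILL;
    return TEST_SETCC_MOVZX_REUSE_SRC;
}
```

Only the last branch keeps the movzx, so the source register is reused only
when nothing better is available.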

I admit I have little knowledge of how GCC works internally, so I may be
talking nonsense. These are just my naive thoughts about it.

[Bug target/96846] [x86] Prefer xor/test/setcc over test/setcc/movzx sequence
