https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119428
--- Comment #13 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:584b346a4c7a6e6e77da6dc80968401a3c08161d

commit r15-8896-g584b346a4c7a6e6e77da6dc80968401a3c08161d
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Tue Mar 25 16:55:24 2025 +0100

    i386: Fix up combination of -2 r<<= (x & 7) into btr [PR119428]

    The following patch is miscompiled from r15-8478 but latently already
    since my r11-5756 and r11-6631 changes.
    The r11-5756 change was
    https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html
    which changed the splitters to immediately throw away the masking.
    And the r11-6631 change was an optimization to recognize
    (set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1)
    as btr.

    The problem is their interaction.  x86 is not a SHIFT_COUNT_TRUNCATED
    target, so the masking needs to be explicit in the IL.
    And combine.cc (make_field_assignment) has since 1992 optimizations
    which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0.
    Now, such an optimization is fine if y has not been masked or if the
    chosen zero_extract has the same mode as the rotate (or it recognizes
    something with a left shift too).  IMHO such optimization is invalid
    for SHIFT_COUNT_TRUNCATED targets because we explicitly say that
    the masking of the shift/rotate counts are redundant there and don't
    need to be part of the IL (I have a patch for that, but because it
    is just latent, I'm not sure it needs to be posted for gcc 15 (and
    also am not sure if it should punt or add operand masking just in case)).
    x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine
    not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't
    have QImode insv, so it is optimized into HImode insertions.  Now,
    if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into
    HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just
    decided to throw away the masking, using btr changes the behavior on it
    and causes e2fsprogs and sqlite miscompilations.

    So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings
    explicit in the IL, either at least for the duration of the combine pass
    as does the following patch (where combine is the only known pass to have
    such transformation), or even keep it until final pass in case there are
    some later optimizations that would also need to know whether there was
    explicit masking or not and with what mask.  The latter change would be
    much larger.

    The following patch just reverts the r11-5756 change and adds a testcase.

    2025-03-25  Jakub Jelinek  <ja...@redhat.com>

            PR target/96226
            PR target/119428
            * config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask,
            splitter after *<rotate_insn><mode>3_mask_1): Revert 2020-12-05
            changes.

            * gcc.c-torture/execute/pr119428.c: New test.

Reply via email to