https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120233

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <[email protected]>:

https://gcc.gnu.org/g:4b71cafc8447e09ee41aff02acb5b26e8b112466

commit r16-7547-g4b71cafc8447e09ee41aff02acb5b26e8b112466
Author: Jakub Jelinek <[email protected]>
Date:   Tue Feb 17 11:43:43 2026 +0100

    bswap: Handle VEC_PACK_TRUNC_EXPR [PR120233]

    With r16-531 we've regressed
    FAIL: gcc.target/i386/pr108938-3.c scan-assembler-times bswap[\t ]+ 3
    on ia32 and also made 2 separate regressions in the same testcase
    on x86_64-linux (which in scan-assembler-times cancel out; previously
    we were generating one 64-bit bswap + rotate in one function and
    one 32-bit bswap + rotate in another function, now we emit 2 32-bit bswaps
    in the first one and really horrible code in the second one).

    The following patch fixes the latter function by emitting 32-bit bswap
    + 32-bit rotate on both ia32 and x86_64.  This fixes the
    above FAIL (and introduces
    FAIL: gcc.target/i386/pr108938-3.c scan-assembler-times bswap[\t ]+ 2
    on x86_64).

    The problem is that the vectorizer now uses VEC_PACK_TRUNC_EXPR and
    bswap/store_merging was only able to handle vectors in a CONSTRUCTOR.

    The patch adds handling of VEC_PACK_TRUNC_EXPR if its operands are
    CONSTRUCTORs.  Without a testcase, I wasn't confident enough to write
    BYTES_BIG_ENDIAN support: for CONSTRUCTOR { A, B, C, D } we make
    DCBA out of it on little endian but ABCD on big endian, and that would
    need to be combined with picking up the most significant halves of each
    element.
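
    As an illustration only (not code from the patch), the little-endian
    case described above can be modelled in plain C.  The helper names
    ctor_u8x4_le, vec_pack_trunc_u16x2 and bswap32_via_pack are
    hypothetical, and the sketch assumes VEC_PACK_TRUNC_EXPR truncates
    each element of its two operands to half width and concatenates them,
    first operand first.  It shows why a pack of two CONSTRUCTORs built
    from shifted copies of a value can amount to a 32-bit byte swap:

```c
#include <assert.h>
#include <stdint.h>

/* Model: little-endian view of CONSTRUCTOR { e[0], e[1], e[2], e[3] }
   with 8-bit elements as one 32-bit integer.  Element 0 lands in the
   least significant byte, so { A, B, C, D } reads as DCBA.  */
static uint32_t
ctor_u8x4_le (const uint8_t e[4])
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    r |= (uint32_t) e[i] << (8 * i);
  return r;
}

/* Model (assumed semantics): VEC_PACK_TRUNC_EXPR <lo, hi> on two
   vectors of two 16-bit elements.  Each element is truncated to 8 bits
   and the results are concatenated, lo's elements first.  */
static void
vec_pack_trunc_u16x2 (const uint16_t lo[2], const uint16_t hi[2],
                      uint8_t out[4])
{
  out[0] = (uint8_t) lo[0];
  out[1] = (uint8_t) lo[1];
  out[2] = (uint8_t) hi[0];
  out[3] = (uint8_t) hi[1];
}

/* Two CONSTRUCTORs of shifted copies of X, packed and viewed as a
   32-bit integer on little endian, yield the bytes of X reversed.  */
static uint32_t
bswap32_via_pack (uint32_t x)
{
  const uint16_t lo[2] = { (uint16_t) (x >> 24), (uint16_t) (x >> 16) };
  const uint16_t hi[2] = { (uint16_t) (x >> 8), (uint16_t) x };
  uint8_t packed[4];
  vec_pack_trunc_u16x2 (lo, hi, packed);
  return ctor_u8x4_le (packed);
}
```

    In this model bswap32_via_pack (0x11223344) evaluates to 0x44332211.
    The big-endian case is deliberately left out, matching the patch.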

    I'll look incrementally at the other function.

    2026-02-17  Jakub Jelinek  <[email protected]>

            PR target/120233
            * gimple-ssa-store-merging.cc (find_bswap_or_nop_2): New function.
            (find_bswap_or_nop): Move CONSTRUCTOR handling to above function,
            call it instead of find_bswap_or_nop_1.
            (bswap_replace): Handle VEC_PACK_TRUNC_EXPR like CONSTRUCTOR.
            (maybe_optimize_vector_constructor): Likewise.
            (pass_optimize_bswap::execute): Likewise.
            (get_status_for_store_merging): Likewise.
            (pass_store_merging::execute): Likewise.
