https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117048

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkac...@gcc.gnu.org>:

https://gcc.gnu.org/g:1c46a541c6957e8b0eee339d4cff46e951a5ad4e

commit r15-4889-g1c46a541c6957e8b0eee339d4cff46e951a5ad4e
Author: Kyrylo Tkachov <ktkac...@nvidia.com>
Date:   Mon Nov 4 07:25:16 2024 -0800

    PR 117048: simplify-rtx: Simplify (X << C1) [+,^] (X >> C2) into ROTATE

    This is, in effect, a reapplication of
    de2bc6a7367aca2eecc925ebb64cfb86998d89f3
    fixing the compile-time hog in var-tracking due to calling simplify_rtx
    on the two arms of the rotation before detecting the ROTATE.
    That is not necessary.

    simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
    C1 + C2 == mode-width.  But the transformation is also valid for PLUS and
    XOR.
    Indeed GIMPLE can also do the fold.  Let's teach RTL to do it too.
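
The equivalence described above can be checked with a small scalar sketch. It is not taken from the patch: the names rotl64, arms_agree and check_rotate_identity are hypothetical, and a 64-bit unsigned integer stands in for the RTL mode. The point is that when C1 + C2 equals the width, the two shifted values occupy disjoint bit positions, so OR, addition (no carries) and XOR all produce the rotated value.

```c
#include <assert.h>
#include <stdint.h>

/* Reference rotate-left on a 64-bit value; n is assumed to be in 1..63. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    return (x << n) | (x >> (64 - n));
}

/* Returns 1 when IOR, PLUS and XOR of the two shift arms all equal the
   rotate, for shift counts c1 and c2 = 64 - c1 (c1 assumed in 1..63). */
static int arms_agree(uint64_t x, unsigned c1)
{
    unsigned c2 = 64 - c1;
    uint64_t hi = x << c1;   /* low c1 bits are zero */
    uint64_t lo = x >> c2;   /* only the low c1 bits can be set */
    uint64_t r = rotl64(x, c1);
    return (hi | lo) == r && (hi + lo) == r && (hi ^ lo) == r;
}

/* Exercise every valid shift count on a few sample values. */
static void check_rotate_identity(void)
{
    const uint64_t samples[] = { 0, 1, UINT64_MAX, 0x0123456789abcdefULL,
                                 0x8000000000000001ULL };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned c1 = 1; c1 < 64; c1++)
            assert(arms_agree(samples[i], c1));
}
```

Calling check_rotate_identity() from a test driver runs the check; the identity does not hold in general when C1 + C2 differs from the width, because the two arms can then overlap.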

    The motivating testcase for this is in AArch64 intrinsics:

    uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
        uint64x2_t c = veorq_u64(a, b);
        return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
    }

    which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
    GCC was failing to detect the rotate operation for two reasons:
    1) The combination of the two arms of the expression is done under XOR
    rather than IOR that simplify-rtx currently supports.
    2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
    detected as the LHS of the two arms we require.
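
As an illustration of the two points above, here is a scalar model of a single 64-bit lane of the testcase. This is only a sketch under the assumption that each lane behaves like a uint64_t; the function names g2_lane and g2_lane_rotate are hypothetical and do not appear in the patch. It shows that c + c is the same value as c << 1, so the expression is an XOR of a left shift by 1 and a right shift by 63, which is a rotate-left by 1.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit lane of G2, spelled the way the intrinsics spell it. */
static uint64_t g2_lane(uint64_t a, uint64_t b)
{
    uint64_t c = a ^ b;
    /* c + c equals c << 1 modulo 2^64; the arms are combined with XOR. */
    return (c + c) ^ (c >> 63);
}

/* The same lane written as XOR followed by a rotate-left by 1. */
static uint64_t g2_lane_rotate(uint64_t a, uint64_t b)
{
    uint64_t c = a ^ b;
    return (c << 1) | (c >> 63);
}
```

Both functions return the same value for any inputs, which is the property the simplification relies on when it recognises (PLUS X X) as the left-shift arm.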

    The patch fixes both issues.  The analysis of the two arms of the rotation
    expression is factored out into a common helper simplify_rotate_op which is
    then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.

    The check-assembly testcase for this is added in the following patch
    because it needs some extra AArch64 backend work, but I've added
    self-tests in this patch to validate the transformation.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    Signed-off-by: Kyrylo Tkachov <ktac...@nvidia.com>

            PR target/117048
            * simplify-rtx.cc (extract_ashift_operands_p): Define.
            (simplify_rotate_op): Likewise.
            (simplify_context::simplify_binary_operation_1): Use the above in
            the PLUS, IOR, XOR cases.
            (test_vector_rotate): Define.
            (test_vector_ops): Use the above.
