https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117048

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkac...@gcc.gnu.org>:

https://gcc.gnu.org/g:de2bc6a7367aca2eecc925ebb64cfb86998d89f3

commit r15-4873-gde2bc6a7367aca2eecc925ebb64cfb86998d89f3
Author: Kyrylo Tkachov <ktkac...@nvidia.com>
Date:   Tue Oct 15 06:32:31 2024 -0700

    PR 117048: simplify-rtx: Simplify (X << C1) [+,^] (X >> C2) into ROTATE

    simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
    C1 + C2 == mode-width.  But the transformation is also valid for PLUS and
    XOR.  Indeed GIMPLE can also do the fold.  Let's teach RTL to do it too.
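
    A minimal scalar sketch of why PLUS and XOR are also valid: when
    C1 + C2 equals the mode width, the two shifted arms have no set bits in
    common, so |, + and ^ all give the same result.  The function names
    below are illustrative only and do not come from the patch; 64 stands in
    for the mode width.

```c
#include <assert.h>
#include <stdint.h>

/* Reference rotate-left; only valid for 0 < c1 < 64, which avoids an
   undefined shift by the full width.  */
uint64_t rotl64_ior(uint64_t x, unsigned c1)
{
    return (x << c1) | (x >> (64 - c1));
}

/* Same two arms combined with PLUS.  The arms occupy disjoint bit
   positions, so the addition never produces a carry.  */
uint64_t rotl64_plus(uint64_t x, unsigned c1)
{
    return (x << c1) + (x >> (64 - c1));
}

/* Same two arms combined with XOR.  */
uint64_t rotl64_xor(uint64_t x, unsigned c1)
{
    return (x << c1) ^ (x >> (64 - c1));
}

/* Returns 1 if all three forms agree for every valid rotate count.  */
int rotate_forms_agree(uint64_t x)
{
    for (unsigned c1 = 1; c1 < 64; c1++) {
        uint64_t ref = rotl64_ior(x, c1);
        if (rotl64_plus(x, c1) != ref || rotl64_xor(x, c1) != ref)
            return 0;
    }
    return 1;
}
```

    For example, rotate_forms_agree (0x0123456789abcdef) is expected to
    return 1.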

    The motivating testcase for this is in AArch64 intrinsics:

    uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
        uint64x2_t c = veorq_u64(a, b);
        return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
    }

    which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
    GCC was failing to detect the rotate operation for two reasons:
    1) The combination of the two arms of the expression is done under XOR
    rather than IOR that simplify-rtx currently supports.
    2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
    detected as the LHS of the two arms we require.
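
    Both obstacles show up in a scalar stand-in for one lane of the testcase.
    This is only a sketch: g2_lane, g2_lane_rotate and check_g2_lane are
    hypothetical names, and plain uint64_t arithmetic is assumed to mirror
    what each 64-bit lane of the intrinsics computes.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of G2 as written: the left arm is c + c rather than c << 1,
   and the arms are joined with XOR rather than IOR.  */
uint64_t g2_lane(uint64_t a, uint64_t b)
{
    uint64_t c = a ^ b;
    return (c + c) ^ (c >> 63);   /* c + c equals c << 1 modulo 2^64 */
}

/* The form the fold should arrive at: rotate (a ^ b) left by 1, which is
   the same as rotating right by 63.  */
uint64_t g2_lane_rotate(uint64_t a, uint64_t b)
{
    uint64_t c = a ^ b;
    return (c << 1) | (c >> 63);
}

/* Asserts that both forms agree for the given inputs.  */
void check_g2_lane(uint64_t a, uint64_t b)
{
    assert(g2_lane(a, b) == g2_lane_rotate(a, b));
}
```

    Since 1 + 63 == 64, the two arms are a left shift by 1 and a right shift
    by 63 of the same value, which is exactly the rotate pattern.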

    The patch fixes both issues.  The analysis of the two arms of the rotation
    expression is factored out into a common helper simplify_rotate_op which is
    then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.

    The check-assembly testcase for this is added in the following patch
    because it needs some extra AArch64 backend work, but I've added
    self-tests in this patch to validate the transformation.

    Bootstrapped and tested on aarch64-none-linux-gnu

    Signed-off-by: Kyrylo Tkachov <ktac...@nvidia.com>

            PR target/117048
            * simplify-rtx.cc (extract_ashift_operands_p): Define.
            (simplify_rotate_op): Likewise.
            (simplify_context::simplify_binary_operation_1): Use the above in
            the PLUS, IOR, XOR cases.
            (test_vector_rotate): Define.
            (test_vector_ops): Use the above.
