[PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

Kyrylo Tkachov Tue, 22 Oct 2024 13:26:39 -0700

Hi all,

Some vector rotate operations can be implemented in a single instruction
rather than using the fallback SHL+USRA sequence.
In particular, when the rotate amount is half the bitwidth of the element
we can use a single REV64, REV32 or REV16 instruction.
This patch adds this transformation to the recently added splitter for vector
rotates.
Bootstrapped and tested on aarch64-none-linux-gnu.
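
To illustrate why the half-width case is special, here is a minimal scalar
sketch of the per-lane equivalence. It is not code from the patch: the helper
names (rotl32, swap_halves32, rotl16_by8, swap_bytes16) are hypothetical and
model a single vector lane in plain C. Rotating a lane by half its width just
swaps its two halves, which is the per-lane effect of REV32 on 16-bit
sub-elements or REV16 on bytes.

```c
#include <stdint.h>

/* Rotate a 32-bit lane left by N bits (0 < N < 32).  This is the generic
   shift-and-combine form that a two-instruction fallback computes.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Swap the two 16-bit halves of a 32-bit lane: the per-lane effect of
   reversing 16-bit sub-elements within each 32-bit element.  */
static uint32_t
swap_halves32 (uint32_t x)
{
  return ((x & 0xffffu) << 16) | (x >> 16);
}

/* Rotate a 16-bit lane by 8 bits.  */
static uint16_t
rotl16_by8 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Swap the two bytes of a 16-bit lane: the per-lane effect of reversing
   bytes within each 16-bit element.  */
static uint16_t
swap_bytes16 (uint16_t x)
{
  return (uint16_t) (((x & 0xffu) << 8) | (x >> 8));
}
```

For any input, rotl32 (x, 16) and swap_halves32 (x) agree, as do
rotl16_by8 (x) and swap_bytes16 (x). The 64-bit case (rotate by 32, REV64 on
32-bit sub-elements) follows the same pattern.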


Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

        * config/aarch64/aarch64-protos.h (aarch64_emit_opt_vec_rotate):
        Declare prototype.
        * config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Implement.
        * config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm<mode>):
        Call the above.

gcc/testsuite/

        * gcc.target/aarch64/simd/pr117048_2.c: New test.

v2-0004-aarch64-Optimize-vector-rotates-into-REV-instruction.patch
