Hi all,

The MD pattern for the XAR instruction in SVE2 is currently expressed with non-canonical RTL, using a ROTATERT code with a constant rotate amount. Fix it by using the left ROTATE code. This necessitates splitting out the expander separately, so that it can translate the immediate coming from the intrinsic from a right-rotate amount to a left-rotate amount.
Additionally, as the SVE2 XAR instruction is unpredicated and can handle all element sizes from .b to .d, it is a good fit for implementing the XOR+ROTATE operation for Advanced SIMD modes where TARGET_SHA3 cannot be used (it can only handle V2DImode operands). Therefore let's extend the accepted modes of the SVE2 pattern to include the 128-bit Advanced SIMD integer modes.

This causes some tests for the svxar* intrinsics to fail, because they now simplify to a plain EOR when the rotate amount is the width of the element. This simplification is desirable (EOR instructions have equal or better throughput than XAR, and they are non-destructive of their input), so the tests are adjusted.

For V2DImode XAR operations we should prefer the Advanced SIMD version when it is available (TARGET_SHA3) because it is non-destructive, so restrict the SVE2 pattern accordingly. Tests are added to confirm this.

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for mainline?

Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

	* config/aarch64/iterators.md (SVE_ASIMD_FULL_I): New mode iterator.
	* config/aarch64/aarch64-sve2.md (@aarch64_sve2_xar<mode>): Rename to...
	(*aarch64_sve2_xar<mode>_insn): ... This.  Use SVE_ASIMD_FULL_I
	iterator and adjust output logic.
	(@aarch64_sve2_xar<mode>): New define_expand.

gcc/testsuite/

	* gcc.target/aarch64/xar_neon_modes.c: New test.
	* gcc.target/aarch64/xar_v2di_nonsve.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_s16.c: Scan for EOR rather
	than XAR.
	* gcc.target/aarch64/sve2/acle/asm/xar_s32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_s64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_s8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_u16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_u32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_u64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/xar_u8.c: Likewise.
v2-0002-aarch64-Use-canonical-RTL-representation-for-SVE2-XA.patch