Hi Jonathan,

> -----Original Message-----
> From: Jonathan Wright <jonathan.wri...@arm.com>
> Sent: 23 July 2021 09:22
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; Richard Sandiford
> <richard.sandif...@arm.com>
> Subject: [PATCH 1/8] aarch64: Use memcpy to copy vector tables in
> vqtbl[234] intrinsics
>
> Hi,
>
> This patch uses __builtin_memcpy to copy vector structures instead of
> building a new opaque structure one vector at a time in each of the
> vqtbl[234] Neon intrinsics in arm_neon.h. This simplifies the header file
> and also improves code generation - superfluous move instructions
> were emitted for every register extraction/set in this additional
> structure.
>
> Add new code generation tests to verify that superfluous move
> instructions are no longer generated for the vqtbl[234] intrinsics.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?
>

In the testcase:

diff --git a/gcc/testsuite/gcc.target/aarch64/vector_structure_intrinsics.c b/gcc/testsuite/gcc.target/aarch64/vector_structure_intrinsics.c
new file mode 100644
index 0000000000000000000000000000000000000000..2fab0f2947b7fa28e4e3a77bd365dcfdf30a9b28
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vector_structure_intrinsics.c
@@ -0,0 +1,45 @@
+/* { dg-skip-if "" { arm*-*-* } } */

Files in gcc.target/aarch64 won't be attempted on arm* targets so the skip-if
isn't needed (that's only for tests in gcc.target/aarch64/advsimd-intrinsics/).

Ok with that directive removed, thanks for doing this!
Kyrill

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-07-08 Jonathan Wright <jonathan.wri...@arm.com>
>
> * config/aarch64/arm_neon.h (vqtbl2_s8): Use __builtin_memcpy
> instead of constructing __builtin_aarch64_simd_oi one vector
> at a time.
> (vqtbl2_u8): Likewise.
> (vqtbl2_p8): Likewise.
> (vqtbl2q_s8): Likewise.
> (vqtbl2q_u8): Likewise.
> (vqtbl2q_p8): Likewise.
> (vqtbl3_s8): Use __builtin_memcpy instead of constructing
> __builtin_aarch64_simd_ci one vector at a time.
> (vqtbl3_u8): Likewise.
> (vqtbl3_p8): Likewise.
> (vqtbl3q_s8): Likewise.
> (vqtbl3q_u8): Likewise.
> (vqtbl3q_p8): Likewise.
> (vqtbl4_s8): Use __builtin_memcpy instead of constructing
> __builtin_aarch64_simd_xi one vector at a time.
> (vqtbl4_u8): Likewise.
> (vqtbl4_p8): Likewise.
> (vqtbl4q_s8): Likewise.
> (vqtbl4q_u8): Likewise.
> (vqtbl4q_p8): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/vector_structure_intrinsics.c: New test.
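
For reference, a rough portable sketch of the idea in the patch description:
copy the whole vector table into the opaque structure in one memcpy rather
than filling it one vector at a time. The types and function names here
(vec_pair_t, opaque_oi_t, pack_with_memcpy, pack_per_vector) are hypothetical
stand-ins made up for illustration; they are not the arm_neon.h definitions,
and plain byte arrays stand in for the Neon vector types and the
__builtin_aarch64_simd_oi type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a two-vector table type: two 16-byte vectors. */
typedef struct { uint8_t val[2][16]; } vec_pair_t;

/* Hypothetical stand-in for the opaque two-register structure type. */
typedef struct { uint8_t bytes[32]; } opaque_oi_t;

/* New style: one copy of the whole table into the opaque structure. */
static opaque_oi_t pack_with_memcpy(vec_pair_t tab)
{
  opaque_oi_t o;
  _Static_assert(sizeof(opaque_oi_t) == sizeof(vec_pair_t),
                 "table and opaque structure must have the same size");
  memcpy(&o, &tab, sizeof(tab));
  return o;
}

/* Old style (modelled): fill the opaque structure one vector at a time. */
static opaque_oi_t pack_per_vector(vec_pair_t tab)
{
  opaque_oi_t o;
  memcpy(o.bytes, tab.val[0], 16);       /* first vector  */
  memcpy(o.bytes + 16, tab.val[1], 16);  /* second vector */
  return o;
}
```

Both functions produce the same bytes; the difference the patch is after is
in the code the compiler generates for the real intrinsics, which this
portable model does not reproduce.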