Ping.
Richard, Marcus, do you have any feedback on this?
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00503.html

Thanks,
Kyrill

On 14/06/16 10:38, James Greenhalgh wrote:
> On Tue, Jun 07, 2016 at 05:56:51PM +0100, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This is the second part of James's patch from:
>> https://gcc.gnu.org/ml/gcc-patches/2013-09/msg01068.html
>> separated out. It reimplements the vcopyq_lane* intrinsics in C and
>> adds implementations of the other missing vcopy<q>_lane_<q>
>> intrinsics.
>>
>> The differences from that patch are in the use of
>> __aarch64_vset_lane_any and __aarch64_vget_lane_any rather than the
>> typed variants of these that were used back in 2013 (and don't exist
>> anymore). The testcase is also adjusted for the ABI change in GCC 5
>> where integer x1 vectors are now passed and returned in SIMD
>> registers.
>>
>> The vcopy_laneq_f64 test in the testcase is currently XFAILed because
>> it currently doesn't generate the optimal DUP instruction but instead
>> emits a UMOV to a scalar register and then an FMOV. This is a GCC 7
>> regression tracked by PR 71307 and I think unrelated to this patch.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>> Also tested on aarch64_be-none-elf.
>>
>> Ok for trunk?
>
> Again, this looks OK to me, but as it is based on my code I can't
> approve it within the spirit of the write access policies. Please wait
> for Marcus or Richard to take a look.
>
> Thanks,
> James
>
>> Thanks,
>> Kyrill
>>
>> 2016-06-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>             James Greenhalgh  <james.greenha...@arm.com>
>>
>>     * config/aarch64/arm_neon.h (vcopyq_lane_f32, vcopyq_lane_f64,
>>     vcopyq_lane_p8, vcopyq_lane_p16, vcopyq_lane_s8, vcopyq_lane_s16,
>>     vcopyq_lane_s32, vcopyq_lane_s64, vcopyq_lane_u8, vcopyq_lane_u16,
>>     vcopyq_lane_u32, vcopyq_lane_u64): Reimplement in C.
>>     (vcopy_lane_f32, vcopy_lane_f64, vcopy_lane_p8, vcopy_lane_p16,
>>     vcopy_lane_s8, vcopy_lane_s16, vcopy_lane_s32, vcopy_lane_s64,
>>     vcopy_lane_u8, vcopy_lane_u16, vcopy_lane_u32, vcopy_lane_u64,
>>     vcopy_laneq_f32, vcopy_laneq_f64, vcopy_laneq_p8, vcopy_laneq_p16,
>>     vcopy_laneq_s8, vcopy_laneq_s16, vcopy_laneq_s32, vcopy_laneq_s64,
>>     vcopy_laneq_u8, vcopy_laneq_u16, vcopy_laneq_u32, vcopy_laneq_u64,
>>     vcopyq_laneq_f32, vcopyq_laneq_f64, vcopyq_laneq_p8,
>>     vcopyq_laneq_p16, vcopyq_laneq_s8, vcopyq_laneq_s16,
>>     vcopyq_laneq_s32, vcopyq_laneq_s64, vcopyq_laneq_u8,
>>     vcopyq_laneq_u16, vcopyq_laneq_u32, vcopyq_laneq_u64): New
>>     intrinsics.
>>
>> 2016-06-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>             James Greenhalgh  <james.greenha...@arm.com>
>>
>>     * gcc.target/aarch64/vect_copy_lane_1.c: New test.
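
For readers without the patch to hand, here is a rough, portable model of what "reimplement in C" in terms of a generic get-lane and set-lane amounts to. It is only a sketch and not the code from the patch: the `model_*` types and functions are hypothetical stand-ins for the NEON vector types and for __aarch64_vget_lane_any / __aarch64_vset_lane_any, whose exact definitions and argument order are assumed here. The real intrinsics also require constant lane numbers that are checked at compile time, whereas this model checks them at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for int32x4_t (128-bit) and int32x2_t (64-bit). */
typedef struct { int32_t lane[4]; } model_int32x4;
typedef struct { int32_t lane[2]; } model_int32x2;

/* Models a generic "get lane": read one element of a 64-bit vector.  */
static int32_t
model_get_lane_x2 (model_int32x2 v, int lane)
{
  assert (lane >= 0 && lane < 2);
  return v.lane[lane];
}

/* Models a generic "set lane": return a copy of V with one element
   replaced by ELEM.  V is passed by value, so the caller's vector is
   left untouched.  */
static model_int32x4
model_set_lane_x4 (int32_t elem, model_int32x4 v, int lane)
{
  assert (lane >= 0 && lane < 4);
  v.lane[lane] = elem;
  return v;
}

/* Models the behaviour of vcopyq_lane_s32 (a, lane1, b, lane2):
   the result is A with lane LANE1 replaced by lane LANE2 of B.  */
static model_int32x4
model_vcopyq_lane_s32 (model_int32x4 a, int lane1,
                       model_int32x2 b, int lane2)
{
  return model_set_lane_x4 (model_get_lane_x2 (b, lane2), a, lane1);
}
```

The other variants differ only in element type and in whether each operand is a 64-bit or 128-bit vector, which is why a single get/set pair can cover all of them.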