On 14 November 2014 10:46, Alan Lawrence <alan.lawre...@arm.com> wrote:
> This patch replaces the inline asm for vld1_dup intrinsics with a vdup_n_
> and a load from the pointer. The existing *aarch64_simd_ld1r<mode> insn,
> combiner, etc., are quite capable of generating the expected single ld1r
> instruction from this. (I've verified by inspecting assembler output.)
>
> gcc/ChangeLog:
>
>         * config/aarch64/arm_neon.h (vld1_dup_f32, vld1_dup_f64, vld1_dup_p8,
>         vld1_dup_p16, vld1_dup_s8, vld1_dup_s16, vld1_dup_s32, vld1_dup_s64,
>         vld1_dup_u8, vld1_dup_u16, vld1_dup_u32, vld1_dup_u64, vld1q_dup_f32,
>         vld1q_dup_f64, vld1q_dup_p8, vld1q_dup_p16, vld1q_dup_s8, vld1q_dup_s16,
>         vld1q_dup_s32, vld1q_dup_s64, vld1q_dup_u8, vld1q_dup_u16,
>         vld1q_dup_u32, vld1q_dup_u64): Replace inline asm with vdup_n_ and
>         pointer dereference.
OK
/Marcus
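
For anyone reading this thread without the diff to hand, the shape of the change described in the quoted ChangeLog (a vdup_n_ applied to a dereferenced pointer, in place of inline asm) can be sketched roughly as below. This is a portable model, not the code from the patch: model_f32x2, model_vdup_n_f32 and model_vld1_dup_f32 are hypothetical stand-ins for float32x2_t, vdup_n_f32 and vld1_dup_f32, built on the GCC/Clang vector_size extension so the sketch compiles on hosts other than AArch64.

```c
/* Hypothetical stand-in for float32x2_t: two float lanes in 8 bytes.
   Relies on the GCC/Clang vector_size extension (an assumption of this
   sketch, not something taken from the patch).  */
typedef float model_f32x2 __attribute__ ((vector_size (8)));

/* Hypothetical stand-in for vdup_n_f32: copy one scalar into every lane.  */
static inline model_f32x2
model_vdup_n_f32 (float value)
{
  model_f32x2 result = { value, value };
  return result;
}

/* Hypothetical stand-in for vld1_dup_f32 in the style the ChangeLog
   describes: an ordinary load through the pointer followed by a
   duplicate, with no inline asm.  The compiler sees a plain load and a
   plain broadcast and is free to combine them.  */
static inline model_f32x2
model_vld1_dup_f32 (const float *ptr)
{
  return model_vdup_n_f32 (*ptr);
}
```

Written this way the load is visible to the compiler as a normal memory access rather than hidden inside an opaque asm statement, which is presumably what lets the existing ld1r pattern match, as the quoted message reports.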