Hi all,

__builtin_convertvector seems well-suited to implementing the vmovl and vmovn
intrinsics that widen and narrow the integer elements in a vector.
This removes some more inline assembly from the intrinsics.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.

Thanks,
Kyrill

gcc/

	* config/aarch64/arm_neon.h (vmovl_s8): Reimplement using
	__builtin_convertvector.
	(vmovl_s16): Likewise.
	(vmovl_s32): Likewise.
	(vmovl_u8): Likewise.
	(vmovl_u16): Likewise.
	(vmovl_u32): Likewise.
	(vmovn_s16): Likewise.
	(vmovn_s32): Likewise.
	(vmovn_s64): Likewise.
	(vmovn_u16): Likewise.
	(vmovn_u32): Likewise.
	(vmovn_u64): Likewise.
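
For illustration only, here is a minimal sketch of the kind of element-wise
widening and narrowing that __builtin_convertvector performs. It is not the
patch itself: it uses GCC generic vector types rather than the arm_neon.h
types so that it compiles on any target, and the names v8i8, v8i16, v8u8,
v8u16, widen_s8, widen_u8 and narrow_s16 are hypothetical stand-ins chosen for
this example.

```c
#include <stdint.h>

/* Hypothetical stand-ins for 64-bit and 128-bit NEON vector types,
   declared with the GCC generic vector extension.  */
typedef int8_t   v8i8  __attribute__ ((vector_size (8)));
typedef int16_t  v8i16 __attribute__ ((vector_size (16)));
typedef uint8_t  v8u8  __attribute__ ((vector_size (8)));
typedef uint16_t v8u16 __attribute__ ((vector_size (16)));

/* Widen each signed 8-bit element to 16 bits (sign-extending),
   analogous in spirit to vmovl_s8.  */
static inline v8i16
widen_s8 (v8i8 a)
{
  return __builtin_convertvector (a, v8i16);
}

/* Widen each unsigned 8-bit element to 16 bits (zero-extending),
   analogous in spirit to vmovl_u8.  */
static inline v8u16
widen_u8 (v8u8 a)
{
  return __builtin_convertvector (a, v8u16);
}

/* Narrow each signed 16-bit element to 8 bits, analogous in spirit to
   vmovn_s16.  Each element is converted as by a C cast; GCC reduces
   out-of-range values modulo 2^8, so only the low byte is kept.  */
static inline v8i8
narrow_s16 (v8i16 a)
{
  return __builtin_convertvector (a, v8i8);
}
```

The source and destination vector types must have the same number of
elements, which is why the 8-element cases pair a 64-bit vector with a
128-bit one.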
convertvec.patch