Hi,

As the subject says, this patch rewrites the vcvtx Neon intrinsics to use
RTL builtins rather than inline assembly code. Unlike opaque asm blocks,
builtins are visible to the compiler, which allows better scheduling and
optimization.
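
For readers unfamiliar with these intrinsics: they map to FCVTXN, which
narrows double to single precision using round-to-odd. The sketch below is a
portable reference model of that rounding, not code from the patch. The names
float_bits, cvt_round_to_odd, matches_intrinsic and self_check are
hypothetical, NaN handling is simplified, and the comparison against
vcvtxd_f32_f64 is only compiled on AArch64 and is an expectation rather than
a tested result.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view the bit pattern of a float. */
static uint32_t float_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

/* Hypothetical reference model of round-to-odd narrowing: truncate toward
   zero, then set the lowest mantissa bit if the conversion was inexact.
   NaNs are simply passed through the ordinary conversion. */
static float cvt_round_to_odd(double x)
{
    if (x != x)
        return (float)x;                    /* NaN */
    double ax = x < 0 ? -x : x;
    if (ax > FLT_MAX) {
        if (ax > DBL_MAX)
            return (float)x;                /* infinity stays infinity */
        return x < 0 ? -FLT_MAX : FLT_MAX;  /* overflow: largest finite */
    }
    float f = (float)x;                     /* one of the two neighbours */
    if ((double)f == x)
        return f;                           /* exact, nothing to adjust */
    uint32_t bits = float_bits(f);
    double af = f < 0 ? -(double)f : (double)f;
    if (af > ax)
        bits -= 1;                          /* rounded away: step toward zero */
    bits |= 1;                              /* inexact: force odd mantissa */
    memcpy(&f, &bits, sizeof f);
    return f;
}

#if defined(__aarch64__)
#include <arm_neon.h>
/* Assumption: for non-NaN inputs the model should agree with the intrinsic. */
static int matches_intrinsic(double x)
{
    return float_bits(vcvtxd_f32_f64(x)) == float_bits(cvt_round_to_odd(x));
}
#endif

/* Small self-check of the model on an exact and an inexact input. */
static void self_check(void)
{
    assert(float_bits(cvt_round_to_odd(1.0)) == 0x3f800000u);
    assert(float_bits(cvt_round_to_odd(1.0 + 0x1p-30)) == 0x3f800001u);
}
```

The model only inspects which side of x the ordinary conversion landed on, so
it should not depend on the current floating-point rounding mode.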

Regression tested and bootstrapped on aarch64-none-linux-gnu and
aarch64_be-none-elf - no issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-02-18  Jonathan Wright  <jonathan.wri...@arm.com>

        * config/aarch64/aarch64-simd-builtins.def: Add
        float_trunc_rodd builtin generator macros.
        * config/aarch64/aarch64-simd.md (aarch64_float_trunc_rodd_df):
        Define.
        (aarch64_float_trunc_rodd_lo_v2sf): Define.
        (aarch64_float_trunc_rodd_hi_v4sf_le): Define.
        (aarch64_float_trunc_rodd_hi_v4sf_be): Define.
        (aarch64_float_trunc_rodd_hi_v4sf): Define.
        * config/aarch64/arm_neon.h (vcvtx_f32_f64): Use RTL builtin
        instead of inline asm.
        (vcvtx_high_f32_f64): Likewise.
        (vcvtxd_f32_f64): Likewise.
        * config/aarch64/iterators.md: Add FCVTXN unspec.

Attachment: rb14222.patch