Hi,

As subject, this patch rewrites the floating-point vml[as][q]_n Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization.
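For readers unfamiliar with these intrinsics, here is a rough scalar sketch of what the four functions are expected to compute per lane: a multiply by the scalar followed by a separate add or subtract. The helper names ref_vmla_n_f32 and ref_vmls_n_f32 are made up for illustration and are not part of the patch or of arm_neon.h; the lanes argument would be 2 for the 64-bit forms and 4 for the q forms.

```c
#include <stddef.h>

/* Hypothetical scalar model of vmla_n_f32 / vmlaq_n_f32 (illustration
   only, not the code in the patch): r[i] = a[i] + b[i] * c.
   The intent is a multiply followed by a separate add; note that a
   compiler may still contract the two into a fused operation unless
   told otherwise (e.g. -ffp-contract=off).  */
void
ref_vmla_n_f32 (float *r, const float *a, const float *b, float c,
                size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    {
      float prod = b[i] * c;  /* multiply each lane by the scalar */
      r[i] = a[i] + prod;     /* accumulate into a */
    }
}

/* Hypothetical scalar model of vmls_n_f32 / vmlsq_n_f32:
   r[i] = a[i] - b[i] * c.  */
void
ref_vmls_n_f32 (float *r, const float *a, const float *b, float c,
                size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    {
      float prod = b[i] * c;  /* multiply each lane by the scalar */
      r[i] = a[i] - prod;     /* subtract the product from a */
    }
}
```

A model like this can serve as a reference when comparing intrinsic results lane by lane; it says nothing about the instructions the compiler emits for the real intrinsics.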
Regression tested and bootstrapped on aarch64-none-linux-gnu - no issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-01-18  Jonathan Wright  <jonathan.wri...@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Add float_ml[as]_n
	builtin generator macros.
	* config/aarch64/aarch64-simd.md (mul_n<mode>3): Define.
	(aarch64_float_mla_n<mode>): Define.
	(aarch64_float_mls_n<mode>): Define.
	* config/aarch64/arm_neon.h (vmla_n_f32): Use RTL builtin
	instead of inline asm.
	(vmlaq_n_f32): Likewise.
	(vmls_n_f32): Likewise.
	(vmlsq_n_f32): Likewise.
rb14042.patch