On Thu, Nov 14, 2024 at 7:50 PM Soumya AR <soum...@nvidia.com> wrote:
>
> The SVE SUBR instruction performs a reversed subtract from an immediate.
>
> This patches enables the emission of SUBR for Neon modes and avoids the need to
> materialise an explicit constant.
>
> For example, the below test case:
>
> typedef long long __attribute__ ((vector_size (16))) v2di;
>
> v2di subr_v2di (v2di x)
> {
>   return 15 - x;
> }
>
> compiles to:
>
> subr_v2di:
>         mov     z31.d, #15
>         sub     v0.2d, v31.2d, v0.2d
>         ret
>
> but can just be:
>
> subr_v2di:
>         subr    z0.d, z0.d, #15
>         ret
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Soumya AR <soum...@nvidia.com>
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-simd.md:
>         (sub<mode>3<vczle><vczbe>): Extended the pattern to emit SUBR for SVE
>         targets if operand 1 is an immediate.
>         * config/aarch64/predicates.md (aarch64_sve_arith_imm_or_reg_operand):
>         New predicate that accepts aarch64_sve_arith_immediate in operand 1 but
>         only for TARGET_SVE.

I think this might cause wrong code with:
```
#include <arm_neon.h>

uint32x4_t foo_sub_u32 (uint32x2_t a, uint32x2_t b)
{
  uint32x2_t zeros = vcreate_u32 (0);
  b = vdup_n_u32 (15);
  return vcombine_u32 (vsub_u32 (b, a), zeros);
}
```

With the patch, the elements that are supposed to be zero would be `15-x` instead.
This is due to the `<vczle><vczbe>` part of the pattern name.

Thanks,
Andrew

>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/sve/subr-sve.c: New test.
>
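
To make the concern above concrete, here is a rough, portable C model of the two code sequences. It is only a sketch and not GCC or hardware code: the `vreg128` type and the `model_*` helpers are made-up names, and it assumes a 128-bit SVE vector length. The idea it illustrates is that a 64-bit Neon `sub` writes the low half of the register and zeroes the upper 64 bits, whereas an unpredicated SVE `subr` with an immediate updates every lane of the Z register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as four 32-bit lanes.  */
typedef struct { uint32_t lane[4]; } vreg128;

/* Models a 64-bit Neon subtract (sub vd.2s, vb.2s, va.2s): only the low
   two lanes are computed, and the upper 64 bits of the result are zero.  */
vreg128 model_neon_sub_2s (vreg128 b, vreg128 a)
{
  vreg128 r = { { 0, 0, 0, 0 } };
  r.lane[0] = b.lane[0] - a.lane[0];
  r.lane[1] = b.lane[1] - a.lane[1];
  return r;
}

/* Models an unpredicated SVE SUBR with an immediate, assuming a 128-bit
   vector length: every lane becomes imm - lane.  */
vreg128 model_sve_subr_imm (vreg128 a, uint32_t imm)
{
  vreg128 r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = imm - a.lane[i];
  return r;
}

/* Returns nonzero if the upper 64 bits of the modelled register are zero.  */
int upper_half_is_zero (vreg128 r)
{
  return r.lane[2] == 0 && r.lane[3] == 0;
}

/* Compares the two models on one sample input.  Lanes 2 and 3 of 'a'
   stand for whatever happens to be in the upper half of the register.  */
void check_model (void)
{
  vreg128 a = { { 1, 2, 3, 4 } };
  vreg128 b = { { 15, 15, 15, 15 } };
  vreg128 neon = model_neon_sub_2s (b, a);
  vreg128 sve = model_sve_subr_imm (a, 15);

  /* The low 64 bits agree between the two sequences.  */
  assert (neon.lane[0] == sve.lane[0] && neon.lane[1] == sve.lane[1]);
  /* Only the Neon form leaves the upper 64 bits zero.  */
  assert (upper_half_is_zero (neon));
  assert (!upper_half_is_zero (sve));
}
```

In this model the low halves match, but the upper lanes differ, which is the property a pattern that also matches a concatenation with zeros would be relying on.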