Jonathan Wright <jonathan.wri...@arm.com> writes:
> Hi,
>
> As subject, this patch implements saturating right-shift and narrow high
> Neon intrinsic RTL patterns using a vec_concat of a register_operand
> and a VQSHRN_N unspec - instead of just a VQSHRN_N unspec. This
> more relaxed pattern allows for more aggressive combinations and
> ultimately better code generation - which will be confirmed by a new
> set of tests in gcc.target/aarch64/narrow_high_combine.c (patch 5/5 in
> this series.)
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?
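
For readers unfamiliar with the "narrow high" operation, the vec_concat form described above can be modelled in scalar C roughly as follows. This is an illustrative sketch and not part of the patch. It assumes the signed, non-rounding 16-bit to 8-bit variant (SQSHRN2) and little-endian lane numbering; the big-endian pattern in the patch swaps the two vec_concat operands. The names `sat_s8`, `asr` and `sqshrn2_model` are hypothetical and are not GCC or ACLE functions.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a wider value to the signed 8-bit range (the "saturating" part). */
static int8_t sat_s8(int32_t x)
{
    if (x > INT8_MAX) return INT8_MAX;
    if (x < INT8_MIN) return INT8_MIN;
    return (int8_t)x;
}

/* Arithmetic shift right that avoids implementation-defined behaviour
   for negative operands: ~v is non-negative whenever v is negative. */
static int32_t asr(int32_t v, int shift)
{
    return v < 0 ? ~(~v >> shift) : v >> shift;
}

/* Hypothetical scalar model of the little-endian pattern:
     out = vec_concat (low, unspec [src, shift] VQSHRN_N)
   The low half passes through untouched; the high half receives the
   shifted, saturated and narrowed elements of src. */
static void sqshrn2_model(int8_t out[16], const int8_t low[8],
                          const int16_t src[8], int shift)
{
    assert(shift >= 1 && shift <= 8);  /* immediate range for 16 -> 8 bit */
    for (int i = 0; i < 8; i++) {
        out[i] = low[i];                          /* first vec_concat operand */
        out[8 + i] = sat_s8(asr(src[i], shift));  /* narrowed high half */
    }
}
```

With a shift of 2, a source element of 1000 saturates to 127, -1000 saturates to -128, and -5 narrows to -2. Because the low half is an ordinary register operand of the vec_concat rather than an input hidden inside the unspec, the optimiser can see that it is copied through unchanged, which is presumably what allows the extra combinations mentioned above.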

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-03-04  Jonathan Wright  <jonathan.wri...@arm.com>
>
>         * config/aarch64/aarch64-simd.md (aarch64_<sur>q<r>shr<u>n2_n<mode>):
>         Implement as an expand emitting a big/little endian
>         instruction pattern.
>         (aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le): Define.
>         (aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be): Define.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 69d48fea16b732c20db0ee400782ef9b73982c47..2a836e8f9a4dfe11d645d439b19ac4487d9fb1a8 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -6031,17 +6031,54 @@
>    [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>
> -(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>"
> +(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le"
>    [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w")
> -        (unspec:<VNARROWQ2> [(match_operand:<VNARROWQ> 1 "register_operand" "0")
> -                             (match_operand:VQN 2 "register_operand" "w")
> -                             (match_operand:SI 3 "aarch64_simd_shift_imm_offset_<ve_mode>" "i")]
> -                            VQSHRN_N))]
> -  "TARGET_SIMD"
> +        (vec_concat:<VNARROWQ2>
> +          (match_operand:<VNARROWQ> 1 "register_operand" "0")
> +          (unspec:<VNARROWQ> [(match_operand:VQN 2 "register_operand" "w")
> +                              (match_operand:VQN 3
> +                                "aarch64_simd_shift_imm_vec_<vn_mode>")]
> +                             VQSHRN_N)))]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
>    "<sur>q<r>shr<u>n2\\t%<vn2>0.<V2ntype>, %<v>2.<Vtype>, %3"
>    [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>
> +(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be"
> +  [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w")
> +        (vec_concat:<VNARROWQ2>
> +          (unspec:<VNARROWQ> [(match_operand:VQN 2 "register_operand" "w")
> +                              (match_operand:VQN 3
> +                                "aarch64_simd_shift_imm_vec_<vn_mode>")]
> +                             VQSHRN_N)
> +          (match_operand:<VNARROWQ> 1 "register_operand" "0")))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +  "<sur>q<r>shr<u>n2\\t%<vn2>0.<V2ntype>, %<v>2.<Vtype>, %3"
> +  [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
> +)
> +
> +(define_expand "aarch64_<sur>q<r>shr<u>n2_n<mode>"
> +  [(match_operand:<VNARROWQ2> 0 "register_operand")
> +   (match_operand:<VNARROWQ> 1 "register_operand")
> +   (unspec:<VNARROWQ>
> +        [(match_operand:VQN 2 "register_operand")
> +         (match_operand:SI 3 "aarch64_simd_shift_imm_offset_<vn_mode>")]
> +        VQSHRN_N)]
> +  "TARGET_SIMD"
> +  {
> +    operands[3] = aarch64_simd_gen_const_vector_dup (<MODE>mode,
> +                                                     INTVAL (operands[3]));
> +
> +    if (BYTES_BIG_ENDIAN)
> +      emit_insn (gen_aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be (operands[0],
> +                                operands[1], operands[2], operands[3]));
> +    else
> +      emit_insn (gen_aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le (operands[0],
> +                                operands[1], operands[2], operands[3]));
> +    DONE;
> +  }
> +)
> +
> +
>
>  ;; cm(eq|ge|gt|lt|le)
>  ;; Note, we have constraints for Dz and Z as different expanders