Jonathan Wright <jonathan.wri...@arm.com> writes:
> Hi,
>
> As subject, this patch implements the saturating right-shift and narrow
> high Neon intrinsic RTL patterns using a vec_concat of a register_operand
> and a VQSHRN_N unspec, instead of just a VQSHRN_N unspec. This more
> relaxed pattern allows for more aggressive combinations and ultimately
> better code generation, which will be confirmed by a new set of tests in
> gcc.target/aarch64/narrow_high_combine.c (patch 5/5 in this series).
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-03-04  Jonathan Wright  <jonathan.wri...@arm.com>
>
>         * config/aarch64/aarch64-simd.md (aarch64_<sur>q<r>shr<u>n2_n<mode>):
>         Implement as an expand emitting a big/little endian
>         instruction pattern.
>         (aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le): Define.
>         (aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be): Define.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 69d48fea16b732c20db0ee400782ef9b73982c47..2a836e8f9a4dfe11d645d439b19ac4487d9fb1a8 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -6031,17 +6031,54 @@
>    [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>  
> -(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>"
> +(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le"
>    [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w")
> -        (unspec:<VNARROWQ2> [(match_operand:<VNARROWQ> 1 "register_operand" "0")
> -                          (match_operand:VQN 2 "register_operand" "w")
> -                          (match_operand:SI 3 "aarch64_simd_shift_imm_offset_<ve_mode>" "i")]
> -                            VQSHRN_N))]
> -  "TARGET_SIMD"
> +     (vec_concat:<VNARROWQ2>
> +       (match_operand:<VNARROWQ> 1 "register_operand" "0")
> +       (unspec:<VNARROWQ> [(match_operand:VQN 2 "register_operand" "w")
> +                           (match_operand:VQN 3
> +                             "aarch64_simd_shift_imm_vec_<vn_mode>")]
> +                          VQSHRN_N)))]
> +  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
>    "<sur>q<r>shr<u>n2\\t%<vn2>0.<V2ntype>, %<v>2.<Vtype>, %3"
>    [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
>  )
>  
> +(define_insn "aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be"
> +  [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w")
> +     (vec_concat:<VNARROWQ2>
> +          (unspec:<VNARROWQ> [(match_operand:VQN 2 "register_operand" "w")
> +                           (match_operand:VQN 3
> +                             "aarch64_simd_shift_imm_vec_<vn_mode>")]
> +                          VQSHRN_N)
> +       (match_operand:<VNARROWQ> 1 "register_operand" "0")))]
> +  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +  "<sur>q<r>shr<u>n2\\t%<vn2>0.<V2ntype>, %<v>2.<Vtype>, %3"
> +  [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
> +)
> +
> +(define_expand "aarch64_<sur>q<r>shr<u>n2_n<mode>"
> +  [(match_operand:<VNARROWQ2> 0 "register_operand")
> +   (match_operand:<VNARROWQ> 1 "register_operand")
> +   (unspec:<VNARROWQ>
> +     [(match_operand:VQN 2 "register_operand")
> +      (match_operand:SI 3 "aarch64_simd_shift_imm_offset_<vn_mode>")]
> +        VQSHRN_N)]
> +  "TARGET_SIMD"
> +  {
> +    operands[3] = aarch64_simd_gen_const_vector_dup (<MODE>mode,
> +                                              INTVAL (operands[3]));
> +
> +    if (BYTES_BIG_ENDIAN)
> +      emit_insn (gen_aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_be (operands[0],
> +                             operands[1], operands[2], operands[3]));
> +    else
> +      emit_insn (gen_aarch64_<sur>q<r>shr<u>n2_n<mode>_insn_le (operands[0],
> +                             operands[1], operands[2], operands[3]));
> +    DONE;
> +  }
> +)
> +
>  
>  ;; cm(eq|ge|gt|lt|le)
>  ;; Note, we have constraints for Dz and Z as different expanders
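
For readers less familiar with the narrow-high forms, here is a minimal
scalar sketch in plain C of what the new little-endian pattern expresses for
one member of the family (sqshrn2, 16-bit lanes narrowed to 8-bit). It is
not part of the patch. The names sat_shr_narrow_s16 and model_sqshrn2 are
hypothetical, and the model assumes the non-rounding signed variant only.
The low half of the result is operand 1 passed through unchanged; the high
half is the saturated, shifted source, which is the shape the vec_concat
makes visible to combine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-lane model: arithmetic shift right by n, then
   saturate to the int8_t range.  Assumes the shift immediate is in
   1..8 for a 16-bit to 8-bit narrowing.  */
static int8_t sat_shr_narrow_s16(int16_t x, int n)
{
    assert(n >= 1 && n <= 8);
    int32_t v = x;
    /* Portable arithmetic shift: avoid right-shifting a negative value.  */
    v = (v < 0) ? ~((~v) >> n) : (v >> n);
    if (v > INT8_MAX)
        return INT8_MAX;
    if (v < INT8_MIN)
        return INT8_MIN;
    return (int8_t)v;
}

/* Hypothetical model of the whole operation in little-endian lane order:
   out = vec_concat (low, sat_shr_narrow (src, n)).  */
static void model_sqshrn2(int8_t out[16], const int8_t low[8],
                          const int16_t src[8], int n)
{
    for (int i = 0; i < 8; i++) {
        out[i] = low[i];                              /* operand 1 */
        out[8 + i] = sat_shr_narrow_s16(src[i], n);   /* narrowed operand 2 */
    }
}
```

The big-endian pattern in the patch swaps the two vec_concat operands in
the RTL but uses the same assembly template, so a model like the one above
describes the architectural result in both cases.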
