On Thu, Sep 5, 2024 at 4:30 PM Victor Do Nascimento
<victor.donascime...@arm.com> wrote:
>
>
> Changes from previous revision:
>
> As was done for the equivalent aarch64 patch, we rework this patch to do away
> with mission creep, keeping changes as simple as possible.
>
> We thus remove the `gimple_fold_builtin' changes that would have replaced the
> dot-product builtin calls with DOT_PROD_EXPRs as well as the novel
> initialization mechanism for dot-product builtins, choosing instead to redirect
> the single-mode CODE_FOR_neon_(u|s|us)dot* values generated from
> `arm_neon_builtins.def' to their new 2-mode equivalents.
>
> Regression tested on arm-none-linux-gnueabihf, no new failures identified.


Presumably with a suitable v8.x-a as the default, ensuring that the
patterns are tested? I.e., vusdot-autovec.c passes in your testing?

If so, yeah, OK if no regressions.


regards
Ramana
>
> ------
>
> Given recent changes to the dot_prod standard pattern name, this patch
> fixes the arm back-end by implementing the following changes:
>
> 1. Add 2nd mode to all patterns relating to the dot-product in .md
> files.
> 2. Redirect the single-mode CODE_FOR_neon_(u|s|us)dot<mode> values
> generated from `arm_neon_builtins.def' to their new 2-mode
> equivalents by means of simple aliases, as per the following example:
>
>   constexpr insn_code CODE_FOR_neon_sdotv8qi
>     = CODE_FOR_neon_sdotv2siv8qi;
>
> gcc/ChangeLog:
>
>         * config/arm/neon.md (<sup>dot_prod<vsi2qi>): Renamed to...
>         (<sup>dot_prod<mode><vsi2qi>): ...this.
>         (neon_<sup>dot<vsi2qi>): Renamed to...
>         (neon_<sup>dot<mode><vsi2qi>): ...this.
>         (neon_usdot<vsi2qi>): Renamed to...
>         (neon_usdot<mode><vsi2qi>): ...this.
>         (usdot_prod<vsi2qi>): Renamed to...
>         (usdot_prod<mode><vsi2qi>): ...this.
>         * config/arm/arm-builtins.cc
>         (CODE_FOR_neon_sdotv8qi): Define as alias to
>         new CODE_FOR_neon_sdotv2siv8qi.
>         (CODE_FOR_neon_udotv8qi): Define as alias to
>         new CODE_FOR_neon_udotv2siv8qi.
>         (CODE_FOR_neon_usdotv8qi): Define as alias to
>         new CODE_FOR_neon_usdotv2siv8qi.
>         (CODE_FOR_neon_sdotv16qi): Define as alias to
>         new CODE_FOR_neon_sdotv4siv16qi.
>         (CODE_FOR_neon_udotv16qi): Define as alias to
>         new CODE_FOR_neon_udotv4siv16qi.
>         (CODE_FOR_neon_usdotv16qi): Define as alias to
>         new CODE_FOR_neon_usdotv4siv16qi.
> ---
>  gcc/config/arm/arm-builtins.cc | 7 +++++++
>  gcc/config/arm/neon.md         | 8 ++++----
>  2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
> index c9d50bf8fbb..74cea8900b4 100644
> --- a/gcc/config/arm/arm-builtins.cc
> +++ b/gcc/config/arm/arm-builtins.cc
> @@ -908,6 +908,13 @@ typedef struct {
>    enum arm_type_qualifiers *qualifiers;
>  } arm_builtin_datum;
>
> +constexpr insn_code CODE_FOR_neon_sdotv8qi = CODE_FOR_neon_sdotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_udotv8qi = CODE_FOR_neon_udotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_usdotv8qi = CODE_FOR_neon_usdotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_sdotv16qi = CODE_FOR_neon_sdotv4siv16qi;
> +constexpr insn_code CODE_FOR_neon_udotv16qi = CODE_FOR_neon_udotv4siv16qi;
> +constexpr insn_code CODE_FOR_neon_usdotv16qi = CODE_FOR_neon_usdotv4siv16qi;
> +
>  #define CF(N,X) CODE_FOR_neon_##N##X
>
>  #define VAR1(T, N, A) \
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index fa4a7aeda35..6892b7b0f44 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -2989,7 +2989,7 @@ (define_expand "cmul<conj_op><mode>3"
>  ;; ...
>  ;;
>  ;; and so the vectorizer provides r, in which the result has to be accumulated.
> -(define_insn "<sup>dot_prod<vsi2qi>"
> +(define_insn "<sup>dot_prod<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>         (plus:VCVTI
>           (unspec:VCVTI [(match_operand:<VSI2QI> 1 "register_operand" "w")
> @@ -3002,7 +3002,7 @@ (define_insn "<sup>dot_prod<vsi2qi>"
>  )
>
>  ;; These instructions map to the __builtins for the Dot Product operations
> -(define_expand "neon_<sup>dot<vsi2qi>"
> +(define_expand "neon_<sup>dot<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>         (plus:VCVTI
>           (unspec:VCVTI [(match_operand:<VSI2QI> 2 "register_operand")
> @@ -3013,7 +3013,7 @@ (define_expand "neon_<sup>dot<vsi2qi>"
>  )
>
>  ;; These instructions map to the __builtins for the Dot Product operations.
> -(define_insn "neon_usdot<vsi2qi>"
> +(define_insn "neon_usdot<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>         (plus:VCVTI
>           (unspec:VCVTI
> @@ -3112,7 +3112,7 @@ (define_insn "neon_<sup>dot_laneq<vsi2qi>"
>  )
>
>  ;; Auto-vectorizer pattern for usdot
> -(define_expand "usdot_prod<vsi2qi>"
> +(define_expand "usdot_prod<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand")
>         (plus:VCVTI (unspec:VCVTI [(match_operand:<VSI2QI> 1
>                                                         "register_operand")
> --
> 2.34.1
>
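
The aliasing scheme described in item 2 of the commit message can be tried outside of GCC. The following is only a minimal standalone sketch, not code from the patch: the `insn_code` enum here is a hypothetical stand-in for the one GCC generates at build time, and the `sdot_code_v8qi` / `sdot_code_v16qi` helpers are invented for illustration. Only the alias definitions and the `CF` macro follow the shape shown in the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's generated insn_code enum; the real
// enum is produced at build time and has many more entries.
enum insn_code {
  CODE_FOR_nothing,
  CODE_FOR_neon_sdotv2siv8qi,   // 2-mode name: V2SI accumulator, V8QI inputs
  CODE_FOR_neon_sdotv4siv16qi   // 2-mode name: V4SI accumulator, V16QI inputs
};

// Aliases in the style of the patch: the old single-mode names remain
// valid identifiers that evaluate to the new 2-mode codes.
constexpr insn_code CODE_FOR_neon_sdotv8qi = CODE_FOR_neon_sdotv2siv8qi;
constexpr insn_code CODE_FOR_neon_sdotv16qi = CODE_FOR_neon_sdotv4siv16qi;

// Token-pasting macro of the same form as in the diff context.
#define CF(N, X) CODE_FOR_neon_##N##X

// Hypothetical helpers: look up a code through the single-mode name,
// as a table entry expanded via CF would.
constexpr insn_code sdot_code_v8qi() { return CF(sdot, v8qi); }
constexpr insn_code sdot_code_v16qi() { return CF(sdot, v16qi); }

// The single-mode spelling resolves to the 2-mode code at compile time.
static_assert(sdot_code_v8qi() == CODE_FOR_neon_sdotv2siv8qi,
              "v8qi alias should resolve to the v2siv8qi code");
static_assert(sdot_code_v16qi() == CODE_FOR_neon_sdotv4siv16qi,
              "v16qi alias should resolve to the v4siv16qi code");
```

Because the aliases are `constexpr` values of the enum type, anything that pastes together the old single-mode name keeps compiling unchanged, which appears to be why the patch can leave the `.def` entries alone.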
