On Thu, Sep 5, 2024 at 4:30 PM Victor Do Nascimento <victor.donascime...@arm.com> wrote:
>
> Changes from previous revision:
>
> As was done for the equivalent aarch64 patch, we rework this patch to do
> away with mission creep, keeping changes as simple as possible.
>
> We thus remove the `gimple_fold_builtin' changes that would have replaced
> the dot-product builtin calls with DOT_PROD_EXPRs as well as the novel
> initialization mechanism for dot-product builtins, choosing instead to
> redirect the single-mode CODE_FOR_neon_(u|s|us)dot* values generated from
> `arm_neon_builtins.def' to their new 2-mode equivalents.
>
> Regression tested on arm-none-linux-gnueabihf, no new failures identified.

Presumably with a suitable v8.x-a as the default, so that the patterns are
actually tested? I.e. vusdot-autovec.c passes in your testing?

If so, yeah, OK if no regressions.

regards
Ramana

> ------
>
> Given recent changes to the dot_prod standard pattern name, this patch
> fixes the arm back-end by implementing the following changes:
>
> 1. Add 2nd mode to all patterns relating to the dot-product in .md
> files.
> 2. Redirect the single-mode CODE_FOR_neon_(u|s|us)dot<mode> values
> generated from `arm_neon_builtins.def' to their new 2-mode
> equivalents by means of simple aliases, as per the following example:
>
>   constexpr insn_code CODE_FOR_neon_sdotv8qi
>     = CODE_FOR_neon_sdotv2siv8qi;
>
> gcc/ChangeLog:
>
>       * config/arm/neon.md (<sup>dot_prod<vsi2qi>): Renamed to...
>       (<sup>dot_prod<mode><vsi2qi>): ...this.
>       (neon_<sup>dot<vsi2qi>): Renamed to...
>       (neon_<sup>dot<mode><vsi2qi>): ...this.
>       (neon_usdot<vsi2qi>): Renamed to...
>       (neon_usdot<mode><vsi2qi>): ...this.
>       (usdot_prod<vsi2qi>): Renamed to...
>       (usdot_prod<mode><vsi2qi>): ...this.
>       * config/arm/arm-builtins.cc
>       (CODE_FOR_neon_sdotv8qi): Define as alias to
>       new CODE_FOR_neon_sdotv2siv8qi.
>       (CODE_FOR_neon_udotv8qi): Define as alias to
>       new CODE_FOR_neon_udotv2siv8qi.
>       (CODE_FOR_neon_usdotv8qi): Define as alias to
>       new CODE_FOR_neon_usdotv2siv8qi.
>       (CODE_FOR_neon_sdotv16qi): Define as alias to
>       new CODE_FOR_neon_sdotv4siv16qi.
>       (CODE_FOR_neon_udotv16qi): Define as alias to
>       new CODE_FOR_neon_udotv4siv16qi.
>       (CODE_FOR_neon_usdotv16qi): Define as alias to
>       new CODE_FOR_neon_usdotv4siv16qi.
> ---
>  gcc/config/arm/arm-builtins.cc | 7 +++++++
>  gcc/config/arm/neon.md         | 8 ++++----
>  2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
> index c9d50bf8fbb..74cea8900b4 100644
> --- a/gcc/config/arm/arm-builtins.cc
> +++ b/gcc/config/arm/arm-builtins.cc
> @@ -908,6 +908,13 @@ typedef struct {
>    enum arm_type_qualifiers *qualifiers;
>  } arm_builtin_datum;
>
> +constexpr insn_code CODE_FOR_neon_sdotv8qi = CODE_FOR_neon_sdotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_udotv8qi = CODE_FOR_neon_udotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_usdotv8qi = CODE_FOR_neon_usdotv2siv8qi;
> +constexpr insn_code CODE_FOR_neon_sdotv16qi = CODE_FOR_neon_sdotv4siv16qi;
> +constexpr insn_code CODE_FOR_neon_udotv16qi = CODE_FOR_neon_udotv4siv16qi;
> +constexpr insn_code CODE_FOR_neon_usdotv16qi = CODE_FOR_neon_usdotv4siv16qi;
> +
>  #define CF(N,X) CODE_FOR_neon_##N##X
>
>  #define VAR1(T, N, A) \
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index fa4a7aeda35..6892b7b0f44 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -2989,7 +2989,7 @@ (define_expand "cmul<conj_op><mode>3"
>  ;; ...
>  ;;
>  ;; and so the vectorizer provides r, in which the result has to be accumulated.
> -(define_insn "<sup>dot_prod<vsi2qi>"
> +(define_insn "<sup>dot_prod<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>          (plus:VCVTI
>            (unspec:VCVTI [(match_operand:<VSI2QI> 1 "register_operand" "w")
> @@ -3002,7 +3002,7 @@ (define_insn "<sup>dot_prod<vsi2qi>"
>  )
>
>  ;; These instructions map to the __builtins for the Dot Product operations
> -(define_expand "neon_<sup>dot<vsi2qi>"
> +(define_expand "neon_<sup>dot<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>          (plus:VCVTI
>            (unspec:VCVTI [(match_operand:<VSI2QI> 2 "register_operand")
> @@ -3013,7 +3013,7 @@ (define_expand "neon_<sup>dot<vsi2qi>"
>  )
>
>  ;; These instructions map to the __builtins for the Dot Product operations.
> -(define_insn "neon_usdot<vsi2qi>"
> +(define_insn "neon_usdot<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand" "=w")
>          (plus:VCVTI
>            (unspec:VCVTI
> @@ -3112,7 +3112,7 @@ (define_insn "neon_<sup>dot_laneq<vsi2qi>"
>  )
>
>  ;; Auto-vectorizer pattern for usdot
> -(define_expand "usdot_prod<vsi2qi>"
> +(define_expand "usdot_prod<mode><vsi2qi>"
>    [(set (match_operand:VCVTI 0 "register_operand")
>          (plus:VCVTI (unspec:VCVTI [(match_operand:<VSI2QI> 1
>                                      "register_operand")
> --
> 2.34.1
>
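
For anyone following along, here is a minimal standalone sketch of why the
aliases in arm-builtins.cc are enough to keep the builtin tables working after
the rename. It is only an illustration: the `insn_code` enum below is a tiny
hypothetical stand-in for the generated one, and `builtin_entry` and
`sdot_builtins` are made-up names, not the real `arm_builtin_datum` machinery.
The `CF` macro and the alias style are taken from the patch.

```cpp
// Hypothetical stand-in for GCC's generated insn_code enum.  After the
// rename in neon.md only the two-mode names exist.
enum insn_code {
  CODE_FOR_nothing,
  CODE_FOR_neon_sdotv2siv8qi,
  CODE_FOR_neon_sdotv4siv16qi
};

// Aliases in the style of the patch: old single-mode name -> new two-mode name.
constexpr insn_code CODE_FOR_neon_sdotv8qi = CODE_FOR_neon_sdotv2siv8qi;
constexpr insn_code CODE_FOR_neon_sdotv16qi = CODE_FOR_neon_sdotv4siv16qi;

// Token-pasting macro as in arm-builtins.cc: it still builds the
// single-mode identifier, which now resolves through the alias.
#define CF(N, X) CODE_FOR_neon_##N##X

// Hypothetical, heavily simplified table entry (not arm_builtin_datum).
struct builtin_entry {
  const char *name;
  insn_code code;
};

// The table is written exactly as before the rename.
constexpr builtin_entry sdot_builtins[] = {
  {"sdotv8qi", CF (sdot, v8qi)},
  {"sdotv16qi", CF (sdot, v16qi)},
};

// Compile-time check that the old spellings reach the new patterns.
static_assert (sdot_builtins[0].code == CODE_FOR_neon_sdotv2siv8qi,
               "v8qi alias should resolve to the V2SI/V8QI pattern");
static_assert (sdot_builtins[1].code == CODE_FOR_neon_sdotv4siv16qi,
               "v16qi alias should resolve to the V4SI/V16QI pattern");
```

Because the aliases are `constexpr`, they can be used wherever the enumerators
themselves could, so the macro-generated tables need no changes.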