Hi Richard, > -----Original Message----- > From: Richard Sandiford <richard.sandif...@arm.com> > Sent: 16 September 2020 11:15 > To: gcc-patches@gcc.gnu.org > Cc: ni...@redhat.com; Richard Earnshaw <richard.earns...@arm.com>; > Ramana Radhakrishnan <ramana.radhakrish...@arm.com>; Kyrylo > Tkachov <kyrylo.tkac...@arm.com>; Dennis Zhang > <dennis.zh...@arm.com> > Subject: Re: [PATCH] arm: Add new vector mode macros > > Ping > > Richard Sandiford <richard.sandif...@arm.com> writes: > > [ This is related to Dennis's subtraction patch > > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553339.html > > and the discussion about how the patterns were written. I wanted > > to see whether there was a way that we could simplify the current > > addition handling that might perhaps make it easier to add other > > MVE operations in future. It seemed like one of those situations > > in which the most productive thing would be to try it and see, > > rather than just describe it in words. > > > > One of the questions Ramana had in the thread above was: why does > > MVE not need the flag_unsafe_math_optimizations flag? AIUI the reason > > is that MVE honours the FPSCR.FZ flag while SF Advanced SIMD always > > flushes to zero. (HF Advanced SIMD honours FPSCR.FZ16 and so also > > doesn't need flag_unsafe_math_optimizations.) ] > > > > The AArch32 port now has three vector extensions: iwMMXt, Neon > > and MVE. We already have some named expanders that are shared > > by all three, and soon we'll need more. > > > > One way of handling this would be to use define_mode_iterators > > that specify the condition for each mode. For example, > > > > (V16QI "TARGET_NEON || TARGET_HAVE_MVE") > > (V8QI "TARGET_NEON || TARGET_REALLY_IWMXXT") > > ... > > (V2SF "TARGET_NEON && flag_unsafe_math_optimizations") > > > > etc. However, we'll need several mode iterators, and it would > > be repetitive to specify the mode condition every time. > > > > This patch therefore introduces per-mode macros that say whether > > we can perform general arithmetic on the mode. Initially there are > > two sets of macros: > > > > ARM_HAVE_NEON_<MODE>_ARITH > > true if Neon can handle general arithmetic on <MODE> > > > > ARM_HAVE_<MODE>_ARITH > > true if any vector extension can handle general arithmetic on <MODE> > > > > The macro definitions themselves are undeniably ugly, but hopefully > > they're justified by the simplifications they allow. > > > > The patch converts the addition patterns to use this scheme. > > > > Previously there were three copies of the V8HF and V4HF addition > > patterns for Neon: > > > > (1) *add<VDQ:mode>3_neon, which provided plus:VnHF even without > > TARGET_NEON_FP16INST. This was probably harmless since all the > > named patterns had an appropriate guard, but it is possible that > > something could have tried to generate the plus directly, such as > > by using a REG_EQUAL note to generate a new pattern. > > > > (2) addv8hf3_neon and addv4hf3, which had the correct > > TARGET_NEON_FP16INST target condition, but unnecessarily required > > flag_unsafe_math_optimizations. Unlike VnSF operations, VnHF > > operations do not force flush to zero. > > > > (3) add<VH:mode>3_fp16, which provided plus:VnHF with the > > correct conditions (TARGET_NEON_FP16INST, with no > > flag_unsafe_math_optimizations test). > > > > The patch in essence renames add<VH:mode>3_fp16 to > *add<VH:mode>3_neon > > (part of *add<VDQ:mode>3_neon) and removes the other two patterns. > > > > WDYT? Does this look like a way forward?
This looks like a productive way forward to me. Okay if the other maintainer don't object by the end of the week. Thanks, Kyrill > > > > Tested on arm-linux-gnueabihf and armeb-eabi. > > > > Thanks, > > Richard > > > > > > gcc/ > > * config/arm/arm.h (ARM_HAVE_NEON_V8QI_ARITH, > ARM_HAVE_NEON_V4HI_ARITH) > > (ARM_HAVE_NEON_V2SI_ARITH, ARM_HAVE_NEON_V16QI_ARITH): > New macros. > > (ARM_HAVE_NEON_V8HI_ARITH, ARM_HAVE_NEON_V4SI_ARITH): > Likewise. > > (ARM_HAVE_NEON_V2DI_ARITH, ARM_HAVE_NEON_V4HF_ARITH): > Likewise. > > (ARM_HAVE_NEON_V8HF_ARITH, ARM_HAVE_NEON_V2SF_ARITH): > Likewise. > > (ARM_HAVE_NEON_V4SF_ARITH, ARM_HAVE_V8QI_ARITH, > ARM_HAVE_V4HI_ARITH) > > (ARM_HAVE_V2SI_ARITH, ARM_HAVE_V16QI_ARITH, > ARM_HAVE_V8HI_ARITH) > > (ARM_HAVE_V4SI_ARITH, ARM_HAVE_V2DI_ARITH, > ARM_HAVE_V4HF_ARITH) > > (ARM_HAVE_V2SF_ARITH, ARM_HAVE_V8HF_ARITH, > ARM_HAVE_V4SF_ARITH): > > Likewise. > > * config/arm/iterators.md (VNIM, VNINOTM): Delete. > > * config/arm/vec-common.md (add<VNIM:mode>3, addv8hf3) > > (add<VNINOTM:mode>3): Replace with... > > (add<VDQ:mode>3): ...this new expander. > > * config/arm/neon.md (*add<VDQ:mode>3_neon): Use the new > > ARM_HAVE_NEON_<MODE>_ARITH macros as the C condition. > > (addv8hf3_neon, addv4hf3, add<VFH:mode>3_fp16): Delete in > > favor of the above. > > (neon_vadd<VH:mode>): Use gen_add<mode>3 instead of > > gen_add<mode>3_fp16. > > --- > > gcc/config/arm/arm.h | 41 +++++++++++++++++++++++++++++++ > > gcc/config/arm/iterators.md | 8 ------ > > gcc/config/arm/neon.md | 47 ++---------------------------------- > > gcc/config/arm/vec-common.md | 42 ++++---------------------------- > > 4 files changed, 48 insertions(+), 90 deletions(-) > > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > > index 3887c51eebe..3284ae29d7c 100644 > > --- a/gcc/config/arm/arm.h > > +++ b/gcc/config/arm/arm.h > > @@ -1106,6 +1106,47 @@ extern const int arm_arch_cde_coproc_bits[]; > > #define VALID_MVE_STRUCT_MODE(MODE) \ > > ((MODE) == TImode || (MODE) == OImode || (MODE) == XImode) > > > > +/* The conditions under which vector modes are supported for general > > + arithmetic using Neon. */ > > + > > +#define ARM_HAVE_NEON_V8QI_ARITH TARGET_NEON > > +#define ARM_HAVE_NEON_V4HI_ARITH TARGET_NEON > > +#define ARM_HAVE_NEON_V2SI_ARITH TARGET_NEON > > + > > +#define ARM_HAVE_NEON_V16QI_ARITH TARGET_NEON > > +#define ARM_HAVE_NEON_V8HI_ARITH TARGET_NEON > > +#define ARM_HAVE_NEON_V4SI_ARITH TARGET_NEON > > +#define ARM_HAVE_NEON_V2DI_ARITH TARGET_NEON > > + > > +/* HF operations have their own flush-to-zero control (FPSCR.FZ16). */ > > +#define ARM_HAVE_NEON_V4HF_ARITH TARGET_NEON_FP16INST > > +#define ARM_HAVE_NEON_V8HF_ARITH TARGET_NEON_FP16INST > > + > > +/* SF operations always flush to zero, regardless of FPSCR.FZ, so we can > > + only use them for general arithmetic when -funsafe-math-optimizations > > + is in effect. */ > > +#define ARM_HAVE_NEON_V2SF_ARITH \ > > + (TARGET_NEON && flag_unsafe_math_optimizations) > > +#define ARM_HAVE_NEON_V4SF_ARITH ARM_HAVE_NEON_V2SF_ARITH > > + > > +/* The conditions under which vector modes are supported for general > > + arithmetic by any vector extension. */ > > + > > +#define ARM_HAVE_V8QI_ARITH (ARM_HAVE_NEON_V8QI_ARITH || > TARGET_REALLY_IWMMXT) > > +#define ARM_HAVE_V4HI_ARITH (ARM_HAVE_NEON_V4HI_ARITH || > TARGET_REALLY_IWMMXT) > > +#define ARM_HAVE_V2SI_ARITH (ARM_HAVE_NEON_V2SI_ARITH || > TARGET_REALLY_IWMMXT) > > + > > +#define ARM_HAVE_V16QI_ARITH (ARM_HAVE_NEON_V16QI_ARITH || > TARGET_HAVE_MVE) > > +#define ARM_HAVE_V8HI_ARITH (ARM_HAVE_NEON_V8HI_ARITH || > TARGET_HAVE_MVE) > > +#define ARM_HAVE_V4SI_ARITH (ARM_HAVE_NEON_V4SI_ARITH || > TARGET_HAVE_MVE) > > +#define ARM_HAVE_V2DI_ARITH ARM_HAVE_NEON_V2DI_ARITH > > + > > +#define ARM_HAVE_V4HF_ARITH ARM_HAVE_NEON_V4HF_ARITH > > +#define ARM_HAVE_V2SF_ARITH ARM_HAVE_NEON_V2SF_ARITH > > + > > +#define ARM_HAVE_V8HF_ARITH (ARM_HAVE_NEON_V8HF_ARITH || > TARGET_HAVE_MVE_FLOAT) > > +#define ARM_HAVE_V4SF_ARITH (ARM_HAVE_NEON_V4SF_ARITH || > TARGET_HAVE_MVE_FLOAT) > > + > > /* The register numbers in sequence, for passing to > arm_gen_load_multiple. */ > > extern int arm_regs_in_sequence[]; > > > > diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md > > index 0bc9eba0722..c70e3bc2731 100644 > > --- a/gcc/config/arm/iterators.md > > +++ b/gcc/config/arm/iterators.md > > @@ -66,14 +66,6 @@ (define_mode_iterator VSHFT [V4HI V2SI DI]) > > ;; Integer and float modes supported by Neon and IWMMXT. > > (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI > V4SF]) > > > > -;; Integer and float modes supported by Neon, IWMMXT and MVE, used by > > -;; arithmetic epxand patterns. > > -(define_mode_iterator VNIM [V16QI V8HI V4SI V4SF]) > > - > > -;; Integer and float modes supported by Neon and IWMMXT but not MVE, > used by > > -;; arithmetic epxand patterns. > > -(define_mode_iterator VNINOTM [V2SI V4HI V8QI V2SF V2DI]) > > - > > ;; Integer and float modes supported by Neon, IWMMXT and MVE. > > (define_mode_iterator VNIM1 [V16QI V8HI V4SI V4SF V2DI]) > > > > diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md > > index 3e7b51d8ab6..96bf277f501 100644 > > --- a/gcc/config/arm/neon.md > > +++ b/gcc/config/arm/neon.md > > @@ -501,7 +501,7 @@ (define_insn "*add<mode>3_neon" > > [(set (match_operand:VDQ 0 "s_register_operand" "=w") > > (plus:VDQ (match_operand:VDQ 1 "s_register_operand" "w") > > (match_operand:VDQ 2 "s_register_operand" "w")))] > > - "TARGET_NEON && (!<Is_float_mode> || > flag_unsafe_math_optimizations)" > > + "ARM_HAVE_NEON_<MODE>_ARITH" > > "vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2" > > [(set (attr "type") > > (if_then_else (match_test "<Is_float_mode>") > > @@ -509,49 +509,6 @@ (define_insn "*add<mode>3_neon" > > (const_string "neon_add<q>")))] > > ) > > > > -;; As with SFmode, full support for HFmode vector arithmetic is only > available > > -;; when flag-unsafe-math-optimizations is enabled. > > - > > -;; Add pattern with modes V8HF and V4HF is split into separate patterns to > add > > -;; support for standard pattern addv8hf3 in MVE. Following pattern is > called > > -;; from "addv8hf3" standard pattern inside vec-common.md file. > > - > > -(define_insn "addv8hf3_neon" > > - [(set > > - (match_operand:V8HF 0 "s_register_operand" "=w") > > - (plus:V8HF > > - (match_operand:V8HF 1 "s_register_operand" "w") > > - (match_operand:V8HF 2 "s_register_operand" "w")))] > > - "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations" > > - "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2" > > - [(set_attr "type" "neon_fp_addsub_s_q")] > > -) > > - > > -(define_insn "addv4hf3" > > - [(set > > - (match_operand:V4HF 0 "s_register_operand" "=w") > > - (plus:V4HF > > - (match_operand:V4HF 1 "s_register_operand" "w") > > - (match_operand:V4HF 2 "s_register_operand" "w")))] > > - "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations" > > - "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2" > > - [(set_attr "type" "neon_fp_addsub_s_q")] > > -) > > - > > -(define_insn "add<mode>3_fp16" > > - [(set > > - (match_operand:VH 0 "s_register_operand" "=w") > > - (plus:VH > > - (match_operand:VH 1 "s_register_operand" "w") > > - (match_operand:VH 2 "s_register_operand" "w")))] > > - "TARGET_NEON_FP16INST" > > - "vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2" > > - [(set (attr "type") > > - (if_then_else (match_test "<Is_float_mode>") > > - (const_string "neon_fp_addsub_s<q>") > > - (const_string "neon_add<q>")))] > > -) > > - > > (define_insn "*sub<mode>3_neon" > > [(set (match_operand:VDQ 0 "s_register_operand" "=w") > > (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w") > > @@ -1837,7 +1794,7 @@ (define_expand "neon_vadd<mode>" > > (match_operand:VH 2 "s_register_operand")] > > "TARGET_NEON_FP16INST" > > { > > - emit_insn (gen_add<mode>3_fp16 (operands[0], operands[1], > operands[2])); > > + emit_insn (gen_add<mode>3 (operands[0], operands[1], operands[2])); > > DONE; > > }) > > > > diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec- > common.md > > index b7e3619caf4..c3c86c46355 100644 > > --- a/gcc/config/arm/vec-common.md > > +++ b/gcc/config/arm/vec-common.md > > @@ -81,43 +81,11 @@ (define_expand "movv8hf" > > ;; patterns separately for Neon, IWMMXT and MVE. > > > > (define_expand "add<mode>3" > > - [(set (match_operand:VNIM 0 "s_register_operand") > > - (plus:VNIM (match_operand:VNIM 1 "s_register_operand") > > - (match_operand:VNIM 2 "s_register_operand")))] > > - "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != > V4SFmode) > > - || flag_unsafe_math_optimizations)) > > - || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE > (<MODE>mode)) > > - || (TARGET_HAVE_MVE && VALID_MVE_SI_MODE(<MODE>mode)) > > - || (TARGET_HAVE_MVE_FLOAT && > VALID_MVE_SF_MODE(<MODE>mode))" > > -{ > > -}) > > - > > -;; Vector arithmetic. Expanders are blank, then unnamed insns implement > > -;; patterns separately for Neon and MVE. > > - > > -(define_expand "addv8hf3" > > - [(set (match_operand:V8HF 0 "s_register_operand") > > - (plus:V8HF (match_operand:V8HF 1 "s_register_operand") > > - (match_operand:V8HF 2 "s_register_operand")))] > > - "(TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE(V8HFmode)) > > - || (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations)" > > -{ > > - if (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations) > > - emit_insn (gen_addv8hf3_neon (operands[0], operands[1], > operands[2])); > > -}) > > - > > -;; Vector arithmetic. Expanders are blank, then unnamed insns implement > > -;; patterns separately for Neon and IWMMXT. > > - > > -(define_expand "add<mode>3" > > - [(set (match_operand:VNINOTM 0 "s_register_operand") > > - (plus:VNINOTM (match_operand:VNINOTM 1 "s_register_operand") > > - (match_operand:VNINOTM 2 "s_register_operand")))] > > - "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != > V4SFmode) > > - || flag_unsafe_math_optimizations)) > > - || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE > (<MODE>mode))" > > -{ > > -}) > > + [(set (match_operand:VDQ 0 "s_register_operand") > > + (plus:VDQ (match_operand:VDQ 1 "s_register_operand") > > + (match_operand:VDQ 2 "s_register_operand")))] > > + "ARM_HAVE_<MODE>_ARITH" > > +) > > > > ;; Vector arithmetic. Expanders are blank, then unnamed insns implement > > ;; patterns separately for IWMMXT and Neon.