Ping

Richard Sandiford <richard.sandif...@arm.com> writes:
> [ This is related to Dennis's subtraction patch
>   https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553339.html
>   and the discussion about how the patterns were written.  I wanted
>   to see whether there was a way that we could simplify the current
>   addition handling that might perhaps make it easier to add other
>   MVE operations in future.  It seemed like one of those situations
>   in which the most productive thing would be to try it and see,
>   rather than just describe it in words.
>
>   One of the questions Ramana had in the thread above was: why does
>   MVE not need the flag_unsafe_math_optimizations flag?  AIUI the reason
>   is that MVE honours the FPSCR.FZ flag while SF Advanced SIMD always
>   flushes to zero.  (HF Advanced SIMD honours FPSCR.FZ16 and so also
>   doesn't need flag_unsafe_math_optimizations.) ]
>
> The AArch32 port now has three vector extensions: iwMMXt, Neon
> and MVE.  We already have some named expanders that are shared
> by all three, and soon we'll need more.
>
> One way of handling this would be to use define_mode_iterators
> that specify the condition for each mode.  For example,
>
>   (V16QI "TARGET_NEON || TARGET_HAVE_MVE")
>   (V8QI "TARGET_NEON || TARGET_REALLY_IWMXXT")
>   ...
>   (V2SF "TARGET_NEON && flag_unsafe_math_optimizations")
>
> etc.  However, we'll need several mode iterators, and it would
> be repetitive to specify the mode condition every time.
>
> This patch therefore introduces per-mode macros that say whether
> we can perform general arithmetic on the mode.  Initially there are
> two sets of macros:
>
> ARM_HAVE_NEON_<MODE>_ARITH
>   true if Neon can handle general arithmetic on <MODE>
>
> ARM_HAVE_<MODE>_ARITH
>   true if any vector extension can handle general arithmetic on <MODE>
>
> The macro definitions themselves are undeniably ugly, but hopefully
> they're justified by the simplifications they allow.
>
> The patch converts the addition patterns to use this scheme.
>
> Previously there were three copies of the V8HF and V4HF addition
> patterns for Neon:
>
> (1) *add<VDQ:mode>3_neon, which provided plus:VnHF even without
>     TARGET_NEON_FP16INST.  This was probably harmless since all the
>     named patterns had an appropriate guard, but it is possible that
>     something could have tried to generate the plus directly, such as
>     by using a REG_EQUAL note to generate a new pattern.
>
> (2) addv8hf3_neon and addv4hf3, which had the correct
>     TARGET_NEON_FP16INST target condition, but unnecessarily required
>     flag_unsafe_math_optimizations.  Unlike VnSF operations, VnHF
>     operations do not force flush to zero.
>
> (3) add<VH:mode>3_fp16, which provided plus:VnHF with the
>     correct conditions (TARGET_NEON_FP16INST, with no
>     flag_unsafe_math_optimizations test).
>
> The patch in essence renames add<VH:mode>3_fp16 to *add<VH:mode>3_neon
> (part of *add<VDQ:mode>3_neon) and removes the other two patterns.
>
> WDYT?  Does this look like a way forward?
>
> Tested on arm-linux-gnueabihf and armeb-eabi.
>
> Thanks,
> Richard
>
>
> gcc/
>       * config/arm/arm.h (ARM_HAVE_NEON_V8QI_ARITH, ARM_HAVE_NEON_V4HI_ARITH)
>       (ARM_HAVE_NEON_V2SI_ARITH, ARM_HAVE_NEON_V16QI_ARITH): New macros.
>       (ARM_HAVE_NEON_V8HI_ARITH, ARM_HAVE_NEON_V4SI_ARITH): Likewise.
>       (ARM_HAVE_NEON_V2DI_ARITH, ARM_HAVE_NEON_V4HF_ARITH): Likewise.
>       (ARM_HAVE_NEON_V8HF_ARITH, ARM_HAVE_NEON_V2SF_ARITH): Likewise.
>       (ARM_HAVE_NEON_V4SF_ARITH, ARM_HAVE_V8QI_ARITH, ARM_HAVE_V4HI_ARITH)
>       (ARM_HAVE_V2SI_ARITH, ARM_HAVE_V16QI_ARITH, ARM_HAVE_V8HI_ARITH)
>       (ARM_HAVE_V4SI_ARITH, ARM_HAVE_V2DI_ARITH, ARM_HAVE_V4HF_ARITH)
>       (ARM_HAVE_V2SF_ARITH, ARM_HAVE_V8HF_ARITH, ARM_HAVE_V4SF_ARITH):
>       Likewise.
>       * config/arm/iterators.md (VNIM, VNINOTM): Delete.
>       * config/arm/vec-common.md (add<VNIM:mode>3, addv8hf3)
>       (add<VNINOTM:mode>3): Replace with...
>       (add<VDQ:mode>3): ...this new expander.
>       * config/arm/neon.md (*add<VDQ:mode>3_neon): Use the new
>       ARM_HAVE_NEON_<MODE>_ARITH macros as the C condition.
>       (addv8hf3_neon, addv4hf3, add<VFH:mode>3_fp16): Delete in
>       favor of the above.
>       (neon_vadd<VH:mode>): Use gen_add<mode>3 instead of
>       gen_add<mode>3_fp16.
> ---
>  gcc/config/arm/arm.h         | 41 +++++++++++++++++++++++++++++++
>  gcc/config/arm/iterators.md  |  8 ------
>  gcc/config/arm/neon.md       | 47 ++----------------------------------
>  gcc/config/arm/vec-common.md | 42 ++++----------------------------
>  4 files changed, 48 insertions(+), 90 deletions(-)
>
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 3887c51eebe..3284ae29d7c 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1106,6 +1106,47 @@ extern const int arm_arch_cde_coproc_bits[];
>  #define VALID_MVE_STRUCT_MODE(MODE) \
>    ((MODE) == TImode || (MODE) == OImode || (MODE) == XImode)
>  
> +/* The conditions under which vector modes are supported for general
> +   arithmetic using Neon.  */
> +
> +#define ARM_HAVE_NEON_V8QI_ARITH TARGET_NEON
> +#define ARM_HAVE_NEON_V4HI_ARITH TARGET_NEON
> +#define ARM_HAVE_NEON_V2SI_ARITH TARGET_NEON
> +
> +#define ARM_HAVE_NEON_V16QI_ARITH TARGET_NEON
> +#define ARM_HAVE_NEON_V8HI_ARITH TARGET_NEON
> +#define ARM_HAVE_NEON_V4SI_ARITH TARGET_NEON
> +#define ARM_HAVE_NEON_V2DI_ARITH TARGET_NEON
> +
> +/* HF operations have their own flush-to-zero control (FPSCR.FZ16).  */
> +#define ARM_HAVE_NEON_V4HF_ARITH TARGET_NEON_FP16INST
> +#define ARM_HAVE_NEON_V8HF_ARITH TARGET_NEON_FP16INST
> +
> +/* SF operations always flush to zero, regardless of FPSCR.FZ, so we can
> +   only use them for general arithmetic when -funsafe-math-optimizations
> +   is in effect.  */
> +#define ARM_HAVE_NEON_V2SF_ARITH \
> +  (TARGET_NEON && flag_unsafe_math_optimizations)
> +#define ARM_HAVE_NEON_V4SF_ARITH ARM_HAVE_NEON_V2SF_ARITH
> +
> +/* The conditions under which vector modes are supported for general
> +   arithmetic by any vector extension.  */
> +
> +#define ARM_HAVE_V8QI_ARITH (ARM_HAVE_NEON_V8QI_ARITH || 
> TARGET_REALLY_IWMMXT)
> +#define ARM_HAVE_V4HI_ARITH (ARM_HAVE_NEON_V4HI_ARITH || 
> TARGET_REALLY_IWMMXT)
> +#define ARM_HAVE_V2SI_ARITH (ARM_HAVE_NEON_V2SI_ARITH || 
> TARGET_REALLY_IWMMXT)
> +
> +#define ARM_HAVE_V16QI_ARITH (ARM_HAVE_NEON_V16QI_ARITH || TARGET_HAVE_MVE)
> +#define ARM_HAVE_V8HI_ARITH (ARM_HAVE_NEON_V8HI_ARITH || TARGET_HAVE_MVE)
> +#define ARM_HAVE_V4SI_ARITH (ARM_HAVE_NEON_V4SI_ARITH || TARGET_HAVE_MVE)
> +#define ARM_HAVE_V2DI_ARITH ARM_HAVE_NEON_V2DI_ARITH
> +
> +#define ARM_HAVE_V4HF_ARITH ARM_HAVE_NEON_V4HF_ARITH
> +#define ARM_HAVE_V2SF_ARITH ARM_HAVE_NEON_V2SF_ARITH
> +
> +#define ARM_HAVE_V8HF_ARITH (ARM_HAVE_NEON_V8HF_ARITH || 
> TARGET_HAVE_MVE_FLOAT)
> +#define ARM_HAVE_V4SF_ARITH (ARM_HAVE_NEON_V4SF_ARITH || 
> TARGET_HAVE_MVE_FLOAT)
> +
>  /* The register numbers in sequence, for passing to arm_gen_load_multiple.  
> */
>  extern int arm_regs_in_sequence[];
>  
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index 0bc9eba0722..c70e3bc2731 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -66,14 +66,6 @@ (define_mode_iterator VSHFT [V4HI V2SI DI])
>  ;; Integer and float modes supported by Neon and IWMMXT.
>  (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
>  
> -;; Integer and float modes supported by Neon, IWMMXT and MVE, used by
> -;; arithmetic epxand patterns.
> -(define_mode_iterator VNIM [V16QI V8HI V4SI V4SF])
> -
> -;; Integer and float modes supported by Neon and IWMMXT but not MVE, used by
> -;; arithmetic epxand patterns.
> -(define_mode_iterator VNINOTM [V2SI V4HI V8QI V2SF V2DI])
> -
>  ;; Integer and float modes supported by Neon, IWMMXT and MVE.
>  (define_mode_iterator VNIM1 [V16QI V8HI V4SI V4SF V2DI])
>  
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index 3e7b51d8ab6..96bf277f501 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -501,7 +501,7 @@ (define_insn "*add<mode>3_neon"
>    [(set (match_operand:VDQ 0 "s_register_operand" "=w")
>          (plus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
>                 (match_operand:VDQ 2 "s_register_operand" "w")))]
> -  "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> +  "ARM_HAVE_NEON_<MODE>_ARITH"
>    "vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
>    [(set (attr "type")
>        (if_then_else (match_test "<Is_float_mode>")
> @@ -509,49 +509,6 @@ (define_insn "*add<mode>3_neon"
>                      (const_string "neon_add<q>")))]
>  )
>  
> -;; As with SFmode, full support for HFmode vector arithmetic is only 
> available
> -;; when flag-unsafe-math-optimizations is enabled.
> -
> -;; Add pattern with modes V8HF and V4HF is split into separate patterns to 
> add
> -;; support for standard pattern addv8hf3 in MVE.  Following pattern is called
> -;; from "addv8hf3" standard pattern inside vec-common.md file.
> -
> -(define_insn "addv8hf3_neon"
> -  [(set
> -    (match_operand:V8HF 0 "s_register_operand" "=w")
> -    (plus:V8HF
> -     (match_operand:V8HF 1 "s_register_operand" "w")
> -     (match_operand:V8HF 2 "s_register_operand" "w")))]
> - "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations"
> - "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
> - [(set_attr "type" "neon_fp_addsub_s_q")]
> -)
> -
> -(define_insn "addv4hf3"
> -  [(set
> -    (match_operand:V4HF 0 "s_register_operand" "=w")
> -    (plus:V4HF
> -     (match_operand:V4HF 1 "s_register_operand" "w")
> -     (match_operand:V4HF 2 "s_register_operand" "w")))]
> - "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations"
> - "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
> - [(set_attr "type" "neon_fp_addsub_s_q")]
> -)
> -
> -(define_insn "add<mode>3_fp16"
> -  [(set
> -    (match_operand:VH 0 "s_register_operand" "=w")
> -    (plus:VH
> -     (match_operand:VH 1 "s_register_operand" "w")
> -     (match_operand:VH 2 "s_register_operand" "w")))]
> - "TARGET_NEON_FP16INST"
> - "vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
> - [(set (attr "type")
> -   (if_then_else (match_test "<Is_float_mode>")
> -    (const_string "neon_fp_addsub_s<q>")
> -    (const_string "neon_add<q>")))]
> -)
> -
>  (define_insn "*sub<mode>3_neon"
>    [(set (match_operand:VDQ 0 "s_register_operand" "=w")
>          (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
> @@ -1837,7 +1794,7 @@ (define_expand "neon_vadd<mode>"
>     (match_operand:VH 2 "s_register_operand")]
>    "TARGET_NEON_FP16INST"
>  {
> -  emit_insn (gen_add<mode>3_fp16 (operands[0], operands[1], operands[2]));
> +  emit_insn (gen_add<mode>3 (operands[0], operands[1], operands[2]));
>    DONE;
>  })
>  
> diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
> index b7e3619caf4..c3c86c46355 100644
> --- a/gcc/config/arm/vec-common.md
> +++ b/gcc/config/arm/vec-common.md
> @@ -81,43 +81,11 @@ (define_expand "movv8hf"
>  ;; patterns separately for Neon, IWMMXT and MVE.
>  
>  (define_expand "add<mode>3"
> -  [(set (match_operand:VNIM 0 "s_register_operand")
> -     (plus:VNIM (match_operand:VNIM 1 "s_register_operand")
> -                (match_operand:VNIM 2 "s_register_operand")))]
> -  "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != V4SFmode)
> -                 || flag_unsafe_math_optimizations))
> -   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))
> -   || (TARGET_HAVE_MVE && VALID_MVE_SI_MODE(<MODE>mode))
> -   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE(<MODE>mode))"
> -{
> -})
> -
> -;; Vector arithmetic.  Expanders are blank, then unnamed insns implement
> -;; patterns separately for Neon and MVE.
> -
> -(define_expand "addv8hf3"
> -  [(set (match_operand:V8HF 0 "s_register_operand")
> -     (plus:V8HF (match_operand:V8HF 1 "s_register_operand")
> -                (match_operand:V8HF 2 "s_register_operand")))]
> -  "(TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE(V8HFmode))
> -   || (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations)"
> -{
> -  if (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations)
> -    emit_insn (gen_addv8hf3_neon (operands[0], operands[1], operands[2]));
> -})
> -
> -;; Vector arithmetic.  Expanders are blank, then unnamed insns implement
> -;; patterns separately for Neon and IWMMXT.
> -
> -(define_expand "add<mode>3"
> -  [(set (match_operand:VNINOTM 0 "s_register_operand")
> -     (plus:VNINOTM (match_operand:VNINOTM 1 "s_register_operand")
> -                   (match_operand:VNINOTM 2 "s_register_operand")))]
> -  "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != V4SFmode)
> -                 || flag_unsafe_math_optimizations))
> -   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
> -{
> -})
> +  [(set (match_operand:VDQ 0 "s_register_operand")
> +     (plus:VDQ (match_operand:VDQ 1 "s_register_operand")
> +               (match_operand:VDQ 2 "s_register_operand")))]
> +  "ARM_HAVE_<MODE>_ARITH"
> +)
>  
>  ;; Vector arithmetic. Expanders are blank, then unnamed insns implement
>  ;; patterns separately for IWMMXT and Neon.

Reply via email to