<saurabh....@arm.com> writes:
> The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
> mandatory from Armv9.5-a. It introduces instructions for computing the
> floating point absolute maximum and minimum of the two vectors element-wise.
>
> This patch introduces AdvSIMD faminmax intrinsics. The intrinsics of
> this extension are implemented as the following builtin functions:
> * vamax_f16
> * vamaxq_f16
> * vamax_f32
> * vamaxq_f32
> * vamaxq_f64
> * vamin_f16
> * vaminq_f16
> * vamin_f32
> * vaminq_f32
> * vaminq_f64
>
> We are defining a new way to add AArch64 AdvSIMD intrinsics by listing
> all the intrinsics in a .def file and then using that .def file to
> initialise various data structures. This would lead to more concise code
> and easier addition of the new AdvSIMD intrinsics in future.
>
> The faminmax intrinsics are defined using the new approach.
>
> gcc/ChangeLog:
>
>       * config/aarch64/aarch64-builtins.cc
>       (ENTRY): Macro to parse the contents of
>       aarch64-simd-pragma-builtins.def.
>       (enum aarch64_builtins): New enum values for faminmax builtins
>       via aarch64-simd-pragma-builtins.def.
>       (enum aarch64_builtin_signatures): Enum to specify the
>       number of operands a builtin will take.
>       (ENTRY_VHSDF): Macro to parse the contents of
>       aarch64-simd-pragma-builtins.def.
>       (struct aarch64_pragma_builtins_data): Struct to hold data from
>       aarch64-simd-pragma-builtins.def.
>       (aarch64_fntype): New function to define function types of
>       intrinsics given an object of type aarch64_pragma_builtins_data.
>       (aarch64_init_pragma_builtins): New function to define pragma
>       builtins.
>       (aarch64_get_pragma_builtin): New function to get a row of
>       aarch64_pragma_builtins, given code.
>       (handle_arm_neon_h): Modify to call
>       aarch64_init_pragma_builtins.
>       (aarch64_general_check_builtin_call): Modify to check whether
>       required flag is being used for pragma builtins.
>       (aarch64_expand_pragma_builtin): New function to emit
>       instructions of pragma_builtin.
>       (aarch64_general_expand_builtin): Modify to call
>       aarch64_expand_pragma_builtin.
>       * config/aarch64/aarch64-option-extensions.def
>       (AARCH64_OPT_EXTENSION): Introduce new flag for this extension.
>       * config/aarch64/aarch64-simd.md
>       (@aarch64_<faminmax_uns_op><mode>): Instruction pattern for
>       faminmax intrinsics.
>       * config/aarch64/aarch64.h
>       (TARGET_FAMINMAX): Introduce new flag for this extension.
>       * config/aarch64/iterators.md: New iterators and unspecs.
>       * config/arm/types.md: Introduce neon_fp_aminmax<q> attributes.
>       * doc/invoke.texi: Document extension in AArch64 Options.
>       * config/aarch64/aarch64-simd-pragma-builtins.def: New file to
>       list pragma builtins.
>
> gcc/testsuite/ChangeLog:
>
>       * gcc.target/aarch64/simd/faminmax-builtins-no-flag.c: New test.
>       * gcc.target/aarch64/simd/faminmax-builtins.c: New test.
> ---
>  gcc/config/aarch64/aarch64-builtins.cc        | 123 ++++++++++++++++++
>  .../aarch64/aarch64-option-extensions.def     |   2 +
>  .../aarch64/aarch64-simd-pragma-builtins.def  |  23 ++++
>  gcc/config/aarch64/aarch64-simd.md            |  11 ++
>  gcc/config/aarch64/aarch64.h                  |   4 +
>  gcc/config/aarch64/iterators.md               |   9 ++
>  gcc/config/arm/types.md                       |   5 +
>  gcc/doc/invoke.texi                           |   2 +
>  .../aarch64/simd/faminmax-builtins-no-flag.c  |  10 ++
>  .../aarch64/simd/faminmax-builtins.c          | 115 ++++++++++++++++
>  10 files changed, 304 insertions(+)
>  create mode 100644 gcc/config/aarch64/aarch64-simd-pragma-builtins.def
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index eb878b933fe..6e64ae86c52 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -757,6 +757,18 @@ typedef struct
>  #define VAR1(T, N, MAP, FLAG, A) \
>    AARCH64_SIMD_BUILTIN_##T##_##N##A,
>  
> +#undef ENTRY
> +#define ENTRY(N, S, M, U, F) \
> +  AARCH64_##N,
> +
> +#undef ENTRY_VHSDF
> +#define ENTRY_VHSDF(NAME, SIGNATURE, UNSPEC, EXTENSIONS) \
> +  AARCH64_##NAME##_f16, \
> +  AARCH64_##NAME##q_f16, \
> +  AARCH64_##NAME##_f32, \
> +  AARCH64_##NAME##q_f32, \
> +  AARCH64_##NAME##q_f64,
> +
>  enum aarch64_builtins
>  {
>    AARCH64_BUILTIN_MIN,
> @@ -829,6 +841,10 @@ enum aarch64_builtins
>    AARCH64_RBIT,
>    AARCH64_RBITL,
>    AARCH64_RBITLL,
> +  /* Pragma builtins.  */
> +  AARCH64_PRAGMA_BUILTIN_START,
> +#include "aarch64-simd-pragma-builtins.def"
> +  AARCH64_PRAGMA_BUILTIN_END,
>    /* System register builtins.  */
>    AARCH64_RSR,
>    AARCH64_RSRP,
> @@ -947,6 +963,7 @@ const char *aarch64_scalar_builtin_types[] = {
>  
>  extern GTY(()) aarch64_simd_type_info aarch64_simd_types[];
>  
> +#undef ENTRY
>  #define ENTRY(E, M, Q, G)  \
>    {E, "__" #E, #G "__" #E, NULL_TREE, NULL_TREE, E_##M##mode, qualifier_##Q},
>  struct aarch64_simd_type_info aarch64_simd_types [] = {
> @@ -1547,6 +1564,80 @@ aarch64_init_simd_builtin_functions (bool 
> called_from_pragma)
>      }
>  }
>  
> +enum aarch64_builtin_signatures
> +{
> +  AARCH64_BUILTIN_SIGNATURE_MIN,
> +  binary,
> +  AARCH64_BUILTIN_SIGNATURE_MAX,
> +};

"binary" on its own probably isn't unique enough, and similar names
could clash with target-independent code.  One way of avoiding that
would be to use:

  AARCH64_BUILTIN_SIGNATURE_binary

and use token pasting in ENTRY to add the "AARCH64_BUILTIN_SIGNATURE_"
prefix.  But I guess a more C++ way would be to use enum classes:

enum class aarch64_builtin_signature
{
  binary,
};

which is probably a bit nicer.  (I don't think we need the min and
max values.)  Then:

> +
> +#undef ENTRY
> +#define ENTRY(N, S, M, U, F) \
> +  {#N, S, E_##M##mode, U, F},

...this would be:

  {#N, aarch64_builtin_signature::S, E_##M##mode, U, F},

> +
> +#undef ENTRY_VHSDF
> +#define ENTRY_VHSDF(NAME, SIGNATURE, UNSPEC, EXTENSIONS) \
> +  ENTRY (NAME##_f16, SIGNATURE, V4HF, UNSPEC, EXTENSIONS) \
> +  ENTRY (NAME##q_f16, SIGNATURE, V8HF, UNSPEC, EXTENSIONS) \
> +  ENTRY (NAME##_f32, SIGNATURE, V2SF, UNSPEC, EXTENSIONS) \
> +  ENTRY (NAME##q_f32, SIGNATURE, V4SF, UNSPEC, EXTENSIONS) \
> +  ENTRY (NAME##q_f64, SIGNATURE, V2DF, UNSPEC, EXTENSIONS)
> +
> +/* Initialize pragma builtins.  */
> +
> +struct aarch64_pragma_builtins_data
> +{
> +  const char *name;
> +  aarch64_builtin_signatures signature;
> +  machine_mode mode;
> +  int unspec;
> +  aarch64_feature_flags required_extensions;
> +};
> +
> +static aarch64_pragma_builtins_data aarch64_pragma_builtins[] = {
> +#include "aarch64-simd-pragma-builtins.def"
> +};
> +
> +static tree
> +aarch64_fntype (const aarch64_pragma_builtins_data &builtin_data)
> +{
> +  auto type = aarch64_simd_builtin_type (builtin_data.mode, qualifier_none);
> +  switch (builtin_data.signature)
> +    {
> +    case binary:

And similarly here.

> +      return build_function_type_list (type, type, type, NULL_TREE);
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
> [...]
> @@ -3189,6 +3287,25 @@ aarch64_expand_builtin_data_intrinsic (unsigned int 
> fcode, tree exp, rtx target)
>    return ops[0].value;
>  }
>  
> +static rtx
> +aarch64_expand_pragma_builtin (tree exp, rtx target,
> +                            const aarch64_pragma_builtins_data* builtin_data)

Formatting, should be:

                               const aarch64_pragma_builtins_data *builtin_data)

with the "*" after the space.

> +{
> +  expand_operand ops[3];
> +  auto mode = builtin_data->mode;
> +  auto op1 = expand_normal (CALL_EXPR_ARG (exp, 0));
> +  auto op2 = expand_normal (CALL_EXPR_ARG (exp, 1));
> +  create_output_operand (&ops[0], target, mode);
> +  create_input_operand (&ops[1], op1, mode);
> +  create_input_operand (&ops[2], op2, mode);
> +
> +  auto unspec = builtin_data->unspec;
> +  auto icode = code_for_aarch64 (unspec, mode);
> +  expand_insn (icode, 3, ops);
> +
> +  return target;
> +}
> +
>  /* Expand an expression EXP as fpsr or fpcr setter (depending on
>     UNSPEC) using MODE.  */
>  static void
> @@ -3369,6 +3486,11 @@ aarch64_general_expand_builtin (unsigned int fcode, 
> tree exp, rtx target,
>        && fcode <= AARCH64_RBITLL)
>      return aarch64_expand_builtin_data_intrinsic (fcode, exp, target);
>  
> +  if (auto builtin_data = aarch64_get_pragma_builtin (fcode))
> +    {
> +      return aarch64_expand_pragma_builtin (exp, target, builtin_data);
> +    }
> +

Formatting nit, should be no braces for single statements:

  if (auto builtin_data = aarch64_get_pragma_builtin (fcode))
    return aarch64_expand_pragma_builtin (exp, target, builtin_data);


>    gcc_unreachable ();
>  }
>  
> [...]
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 23c03a96371..7542c81ed91 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -9910,3 +9910,14 @@
>    "shl\\t%d0, %d1, #16"
>    [(set_attr "type" "neon_shift_imm")]
>  )
> +
> +;; faminmax
> +(define_insn "@aarch64_<faminmax_uns_op><mode>"
> +  [(set (match_operand:VHSDF 0 "register_operand" "=w")
> +     (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")
> +                    (match_operand:VHSDF 2 "register_operand" "w")]
> +                   FAMINMAX_UNS))]
> +  "TARGET_FAMINMAX"
> +  "<faminmax_uns_op>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
> +  [(set_attr "type" "neon_fp_aminmax<q>")]
> +)

Sorry for not noticing this last time, but: rather than add a new
scheduling type for these instructions, let's just assume for now
that they have the same performance characteristics as fmax and fmin.
We can tweak it later if a scheduling model needs to distinguish them.

Reusing an existing type is better because existing scheduling models
will recognise it as a known operation, rather than treating it as an
unknown operation (which is usually done conservatively).

Not that it matters for modern OoO cores, but still. :)

> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 2dfb999bea5..de14f57071a 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -457,6 +457,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE 
> ATTRIBUTE_UNUSED
>      enabled through +gcs.  */
>  #define TARGET_GCS AARCH64_HAVE_ISA (GCS)
>  
> +/*  Floating Point Absolute Maximum/Minimum extension instructions are
> +    enabled through +faminmax.  */

I realise this happens in pre-existing code too, but: there should just be
one space after "/*":

/* Floating Point Absolute Maximum/Minimum extension instructions are
   enabled through +faminmax.  */

Thanks,
Richard

> +#define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX)
> +
> [...]

Reply via email to