Re: [PATCH] aarch64: Add support for AdvSIMD faminmax

Kyrylo Tkachov Fri, 02 Aug 2024 01:11:25 -0700

Hi Saurabh,

> On 1 Aug 2024, at 11:14, Saurabh Jha wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
> mandatory from Armv9.5-a. It introduces instructions for computing the
> floating point absolute maximum and minimum of the two vectors element-wise.
> 
> This patch does three things:
> 1. Introduces AdvSIMD faminmax intrinsics.
> 2. Adds code generation support for famax and famin in terms of the
>   existing operators.
> 3. Move report_missing_extension and reported_missing_extension_p to
>   make it more usable.
> 
> The intrinsics of this extension are implemented as the following
> builtin functions:
> * vamax_f16
> * vamaxq_f16
> * vamax_f32
> * vamaxq_f32
> * vamaxq_f64
> * vamin_f16
> * vaminq_f16
> * vamin_f32
> * vaminq_f32
> * vaminq_f64
> 
> For code generation, famax/famin is equivalent to first taking fabs of
> the operands and then taking fmax/fmin of the results of fabs.
>        famax/famin (a, b) = fmax/fmin (fabs (a), fabs (b))
> This is correct because NaN/Inf handling of famax/famin and fmax/fmin
> are same. We cannot use fmaxnm/fminnm here as Nan/Inf are handled
> differently in them.
>


Thank you for the explanation.


> We moved the definition of `report_missing_extension` from
> gcc/config/aarch64/aarch64-sve-builtins.cc to
> gcc/config/aarch64/aarch64-builtins.cc and its declaration to
> gcc/config/aarch64/aarch64-builtins.h. We also moved the declaration
> of `reported_missing_extension_p` from
> gcc/config/aarch64/aarch64-sve-builtins.cc
> to gcc/config/aarch64/aarch64-builtins.cc, closer to the definition of
> `report_missing_extension`. In the exsiting code structure, this leads
> to `report_missing_extension` being usable from both normal builtins
> and sve builtins.

Thanks, this looks much better.


> 
> gcc/ChangeLog:
> 
>        * config/aarch64/aarch64-builtins.cc
>        (enum aarch64_builtins): New enum values for faminmax builtins.
>        (aarch64_init_faminmax_builtins): New function to declare new builtins.
>        (handle_arm_neon_h): Modify to call aarch64_init_faminmax_builtins.
>        (aarch64_general_check_builtin_call): Modify to check whether 
> +faminmax flag is being used and printing error message if not being used.
>        (aarch64_expand_builtin_faminmax): New function to emit instructions 
> of this extension.
>        (aarch64_general_expand_builtin): Modify to call 
> aarch64_expand_builtin_faminmax.
>        (report_missing_extension): Move from 
> config/aarch64/aarch64-sve-builtins.cc.
>        * config/aarch64/aarch64-builtins.h
>        (report_missing_extension): Declaration for this function so
> that it can be used wherever this header is included.
>        (reported_missing_extension_p): Move from
> config/aarch64/aarch64-sve-builtins.cc
>        * config/aarch64/aarch64-option-extensions.def
>        (AARCH64_OPT_EXTENSION): Introduce new flag for this extension.
>        * config/aarch64/aarch64-simd.md
>        (aarch64_<faminmax><mode>): Introduce instruction pattern for this 
> extension.
>        * config/aarch64/aarch64-sve-builtins.cc
>        (reported_missing_extension_p): Move to 
> config/aarch64/aarch64-builtins.cc
>        (report_missing_extension): Move to
> config/aarch64/aarch64-builtins.cc.
>        * config/aarch64/aarch64.h
>        (TARGET_FAMINMAX): Introduce new flag for this extension.
>        * config/aarch64/iterators.md: Introduce new iterators for this 
> extension.
>        * config/arm/types.md: Introduce neon_fp_aminmax<q> attributes.
>        * doc/invoke.texi: Document extension in AArch64 Options.
> 
> gcc/testsuite/ChangeLog:
> 
>        * gcc.target/aarch64/simd/faminmax-builtins-no-flag.c: New test.
>        * gcc.target/aarch64/simd/faminmax-builtins.c: New test.
>        * gcc.target/aarch64/simd/faminmax-codegen-no-flag.c: New test.
>        * gcc.target/aarch64/simd/faminmax-codegen.c: New test.

The aarch64 backend parts look good, but I think the tests could be improved. 
Sorry for missing it out in the first review.


> 
> ---
> Hi,
> 
> Regression tested for aarch64-none-linux-gnu and found no regressions.
> 
> This patch is a revised version of an earlier patch
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657914.html but has
> more scope than that. That's why I didn't add "v2" in the subject line.
> 
> Ok for master? I don't have commit access so can someone please commit
> on my behalf?
> 
> Regards,
> Saurabh
> ---
> gcc/config/aarch64/aarch64-builtins.cc        | 173 +++++++++++++++++-
> gcc/config/aarch64/aarch64-builtins.h         |   5 +-
> .../aarch64/aarch64-option-extensions.def     |   2 +
> gcc/config/aarch64/aarch64-simd.md            |  12 ++
> gcc/config/aarch64/aarch64-sve-builtins.cc    |  22 ---
> gcc/config/aarch64/aarch64.h                  |   4 +
> gcc/config/aarch64/iterators.md               |   8 +
> gcc/config/arm/types.md                       |   6 +
> gcc/doc/invoke.texi                           |   2 +
> .../aarch64/simd/faminmax-builtins-no-flag.c  |  10 +
> .../aarch64/simd/faminmax-builtins.c          |  75 ++++++++
> .../aarch64/simd/faminmax-codegen-no-flag.c   |  54 ++++++
> .../aarch64/simd/faminmax-codegen.c           | 104 +++++++++++
> 13 files changed, 445 insertions(+), 32 deletions(-)
> create mode 100644 
> gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c
> create mode 100644 
> gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c
> 


diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c 
b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c
new file mode 100644
index 00000000000..63ed1508c23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c
@@ -0,0 +1,10 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-march=armv9-a" } */
+
+#include "arm_neon.h"
+
+void
+test (float32x4_t a, float32x4_t b)
+{
+  vamaxq_f32 (a, b); /* { dg-error {ACLE function 'vamaxq_f32' requires ISA 
extension 'faminmax'} } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c 
b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c
new file mode 100644
index 00000000000..f2b5bafb81c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c
@@ -0,0 +1,75 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-march=armv9-a+faminmax" } */
+
+#include "arm_neon.h"
+
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  return vamax_f16 (a, b);
+}
+
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  return vamaxq_f16 (a, b);
+}
+
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  return vamax_f32 (a, b);
+}
+
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  return vamaxq_f32 (a, b);
+}
+
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  return vamaxq_f64 (a, b);
+}
+
+float16x4_t
+test_vamin_f16 (float16x4_t a, float16x4_t b)
+{
+  return vamin_f16 (a, b);
+}
+
+float16x8_t
+test_vaminq_f16 (float16x8_t a, float16x8_t b)
+{
+  return vaminq_f16 (a, b);
+}
+
+float32x2_t
+test_vamin_f32 (float32x2_t a, float32x2_t b)
+{
+  return vamin_f32 (a, b);
+}
+
+float32x4_t
+test_vaminq_f32 (float32x4_t a, float32x4_t b)
+{
+  return vaminq_f32 (a, b);
+}
+
+float64x2_t
+test_vaminq_f64 (float64x2_t a, float64x2_t b)
+{
+  return vaminq_f64 (a, b);
+}
+
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.4h, v[0-9]+.4h, 
v[0-9]+.4h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.8h, v[0-9]+.8h, 
v[0-9]+.8h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.2s, v[0-9]+.2s, 
v[0-9]+.2s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.4s, v[0-9]+.4s, 
v[0-9]+.4s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.2d, v[0-9]+.2d, 
v[0-9]+.2d} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tv[0-9]+.4h, v[0-9]+.4h, 
v[0-9]+.4h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tv[0-9]+.8h, v[0-9]+.8h, 
v[0-9]+.8h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tv[0-9]+.2s, v[0-9]+.2s, 
v[0-9]+.2s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tv[0-9]+.4s, v[0-9]+.4s, 
v[0-9]+.4s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tv[0-9]+.2d, v[0-9]+.2d, 
v[0-9]+.2d} 1 } } */

We should use the check-function-bodies infrastructure to match the exact 
assembly for each of the test functions. You can grep around the 
gcc.target/aarch64 directory for examples of how to use that.

It would also be good to have some tests for generating these instructions from 
straight C code using __builtin_fmax/fmin and __builtin_fabs to make sure we’re 
not missing out on general codegen improvements from this work.

Thanks,
Kyrill

Re: [PATCH] aarch64: Add support for AdvSIMD faminmax

Reply via email to