Re: [PATCH v1] aarch64: Add fp8 scalar types

Claudio Bantaloukas Fri, 02 Aug 2024 09:58:37 -0700


On 02/08/2024 12:17, Richard Sandiford wrote:
> Claudio Bantaloukas <claudio.bantalou...@arm.com> writes:
>> The ACLE defines a new scalar type, __mfp8. This is an opaque 8bit types that
>> can only be used by fp8 intrinsics. Additionally, the mfloat8_t type is made
>> available in arm_neon.h and arm_sve.h as an alias of the same.
>>
>> This implementation uses an INTEGER_TYPE, with precision 8 to represent
>> __mfp8. Conversions to int and other types are disabled via the
>> TARGET_INVALID_CONVERSION hook.
>> Additionally, operations that are typically available to integer types are
>> disabled via TARGET_INVALID_UNARY_OP and TARGET_INVALID_BINARY_OP hooks.
>>
>> gcc/ChangeLog:
>>
>>      * config/aarch64/aarch64-builtins.cc (aarch64_mfp8_type_node): Add node
>>      for __mfp8 type.
>>      (aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type.
>>      (aarch64_init_fp8_types): New function to initialise fp8 types and
>>      register with language backends.
>>      * config/aarch64/aarch64.cc (aarch64_invalid_conversion): Add function
>>      implementing TARGET_INVALID_CONVERSION hook that blocks conversion to
>>      and from __mfp8 type.
>>      (aarch64_invalid_unary_op): Add function implementing TARGET_UNARY_OP
>>      hook that blocks operations on __mfp8 other than &.
>>      (aarch64_invalid_binary_op): Extend TARGET_BINARY_OP hook to disallow
>>      operations on __mfp8 type.
>>      (TARGET_INVALID_CONVERSION): Add define.
>>      (TARGET_INVALID_UNARY_OP): Likewise.
>>      * config/aarch64/aarch64.h (aarch64_mfp8_type_node): Add node for __mfp8
>>      type.
>>      (aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type.
>>      * config/aarch64/arm_neon.h (mfloat8_t): Add typedef.
>>      * config/aarch64/arm_sve.h (mfloat8_t): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/aarch64/fp8_scalar_1.c: New tests in C.
>>      * gcc.target/aarch64/fp8_scalar_typecheck_1.c: Likewise.
>>      * gcc.target/aarch64/fp8_scalar_typecheck_2.C: New tests in C++.
>


Hi Richard,
thank you for the super fast review!

> C++ tests should go in g++.target instead.

Done

> I think the new type needs to be mangled explicitly, so that the
> overloads in:
> 
>    int foo(__mfp8) { return 1; }
>    int foo(unsigned char) { return 2; }
>    int bar(__mfp8 x) { return foo(x); }
> 
> are distinct.  It'd also be good to have a constexpr version of foo
> in the tests, to make sure that the right overload is chosen.

Added both regular and constexpr overloading checks.

>>
>> +static void
>> +aarch64_init_fp8_types (void)
> 
> The function should have a comment before it.
> 
Added

>> +{
>> +  aarch64_mfp8_type_node = make_node (INTEGER_TYPE);
>> +  TYPE_PRECISION (aarch64_mfp8_type_node) = 8;
>> +  TYPE_MIN_VALUE (aarch64_mfp8_type_node)
>> +      = TYPE_MIN_VALUE (unsigned_char_type_node);
>> +  TYPE_MAX_VALUE (aarch64_mfp8_type_node)
>> +      = TYPE_MAX_VALUE (unsigned_char_type_node);
> 
> If we're using the unsigned range, we should also set TYPE_UNSIGNED.
> That said...
> 
>> +  layout_type (aarch64_mfp8_type_node);
> 
> ...it looks like the code above could be replaced by:
> 
>    aarch64_mfp8_type_node = make_unsigned_type (8);
> 
> which would also give TYPE_MIN_VALUE and TYPE_MAX_VALUE the "right" types.

Done, this has reduced the function considerably, thanks!

> I was surprised that the tests worked so well with just a standard
> integer type, without having to use build_distinct_type_copy.
> But since they do, I agree we shouldn't use build_distinct_type_copy
> unless a specific reason comes up.
> 

Haven't found a specific reason to up to now.

>>   
>> +/* Implement TARGET_INVALID_CONVERSION.
>> +
>> +Return the diagnostic message when it is invalid to convert from fromtype to
>> +totype, or NULL if validity should be determined by the front end. */
> 
> The usual style is not to reiterate the hook description, since when
> that's happened in the past, the comments have become out of date
> wrt the documentation.  So just:
> 
> /* Implement TARGET_INVALID_CONVERSION.  */
> 
> should be good enough.

Done

>> +
>> +static const char *
>> +aarch64_invalid_conversion (const_tree fromtype, const_tree totype)
>> +{
>> +  /* Do not allow conversions to/from FP8.  */
>> +  bool fromtype_is_fp8
>> +      = ((fromtype) && (TYPE_MODE (fromtype) == QImode)
>> +     && (TYPE_MAIN_VARIANT (fromtype) == aarch64_mfp8_type_node));
>> +  bool totype_is_fp8
>> +      = ((totype)
>> +     && (TYPE_MODE (totype) == QImode
>> +         && TYPE_MAIN_VARIANT (totype) == aarch64_mfp8_type_node));
> 
> Did you see null fromtypes and totypes?  It doesn't look like the other
> targets have needed to handle them, and it's not clear what the correct
> behaviour would be in that case.
> 
> The QImode tests also look redundant.
> 
> Trying it locally, things seemed to work for me with:
> 
>    bool fromtype_is_fp8
>      = TYPE_MAIN_VARIANT (fromtype) == aarch64_mfp8_type_node;
>    bool totype_is_fp8
>      = TYPE_MAIN_VARIANT (totype) == aarch64_mfp8_type_node;

A previous version of this patch was not setting the minval and maxval 
of the type. This was causing narrowing tests to fail in C++ and the 
conversion checking would be called with a null fromtype with this code, 
causing an ICE

__mfp8 global_fp8{};

With minval and maxval properly set, the ICEs no longer occur. I'm 
simplifying the checks as you propose.

>>   
>> +/* Implement TARGET_INVALID_UNARY_OP.
>> +
>> +   Return the diagnostic message string if the unary operation OP is
>> +   not permitted on TYPE, NULL otherwise.  */
> 
> Similar comment about the comment here.

Done

> 
>> +
>> +static const char *
>> +aarch64_invalid_unary_op (int op, const_tree type)
>> +{
>> +  /* Reject all single-operand operations on __mfp8 except for &.  */
>> +  if ((TYPE_MODE (type) == QImode)
>> +      && (TYPE_MAIN_VARIANT (type) == aarch64_mfp8_type_node)
>> +      && (op != ADDR_EXPR))
> 
> Just:
> 
>    if (TYPE_MAIN_VARIANT (type) == aarch64_mfp8_type_node
>        && op != ADDR_EXPR)
> 
> should be enough.  (GCC style is not to add brackets around individual
> comparisons, unless they're needed or span multiple lines.)

Done

>> +    return N_ ("operation not permitted on type %<mfloat8_t%>");
>> +
>> +  /* Operation allowed.  */
>> +  return NULL;
>> +}
>> +
>>   /* Return the diagnostic message string if the binary operation OP is
>>      not permitted on TYPE1 and TYPE2, NULL otherwise.  */
>>   
>> @@ -28982,6 +29029,13 @@ aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, 
>> const_tree type1,
>>        != aarch64_sve::builtin_type_p (type2)))
>>       return N_("cannot combine GNU and SVE vectors in a binary operation");
>>   
>> +  /* Reject all 2-operand operations on __mfp8.  */
>> +  if (((TYPE_MODE (type1) == QImode)
>> +       && (TYPE_MAIN_VARIANT (type1) == aarch64_mfp8_type_node))
>> +      || ((TYPE_MODE (type2) == QImode)
>> +          && (TYPE_MAIN_VARIANT (type2) == aarch64_mfp8_type_node)))
> 
> Similarly here.

Done

> 
>> +    return N_ ("operation not permitted on type %<mfloat8_t%>");
>> +
>>     /* Operation allowed.  */
>>     return NULL;
>>   }
>> [...]
>> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
>> index e376685489d..0092314cf75 100644
>> --- a/gcc/config/aarch64/arm_neon.h
>> +++ b/gcc/config/aarch64/arm_neon.h
>> @@ -72,6 +72,8 @@ typedef __Poly16_t poly16_t;
>>   typedef __Poly64_t poly64_t;
>>   typedef __Poly128_t poly128_t;
>>   
>> +typedef __mfp8 mfloat8_t;
>> +
>>   typedef __fp16 float16_t;
>>   typedef float float32_t;
>>   typedef double float64_t;
>> diff --git a/gcc/config/aarch64/arm_sve.h b/gcc/config/aarch64/arm_sve.h
>> index aa0bd9909f9..61c440fef56 100644
>> --- a/gcc/config/aarch64/arm_sve.h
>> +++ b/gcc/config/aarch64/arm_sve.h
>> @@ -29,6 +29,7 @@
>>   #include <arm_private_fp8.h>
>>   #include <arm_bf16.h>
>>   
>> +typedef __mfp8 mfloat8_t;
>>   typedef __fp16 float16_t;
>>   typedef float float32_t;
>>   typedef double float64_t;
> 
> Ultran minor nit, but it'd be good to have a consistent separation style
> between arm_neon.h and arm_sve.h.  (Don't mind which we use.)

Separated __mfp8 in both files.

>> diff --git a/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c 
>> b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c
>> new file mode 100644
>> index 00000000000..6925653e33c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c
>> @@ -0,0 +1,108 @@
>> +/* Test the fp8 ACLE intrinsics family.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -march=armv9.4-a+fp8" } */
>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>> +
>> +#include <arm_neon.h>
>> +
>> +/*
>> +**stacktest1:
>> +**  sub     sp, sp, #16
>> +**  sxtb    w0, w0
> 
> (This would become an AND if we do switch to an unsigned type.)
Yep!

>> +**  strb    w0, \[sp, 15\]
>> +**  ldrb    w0, \[sp, 15\]
>> +**  add     sp, sp, 16
>> +**  ret
>> +*/
>> +mfloat8_t
>> +stacktest1 (mfloat8_t __a)
>> +{
>> +  volatile mfloat8_t b = __a;
>> +  return b;
>> +}
>> +
>> +/*
>> +**fp8_mov_ww:
>> +**  dup     b1, v2.b\[0\]
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_ww (void)
>> +{
>> +  register mfloat8_t x asm ("h2");
>> +  register mfloat8_t y asm ("h1");
>> +  asm volatile ("" : "=w"(x));
>> +  y = x;
>> +  asm volatile ("" ::"w"(y));
>> +}
>> +
>> +/*
>> +**fp8_mov_rw:
>> +**  dup     v1.8b, w1
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_rw (void)
>> +{
>> +  register mfloat8_t x asm ("w1");
>> +  register mfloat8_t y asm ("h1");
>> +  asm volatile ("" : "=r"(x));
>> +  y = x;
>> +  asm volatile ("" ::"w"(y));
>> +}
>> +
>> +/*
>> +**fp8_mov_wr:
>> +**  umov    w1, v1.b\[0\]
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_wr (void)
>> +{
>> +  register mfloat8_t x asm ("h1");
>> +  register mfloat8_t y asm ("w1");
>> +  asm volatile ("" : "=w"(x));
>> +  y = x;
>> +  asm volatile ("" ::"r"(y));
>> +}
>> +
>> +/*
>> +**fp8_mov_rr:
>> +**  mov     w1, w2
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_rr (void)
>> +{
>> +  register mfloat8_t x asm ("w2");
>> +  register mfloat8_t y asm ("w1");
>> +  asm volatile ("" : "=r"(x));
>> +  y = x;
>> +  asm volatile ("" ::"r"(y));
>> +}
>> +
>> +/*
>> +**fp8_mov_rm:
>> +**  strb    w2, \[x0\]
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_rm (mfloat8_t *ptr)
>> +{
>> +  register mfloat8_t x asm ("w2");
>> +  asm volatile ("" : "=r"(x));
>> +  *ptr = x;
>> +}
>> +
>> +/*
>> +**fp8_mov_mr:
>> +**  ldrb    w2, \[x0\]
>> +**  ret
>> +*/
>> +void
>> +fp8_mov_mr (mfloat8_t *ptr)
>> +{
>> +  register mfloat8_t y asm ("w2");
>> +  y = *ptr;
>> +  asm volatile ("" ::"r"(y));
>> +}
> 
> Nice tests :-)

Yay bfloat types!

> It would be good to test loads and stores for FPRs as well,
> for completeness.

> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c 
>> b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c
>> new file mode 100644
>> index 00000000000..122dc5aa2b5
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c
>> @@ -0,0 +1,329 @@
>> +/* Test that there is no conversion between ints and mfloat8_t.  */
>> +/* { dg-do assemble } */
>> +/* { dg-options "-O1 -march=armv9.4-a+fp8" } */
>> +
>> +#include <arm_neon.h>
>> +#include <stdint.h>
>> +
>> +mfloat8_t glob_fp8;
>> +
>> +int is_an_int;
>> +uint8_t is_a_uint8;
>> +int8_t is_an_int8;
>> +short is_a_short_int;
>> +float is_a_float;
>> +double is_a_double;
>> +
>> +uint8_t *uint8_ptr;
> 
> I think we should also test that:
> 
>    mfloat8_t extern_fn1(void);
>    unsigned char extern_fn1(void);
> 
>    mfloat8_t extern_fn2(void);
>    uint8_t extern_fn2(void);
> 
>    unsigned char extern_fn3(void);
>    mfloat8_t extern_fn3(void);
> 
>    uint8_t extern_fn4(void);
>    mfloat8_t extern_fn4(void);

Done, also adding a check using mfloat8_t and uint8_t in the argument.

> give errors on the second definition.  Same for C++.

Done, minus the argument overloading.

>> +  /* Unary operators.  */
>> +
>> +  +scalar0;   /* { dg-error {operation not permitted on type 'mfloat8_t'} } 
>> */
>> +  -scalar0;   /* { dg-error {operation not permitted on type 'mfloat8_t'} } 
>> */
>> +  ~scalar0;   /* { dg-error {operation not permitted on type 'mfloat8_t'} } 
>> */
>> +  !scalar0;   /* { dg-error {operation not permitted on type 'mfloat8_t'} } 
>> */
>> +  *scalar0;   /* { dg-error {invalid conversion from type 'mfloat8_t'} } */
>> +  __real scalar0; /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +  __imag scalar0; /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +  ++scalar0;          /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +  --scalar0;          /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +  scalar0++;          /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +  scalar0--;          /* { dg-error {operation not permitted on type 
>> 'mfloat8_t'} } */
>> +
>> +  /* Binary arithmetic operations.  */
>> +
>> +  scalar0 = glob_fp8 + scalar1_2; /* { dg-error {invalid conversion from 
>> type 'mfloat8_t'} } */
>> +  scalar0 = glob_fp8 + *fp8_ptr; /* { dg-error {invalid conversion from 
>> type 'mfloat8_t'} } */
>> +  scalar0 = glob_fp8
>> +        + 0.1; /* { dg-error {invalid conversion from type 'mfloat8_t'} } */
>> +  scalar0
>> +      = glob_fp8 + 0; /* { dg-error {invalid conversion from type 
>> 'mfloat8_t'} } */
>> +  scalar0
>> +      = glob_fp8
>> +    + is_a_float; /* { dg-error {invalid conversion from type 'mfloat8_t'} 
>> } */
> 
> Hmm, so all the errors for binary operations come from the usual
> integer conversions, rather than from the binary operation itself.
> That makes sense.
> 
> Could you also test fp8 op fp8 only for the other binary operations,
> including && and ||?

Done, all fail on invalid conversion from type 'mfloat8_t'

> 
> For C++, it would be good to test:
> 
>    #include <type_traits>
> 
>    static_assert(!std::is_integral<__mfp8>(), "not integral");
>    static_assert(!std::is_signed<__mfp8>(), "not signed");
>    static_assert(!std::is_unsigned<__mfp8>(), "not unsigned");
> 
> to make sure that we maintain the abstraction.  Was kind-of surprised
> that this Just Works -- didn't look into the mechanics of how it does.

Done

> Thanks,
> Richard

Thank you!
I'll ask Andrew to check and post the updated version for me as I'll be 
on holidays.

Cheers,
Claudio

Re: [PATCH v1] aarch64: Add fp8 scalar types

Reply via email to