On 02/08/2024 12:17, Richard Sandiford wrote: > Claudio Bantaloukas <claudio.bantalou...@arm.com> writes: >> The ACLE defines a new scalar type, __mfp8. This is an opaque 8bit types that >> can only be used by fp8 intrinsics. Additionally, the mfloat8_t type is made >> available in arm_neon.h and arm_sve.h as an alias of the same. >> >> This implementation uses an INTEGER_TYPE, with precision 8 to represent >> __mfp8. Conversions to int and other types are disabled via the >> TARGET_INVALID_CONVERSION hook. >> Additionally, operations that are typically available to integer types are >> disabled via TARGET_INVALID_UNARY_OP and TARGET_INVALID_BINARY_OP hooks. >> >> gcc/ChangeLog: >> >> * config/aarch64/aarch64-builtins.cc (aarch64_mfp8_type_node): Add node >> for __mfp8 type. >> (aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type. >> (aarch64_init_fp8_types): New function to initialise fp8 types and >> register with language backends. >> * config/aarch64/aarch64.cc (aarch64_invalid_conversion): Add function >> implementing TARGET_INVALID_CONVERSION hook that blocks conversion to >> and from __mfp8 type. >> (aarch64_invalid_unary_op): Add function implementing TARGET_UNARY_OP >> hook that blocks operations on __mfp8 other than &. >> (aarch64_invalid_binary_op): Extend TARGET_BINARY_OP hook to disallow >> operations on __mfp8 type. >> (TARGET_INVALID_CONVERSION): Add define. >> (TARGET_INVALID_UNARY_OP): Likewise. >> * config/aarch64/aarch64.h (aarch64_mfp8_type_node): Add node for __mfp8 >> type. >> (aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type. >> * config/aarch64/arm_neon.h (mfloat8_t): Add typedef. >> * config/aarch64/arm_sve.h (mfloat8_t): Likewise. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/aarch64/fp8_scalar_1.c: New tests in C. >> * gcc.target/aarch64/fp8_scalar_typecheck_1.c: Likewise. >> * gcc.target/aarch64/fp8_scalar_typecheck_2.C: New tests in C++. >
Hi Richard, thank you for the super fast review! > C++ tests should go in g++.target instead. Done > I think the new type needs to be mangled explicitly, so that the > overloads in: > > int foo(__mfp8) { return 1; } > int foo(unsigned char) { return 2; } > int bar(__mfp8 x) { return foo(x); } > > are distinct. It'd also be good to have a constexpr version of foo > in the tests, to make sure that the right overload is chosen. Added both regular and constexpr overloading checks. >> >> +static void >> +aarch64_init_fp8_types (void) > > The function should have a comment before it. > Added >> +{ >> + aarch64_mfp8_type_node = make_node (INTEGER_TYPE); >> + TYPE_PRECISION (aarch64_mfp8_type_node) = 8; >> + TYPE_MIN_VALUE (aarch64_mfp8_type_node) >> + = TYPE_MIN_VALUE (unsigned_char_type_node); >> + TYPE_MAX_VALUE (aarch64_mfp8_type_node) >> + = TYPE_MAX_VALUE (unsigned_char_type_node); > > If we're using the unsigned range, we should also set TYPE_UNSIGNED. > That said... > >> + layout_type (aarch64_mfp8_type_node); > > ...it looks like the code above could be replaced by: > > aarch64_mfp8_type_node = make_unsigned_type (8); > > which would also give TYPE_MIN_VALUE and TYPE_MAX_VALUE the "right" types. Done, this has reduced the function considerably, thanks! > I was surprised that the tests worked so well with just a standard > integer type, without having to use build_distinct_type_copy. > But since they do, I agree we shouldn't use build_distinct_type_copy > unless a specific reason comes up. > Haven't found a specific reason to up to now. >> >> +/* Implement TARGET_INVALID_CONVERSION. >> + >> +Return the diagnostic message when it is invalid to convert from fromtype to >> +totype, or NULL if validity should be determined by the front end. */ > > The usual style is not to reiterate the hook description, since when > that's happened in the past, the comments have become out of date > wrt the documentation. So just: > > /* Implement TARGET_INVALID_CONVERSION. */ > > should be good enough. Done >> + >> +static const char * >> +aarch64_invalid_conversion (const_tree fromtype, const_tree totype) >> +{ >> + /* Do not allow conversions to/from FP8. */ >> + bool fromtype_is_fp8 >> + = ((fromtype) && (TYPE_MODE (fromtype) == QImode) >> + && (TYPE_MAIN_VARIANT (fromtype) == aarch64_mfp8_type_node)); >> + bool totype_is_fp8 >> + = ((totype) >> + && (TYPE_MODE (totype) == QImode >> + && TYPE_MAIN_VARIANT (totype) == aarch64_mfp8_type_node)); > > Did you see null fromtypes and totypes? It doesn't look like the other > targets have needed to handle them, and it's not clear what the correct > behaviour would be in that case. > > The QImode tests also look redundant. > > Trying it locally, things seemed to work for me with: > > bool fromtype_is_fp8 > = TYPE_MAIN_VARIANT (fromtype) == aarch64_mfp8_type_node; > bool totype_is_fp8 > = TYPE_MAIN_VARIANT (totype) == aarch64_mfp8_type_node; A previous version of this patch was not setting the minval and maxval of the type. This was causing narrowing tests to fail in C++ and the conversion checking would be called with a null fromtype with this code, causing an ICE __mfp8 global_fp8{}; With minval and maxval properly set, the ICEs no longer occur. I'm simplifying the checks as you propose. >> >> +/* Implement TARGET_INVALID_UNARY_OP. >> + >> + Return the diagnostic message string if the unary operation OP is >> + not permitted on TYPE, NULL otherwise. */ > > Similar comment about the comment here. Done > >> + >> +static const char * >> +aarch64_invalid_unary_op (int op, const_tree type) >> +{ >> + /* Reject all single-operand operations on __mfp8 except for &. */ >> + if ((TYPE_MODE (type) == QImode) >> + && (TYPE_MAIN_VARIANT (type) == aarch64_mfp8_type_node) >> + && (op != ADDR_EXPR)) > > Just: > > if (TYPE_MAIN_VARIANT (type) == aarch64_mfp8_type_node > && op != ADDR_EXPR) > > should be enough. (GCC style is not to add brackets around individual > comparisons, unless they're needed or span multiple lines.) Done >> + return N_ ("operation not permitted on type %<mfloat8_t%>"); >> + >> + /* Operation allowed. */ >> + return NULL; >> +} >> + >> /* Return the diagnostic message string if the binary operation OP is >> not permitted on TYPE1 and TYPE2, NULL otherwise. */ >> >> @@ -28982,6 +29029,13 @@ aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, >> const_tree type1, >> != aarch64_sve::builtin_type_p (type2))) >> return N_("cannot combine GNU and SVE vectors in a binary operation"); >> >> + /* Reject all 2-operand operations on __mfp8. */ >> + if (((TYPE_MODE (type1) == QImode) >> + && (TYPE_MAIN_VARIANT (type1) == aarch64_mfp8_type_node)) >> + || ((TYPE_MODE (type2) == QImode) >> + && (TYPE_MAIN_VARIANT (type2) == aarch64_mfp8_type_node))) > > Similarly here. Done > >> + return N_ ("operation not permitted on type %<mfloat8_t%>"); >> + >> /* Operation allowed. */ >> return NULL; >> } >> [...] >> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h >> index e376685489d..0092314cf75 100644 >> --- a/gcc/config/aarch64/arm_neon.h >> +++ b/gcc/config/aarch64/arm_neon.h >> @@ -72,6 +72,8 @@ typedef __Poly16_t poly16_t; >> typedef __Poly64_t poly64_t; >> typedef __Poly128_t poly128_t; >> >> +typedef __mfp8 mfloat8_t; >> + >> typedef __fp16 float16_t; >> typedef float float32_t; >> typedef double float64_t; >> diff --git a/gcc/config/aarch64/arm_sve.h b/gcc/config/aarch64/arm_sve.h >> index aa0bd9909f9..61c440fef56 100644 >> --- a/gcc/config/aarch64/arm_sve.h >> +++ b/gcc/config/aarch64/arm_sve.h >> @@ -29,6 +29,7 @@ >> #include <arm_private_fp8.h> >> #include <arm_bf16.h> >> >> +typedef __mfp8 mfloat8_t; >> typedef __fp16 float16_t; >> typedef float float32_t; >> typedef double float64_t; > > Ultran minor nit, but it'd be good to have a consistent separation style > between arm_neon.h and arm_sve.h. (Don't mind which we use.) Separated __mfp8 in both files. >> diff --git a/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c >> b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c >> new file mode 100644 >> index 00000000000..6925653e33c >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_1.c >> @@ -0,0 +1,108 @@ >> +/* Test the fp8 ACLE intrinsics family. */ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -march=armv9.4-a+fp8" } */ >> +/* { dg-final { check-function-bodies "**" "" "" } } */ >> + >> +#include <arm_neon.h> >> + >> +/* >> +**stacktest1: >> +** sub sp, sp, #16 >> +** sxtb w0, w0 > > (This would become an AND if we do switch to an unsigned type.) Yep! >> +** strb w0, \[sp, 15\] >> +** ldrb w0, \[sp, 15\] >> +** add sp, sp, 16 >> +** ret >> +*/ >> +mfloat8_t >> +stacktest1 (mfloat8_t __a) >> +{ >> + volatile mfloat8_t b = __a; >> + return b; >> +} >> + >> +/* >> +**fp8_mov_ww: >> +** dup b1, v2.b\[0\] >> +** ret >> +*/ >> +void >> +fp8_mov_ww (void) >> +{ >> + register mfloat8_t x asm ("h2"); >> + register mfloat8_t y asm ("h1"); >> + asm volatile ("" : "=w"(x)); >> + y = x; >> + asm volatile ("" ::"w"(y)); >> +} >> + >> +/* >> +**fp8_mov_rw: >> +** dup v1.8b, w1 >> +** ret >> +*/ >> +void >> +fp8_mov_rw (void) >> +{ >> + register mfloat8_t x asm ("w1"); >> + register mfloat8_t y asm ("h1"); >> + asm volatile ("" : "=r"(x)); >> + y = x; >> + asm volatile ("" ::"w"(y)); >> +} >> + >> +/* >> +**fp8_mov_wr: >> +** umov w1, v1.b\[0\] >> +** ret >> +*/ >> +void >> +fp8_mov_wr (void) >> +{ >> + register mfloat8_t x asm ("h1"); >> + register mfloat8_t y asm ("w1"); >> + asm volatile ("" : "=w"(x)); >> + y = x; >> + asm volatile ("" ::"r"(y)); >> +} >> + >> +/* >> +**fp8_mov_rr: >> +** mov w1, w2 >> +** ret >> +*/ >> +void >> +fp8_mov_rr (void) >> +{ >> + register mfloat8_t x asm ("w2"); >> + register mfloat8_t y asm ("w1"); >> + asm volatile ("" : "=r"(x)); >> + y = x; >> + asm volatile ("" ::"r"(y)); >> +} >> + >> +/* >> +**fp8_mov_rm: >> +** strb w2, \[x0\] >> +** ret >> +*/ >> +void >> +fp8_mov_rm (mfloat8_t *ptr) >> +{ >> + register mfloat8_t x asm ("w2"); >> + asm volatile ("" : "=r"(x)); >> + *ptr = x; >> +} >> + >> +/* >> +**fp8_mov_mr: >> +** ldrb w2, \[x0\] >> +** ret >> +*/ >> +void >> +fp8_mov_mr (mfloat8_t *ptr) >> +{ >> + register mfloat8_t y asm ("w2"); >> + y = *ptr; >> + asm volatile ("" ::"r"(y)); >> +} > > Nice tests :-) Yay bfloat types! > It would be good to test loads and stores for FPRs as well, > for completeness. > >> diff --git a/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c >> b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c >> new file mode 100644 >> index 00000000000..122dc5aa2b5 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/fp8_scalar_typecheck_1.c >> @@ -0,0 +1,329 @@ >> +/* Test that there is no conversion between ints and mfloat8_t. */ >> +/* { dg-do assemble } */ >> +/* { dg-options "-O1 -march=armv9.4-a+fp8" } */ >> + >> +#include <arm_neon.h> >> +#include <stdint.h> >> + >> +mfloat8_t glob_fp8; >> + >> +int is_an_int; >> +uint8_t is_a_uint8; >> +int8_t is_an_int8; >> +short is_a_short_int; >> +float is_a_float; >> +double is_a_double; >> + >> +uint8_t *uint8_ptr; > > I think we should also test that: > > mfloat8_t extern_fn1(void); > unsigned char extern_fn1(void); > > mfloat8_t extern_fn2(void); > uint8_t extern_fn2(void); > > unsigned char extern_fn3(void); > mfloat8_t extern_fn3(void); > > uint8_t extern_fn4(void); > mfloat8_t extern_fn4(void); Done, also adding a check using mfloat8_t and uint8_t in the argument. > give errors on the second definition. Same for C++. Done, minus the argument overloading. >> + /* Unary operators. */ >> + >> + +scalar0; /* { dg-error {operation not permitted on type 'mfloat8_t'} } >> */ >> + -scalar0; /* { dg-error {operation not permitted on type 'mfloat8_t'} } >> */ >> + ~scalar0; /* { dg-error {operation not permitted on type 'mfloat8_t'} } >> */ >> + !scalar0; /* { dg-error {operation not permitted on type 'mfloat8_t'} } >> */ >> + *scalar0; /* { dg-error {invalid conversion from type 'mfloat8_t'} } */ >> + __real scalar0; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + __imag scalar0; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + ++scalar0; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + --scalar0; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + scalar0++; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + scalar0--; /* { dg-error {operation not permitted on type >> 'mfloat8_t'} } */ >> + >> + /* Binary arithmetic operations. */ >> + >> + scalar0 = glob_fp8 + scalar1_2; /* { dg-error {invalid conversion from >> type 'mfloat8_t'} } */ >> + scalar0 = glob_fp8 + *fp8_ptr; /* { dg-error {invalid conversion from >> type 'mfloat8_t'} } */ >> + scalar0 = glob_fp8 >> + + 0.1; /* { dg-error {invalid conversion from type 'mfloat8_t'} } */ >> + scalar0 >> + = glob_fp8 + 0; /* { dg-error {invalid conversion from type >> 'mfloat8_t'} } */ >> + scalar0 >> + = glob_fp8 >> + + is_a_float; /* { dg-error {invalid conversion from type 'mfloat8_t'} >> } */ > > Hmm, so all the errors for binary operations come from the usual > integer conversions, rather than from the binary operation itself. > That makes sense. > > Could you also test fp8 op fp8 only for the other binary operations, > including && and ||? Done, all fail on invalid conversion from type 'mfloat8_t' > > For C++, it would be good to test: > > #include <type_traits> > > static_assert(!std::is_integral<__mfp8>(), "not integral"); > static_assert(!std::is_signed<__mfp8>(), "not signed"); > static_assert(!std::is_unsigned<__mfp8>(), "not unsigned"); > > to make sure that we maintain the abstraction. Was kind-of surprised > that this Just Works -- didn't look into the mechanics of how it does. Done > Thanks, > Richard Thank you! I'll ask Andrew to check and post the updated version for me as I'll be on holidays. Cheers, Claudio