On 22/07/2024 10:51, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 22 Jul 2024, at 11:30, Claudio Bantaloukas <claudio.bantalou...@arm.com> >> wrote: >> >> External email: Use caution opening links or attachments >> >> >> Unlike most system registers, fpmr can be heavily written to in code that >> exercises the fp8 functionality. That is because every fp8 instrinsic call >> can potentially change the value of fpmr. >> Rather than just use a an unspec, we treat the fpmr system register like >> all other registers and use a move operation to read and write to it. >> >> We introduce a new class of moveable system registers that, currently, >> only accepts fpmr and a new constraint, Umv, that allows us to >> selectively use mrs and msr instructions when expanding rtl for them. >> Given that there is code that depends on "real" registers coming before >> "fake" ones, we introduce a new constant FPM_REGNUM that uses an >> existing value and renumber registers below that. >> This requires us to update the bitmaps that describe which registers >> belong to each register class. >> > > This approach sounds reasonable to me. Some comments inline below. >
-----8<----- >> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md >> index 94ff0eefa77..b654c0b9bb8 100644 >> --- a/gcc/config/aarch64/aarch64.md >> +++ b/gcc/config/aarch64/aarch64.md >> @@ -107,10 +107,14 @@ (define_constants >> (P14_REGNUM 82) >> (P15_REGNUM 83) >> (LAST_SAVED_REGNUM 83) >> - (FFR_REGNUM 84) >> + >> + ;; Floating Point Mode Register, used in FP8 insns. >> + (FPM_REGNUM 84) >> + > > Why not define FPM_REGNUM... > >> + (FFR_REGNUM 85) >> ;; "FFR token": a fake register used for representing the scheduling >> ;; restrictions on FFR-related operations. >> - (FFRT_REGNUM 85) >> + (FFRT_REGNUM 86) >> >> ;; ---------------------------------------------------------------- >> ;; Fake registers >> @@ -122,17 +126,17 @@ (define_constants >> ;; ABI-related lowering is needed. These placeholders read and >> ;; write this register. Instructions that depend on the lowering >> ;; read the register. >> - (LOWERING_REGNUM 86) >> + (LOWERING_REGNUM 87) >> >> ;; Represents the contents of the current function's TPIDR2 block, >> ;; in abstract form. >> - (TPIDR2_BLOCK_REGNUM 87) >> + (TPIDR2_BLOCK_REGNUM 88) >> >> ;; Holds the value that the current function wants PSTATE.ZA to be. >> ;; The actual value can sometimes vary, because it does not track >> ;; changes to PSTATE.ZA that happen during a lazy save and restore. >> ;; Those effects are instead tracked by ZA_SAVED_REGNUM. >> - (SME_STATE_REGNUM 88) >> + (SME_STATE_REGNUM 89) >> >> ;; Instructions write to this register if they set TPIDR2_EL0 to a >> ;; well-defined value. Instructions read from the register if they >> @@ -140,14 +144,14 @@ (define_constants >> ;; >> ;; The register does not model the architected TPIDR2_ELO, just the >> ;; current function's management of it. >> - (TPIDR2_SETUP_REGNUM 89) >> + (TPIDR2_SETUP_REGNUM 90) >> >> ;; Represents the property "has an incoming lazy save been committed?". >> - (ZA_FREE_REGNUM 90) >> + (ZA_FREE_REGNUM 91) >> >> ;; Represents the property "are the current function's ZA contents >> ;; stored in the lazy save buffer, rather than in ZA itself?". >> - (ZA_SAVED_REGNUM 91) >> + (ZA_SAVED_REGNUM 92) >> >> ;; Represents the contents of the current function's ZA state in >> ;; abstract form. At various times in the function, these contents >> @@ -155,10 +159,10 @@ (define_constants >> ;; >> ;; The contents persist even when the architected ZA is off. Private-ZA >> ;; functions have no effect on its contents. >> - (ZA_REGNUM 92) >> + (ZA_REGNUM 93) >> >> ;; Similarly represents the contents of the current function's ZT0 >> state. >> - (ZT0_REGNUM 93) >> + (ZT0_REGNUM 94) >> > > …. Here as 95. Then you’d avoid having to update all the other regnums to new > values. I don’t think we aim to have these register names in alphabetical > order... Hi Kyryl, thank you for the lightning fast review. The registers are split into two contiguous regions, "non-fake" and "fake", and I notice that, in aarch64.h #define FIRST_PSEUDO_REGISTER (LAST_FAKE_REGNUM + 1) I admit I don't know the difference between a fake and a pseudo register (the words are synonyms to me!) but it seems to me that the code is creating these regions on purpose and maintaining their contiguity is useful. > > >> (FIRST_FAKE_REGNUM LOWERING_REGNUM) >> (LAST_FAKE_REGNUM ZT0_REGNUM) -----8<----- >> diff --git a/gcc/testsuite/gcc.target/aarch64/acle/fp8.c >> b/gcc/testsuite/gcc.target/aarch64/acle/fp8.c >> index b774f28c9f0..10fd128d8f9 100644 >> --- a/gcc/testsuite/gcc.target/aarch64/acle/fp8.c >> +++ b/gcc/testsuite/gcc.target/aarch64/acle/fp8.c >> @@ -1,14 +1,14 @@ >> /* Test the fp8 ACLE intrinsics family. */ >> /* { dg-do compile } */ >> /* { dg-options "-O1 -march=armv8-a" } */ >> -/* { dg-final { check-function-bodies "**" "" } } */ >> +/* { dg-final { check-function-bodies "**" "" "" } } */ >> >> #ifdef __ARM_FEATURE_FP8 >> #error "__ARM_FEATURE_FP8 feature macro defined." >> #endif >> >> #pragma GCC push_options >> -#pragma GCC target ("arch=armv9.4-a+fp8") >> +#pragma GCC target("arch=armv9.4-a+fp8") >> > > Some unnecessary whitespace changes? > Anyway, this feature macro test should be at the end of the FP8 support as > discussed in patch 1/3. > Thanks, > Kyrill > Thanks for catching, I'll fix in the next iteration. Regarding defines, I'm asking around what the "right thing" is as I'm new to ACLE wording. If the helper macros to set fpmr can be unconditionally defined, then there's no problem in adding the feature macro for last. If we want them to be defined only when the user adds +fp8, then the feature macro has to come in first or another method found. Cheers, Claudio