> On 20 Aug 2024, at 20:10, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> External email: Use caution opening links or attachments
>
>
> The testcase contains a VNx2QImode pseudo that is live across a call
> and that cannot be allocated a call-preserved register. LRA quite
> reasonably tried to save it before the call and restore it afterwards.
> Unfortunately, the target told it to do that in SImode, even though
> punning between SImode and VNx2QImode is disallowed by both
> TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.
>
> The natural class to use for SImode is GENERAL_REGS, so this led
> to an unsalvageable situation in which we had:
>
> (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))
>
> where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended
> up in a reload loop.
>
> The hooks above should ensure that this situation can never occur
> for incoming subregs. It only happened here because the target
> explicitly forced it.
>
> The decision to use SImode for modes smaller than 4 bytes dates
> back to the beginning of the port, before 16-bit floating-point
> modes existed. I'm not sure whether promoting to SImode really
> makes sense for any FPR, but that's a separate performance/QoI
> discussion. For now, this patch just disallows using SImode
> when it is wrong for correctness reasons, since that should be
> safer to backport.
>
> Bootstrapped & regression-tested on aarch64-linux-gnu. I'll leave
> a day or so for comments before pushing.
The explanation makes sense to me FWIW.
Thanks for fixing this.
Kyrill
>
> Richard
>
>
> gcc/
> PR testsuite/116238
> * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
> Only return SImode if we can convert to and from it.
>
> gcc/testsuite/
> PR testsuite/116238
> * gcc.target/aarch64/sve/pr116238.c: New test.
> ---
> gcc/config/aarch64/aarch64.cc | 7 ++++---
> gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index bfd7bcdef7c..4e312c43576 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -2521,10 +2521,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno,
> unsigned,
> unnecessarily significant. */
> if (PR_REGNUM_P (regno))
> return mode;
> - if (known_ge (GET_MODE_SIZE (mode), 4))
> - return mode;
> - else
> + if (known_lt (GET_MODE_SIZE (mode), 4)
> + && REG_CAN_CHANGE_MODE_P (regno, mode, SImode)
> + && REG_CAN_CHANGE_MODE_P (regno, SImode, mode))
> return SImode;
> + return mode;
> }
>
> /* Return true if I's bits are consecutive ones from the MSB. */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> new file mode 100644
> index 00000000000..fe66b198107
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> @@ -0,0 +1,13 @@
> +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */
> +
> +void foo();
> +typedef unsigned char v2qi __attribute__((vector_size(2)));
> +void f(v2qi *ptr)
> +{
> + v2qi x = *ptr;
> + asm volatile ("" :: "w" (x));
> + asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
> + foo();
> + asm volatile ("" :: "w" (x));
> + *ptr = x;
> +}
> --
> 2.25.1
>