> On 20 Aug 2024, at 20:10, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> The testcase contains a VNx2QImode pseudo that is live across a call
> and that cannot be allocated a call-preserved register.  LRA quite
> reasonably tried to save it before the call and restore it afterwards.
> Unfortunately, the target told it to do that in SImode, even though
> punning between SImode and VNx2QImode is disallowed by both
> TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.
> 
> The natural class to use for SImode is GENERAL_REGS, so this led
> to an unsalvageable situation in which we had:
> 
>  (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))
> 
> where A needed GENERAL_REGS and B needed FP_REGS.  We therefore ended
> up in a reload loop.
> 
> The hooks above should ensure that this situation can never occur
> for incoming subregs.  It only happened here because the target
> explicitly forced it.
> 
> The decision to use SImode for modes smaller than 4 bytes dates
> back to the beginning of the port, before 16-bit floating-point
> modes existed.  I'm not sure whether promoting to SImode really
> makes sense for any FPR, but that's a separate performance/QoI
> discussion.  For now, this patch just disallows using SImode
> when it is wrong for correctness reasons, since that should be
> safer to backport.
> 
> Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll leave
> a day or so for comments before pushing.

The explanation makes sense to me FWIW.
Thanks for fixing this.
Kyrill

> 
> Richard
> 
> 
> gcc/
>        PR testsuite/116238
>        * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
>        Only return SImode if we can convert to and from it.
> 
> gcc/testsuite/
>        PR testsuite/116238
>        * gcc.target/aarch64/sve/pr116238.c: New test.
> ---
> gcc/config/aarch64/aarch64.cc                   |  7 ++++---
> gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index bfd7bcdef7c..4e312c43576 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -2521,10 +2521,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, 
> unsigned,
>      unnecessarily significant.  */
>   if (PR_REGNUM_P (regno))
>     return mode;
> -  if (known_ge (GET_MODE_SIZE (mode), 4))
> -    return mode;
> -  else
> +  if (known_lt (GET_MODE_SIZE (mode), 4)
> +      && REG_CAN_CHANGE_MODE_P (regno, mode, SImode)
> +      && REG_CAN_CHANGE_MODE_P (regno, SImode, mode))
>     return SImode;
> +  return mode;
> }
> 
> /* Return true if I's bits are consecutive ones from the MSB.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> new file mode 100644
> index 00000000000..fe66b198107
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> @@ -0,0 +1,13 @@
> +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */
> +
> +void foo();
> +typedef unsigned char v2qi __attribute__((vector_size(2)));
> +void f(v2qi *ptr)
> +{
> +  v2qi x = *ptr;
> +  asm volatile ("" :: "w" (x));
> +  asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
> +  foo();
> +  asm volatile ("" :: "w" (x));
> +  *ptr = x;
> +}
> --
> 2.25.1
> 

Reply via email to