"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes:
> On 20/01/2022 09:14, Christophe Lyon wrote:
>>
>>
>> On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches 
>> <gcc-patches@gcc.gnu.org> wrote:
>>
>>     Hi Christophe,
>>
>>     On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
>>     > At some point during the development of this patch series, it appeared
>>     > that in some cases the register allocator wants “VPR or general”
>>     > rather than “VPR or general or FP” (which is the same thing as
>>     > ALL_REGS).  The series does not seem to require this anymore, but it
>>     > seems to be a good thing to do anyway, to give the register allocator
>>     > more freedom.
>>     Not sure I fully understand this, but I guess it creates an extra class
>>     the register allocator can use to group things that can go into VPR or
>>     general reg?
>>     >
>>     > CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a
>>     > regression in gcc.dg/stack-usage-1.c when compiled with -mthumb
>>     > -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp.
>>     I have not looked into this failure, but ...
>>     >
>>     > 2022-01-13  Christophe Lyon  <christophe.l...@foss.st.com>
>>     >
>>     >       gcc/
>>     >       * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>     >       (REG_CLASS_NAMES): Likewise.
>>     >       (REG_CLASS_CONTENTS): Likewise.
>>     >       (CLASS_MAX_NREGS): Handle VPR.
>>     >       * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR.
>>     >
>>     > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>     > index bb75921f32d..c3559ca8703 100644
>>     > --- a/gcc/config/arm/arm.c
>>     > +++ b/gcc/config/arm/arm.c
>>     > @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream)
>>     >   static unsigned int
>>     >   arm_hard_regno_nregs (unsigned int regno, machine_mode mode)
>>     >   {
>>     > +  if (IS_VPR_REGNUM (regno))
>>     > +    return CEIL (GET_MODE_SIZE (mode), 2);
>>     When do we ever want to use more than 1 register for VPR?
>>
>>
>> That was tricky.
>> Richard Sandiford helped me analyze the problem, I guess I can quote him:
>>
>> RS> I think the problem is a combination of a few things:
>> RS>
>> RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves
>> RS>     to or from the VPR_REG class get the maximum cost.
>> RS>
>> RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that
>> RS>    VPR is big enough to hold SImode.
>> RS>
>> RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough
>> RS>     to hold a mode M, IRA ensures that move costs for M involving C1
>> RS>     are >= move costs for M involving C2.
>> RS>
>> RS> (1) is correct but (2) isn't.  IMO (3) is dubious: the trigger should
>> RS> be whether C2 is actually allowed to hold M, not whether C2 is big enough
>> RS> to hold M.  However, changing that is likely to cause problems elsewhere,
>> RS> and could lead to classes like GENERAL_AND_FP_REGS being used when
>> RS> FP_REGS are disabled (which might be confusing).
>> RS>
>
> I understand everything up until here.
>
>> RS> “Fixing” (2) using:
>> RS>
>> RS>  CEIL (GET_MODE_SIZE (mode), 2)
> I was wondering why not just return '1' for VPR_REGNUM, rather than use
> the fact that the mode-size we use for VPR is 2 bytes, so dividing it by 2
> makes 1. Unless we ever decide to use a larger mode for VPR, maybe
> that's what this is trying to address? I can't imagine we would ever
> need to though since for MVE there is only one VPR register and it is
> always 16 bits. Just feels overly complicated to me.

For context, that's what the first version did, and is what led to
the reload failure.  The above is trying to explain why returning
1 doesn't work in practice.

To put (2) a slightly different way: if the port says VPR_REGNUM takes
1 register regardless of the mode passed in, the port is effectively
saying that VPR (and thus VPR_REGNUM) has enough bits to hold *any* mode
passed in (SImode, DImode, etc.).  It actually makes VPR seem bigger
than a general register.

In the particular case of the reload failure, returning 1 effectively
tells the RA that VPR is big enough to hold SImode, but that the port is
nevertheless choosing not to allow VPR to be used to hold SImode.  This
then “infects” the SImode cost of GENERAL_AND_VPR_REGS.
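
As a rough illustration of the difference between the two choices, here is
a toy model.  It is not the actual arm.c or IRA code: every toy_* name is
hypothetical, the mode sizes are passed in as plain byte counts instead of
going through GET_MODE_SIZE, and the "big enough" test is a simplification
of what the RA derives from CLASS_MAX_NREGS and arm_hard_regno_nregs.

```c
/* Rounding-up division, written the same way as GCC's CEIL macro.  */
#define TOY_CEIL(x, y) (((x) + (y) - 1) / (y))

/* Assumption for this sketch: the VPR class contains one hard register.  */
#define TOY_VPR_CLASS_NREGS 1u

/* First version: VPR takes 1 register whatever the mode size is.  */
unsigned int
toy_vpr_nregs_always_one (unsigned int mode_size_in_bytes)
{
  (void) mode_size_in_bytes;
  return 1;
}

/* Patch version: VPR is modelled as 2 bytes wide, so wider modes
   need more "registers" than the class can provide.  */
unsigned int
toy_vpr_nregs_ceil (unsigned int mode_size_in_bytes)
{
  return TOY_CEIL (mode_size_in_bytes, 2u);
}

/* Simplified stand-in for "the class is big enough to hold the mode":
   the mode fits if it needs no more registers than the class has.  */
int
toy_class_big_enough (unsigned int nregs_needed)
{
  return nregs_needed <= TOY_VPR_CLASS_NREGS;
}
```

With a 4-byte (SImode-sized) argument, toy_vpr_nregs_always_one gives 1, so
the class looks big enough; toy_vpr_nregs_ceil gives 2, so it does not.
For a 2-byte mode both versions give 1.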

Thanks,
Richard
