On Mon, May 10, 2021 at 2:39 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Fri, Apr 30, 2021 at 8:30 PM Richard Sandiford via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> > On Fri, Apr 30, 2021 at 5:49 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >> >>
> >> >> On Fri, Apr 30, 2021 at 5:42 AM Richard Sandiford
> >> >> <richard.sandif...@arm.com> wrote:
> >> >> >
> >> >> > "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> >> > > On Fri, Apr 30, 2021 at 2:06 AM Richard Sandiford
> >> >> > > <richard.sandif...@arm.com> wrote:
> >> >> > >>
> >> >> > >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> >> > >> > gen_reg_rtx tracks stack alignment needed for pseudo registers so that
> >> >> > >> > associated hard registers can be properly spilled onto stack.  But there
> >> >> > >> > are cases where associated hard registers will never be spilled onto
> >> >> > >> > stack.  gen_reg_rtx is changed to take an argument for register alignment
> >> >> > >> > so that stack realignment can be avoided when not needed.
> >> >> > >>
> >> >> > >> How is it guaranteed that they will never be spilled though?
> >> >> > >> I don't think that that guarantee exists for any kind of pseudo,
> >> >> > >> except perhaps for the temporary pseudos that the RA creates to
> >> >> > >> replace (match_scratch …)es.
> >> >> > >>
> >> >> > >
> >> >> > > The caller of creating pseudo registers with specific alignment must
> >> >> > > guarantee that they will never be spilled.   I am only using it in
> >> >> > >
> >> >> > >   /* Make operand1 a register if it isn't already.  */
> >> >> > >   if (can_create_pseudo_p ()
> >> >> > >       && !register_operand (op0, mode)
> >> >> > >       && !register_operand (op1, mode))
> >> >> > >     {
> >> >> > >       /* NB: Don't increase stack alignment requirement when forcing
> >> >> > >          operand1 into a pseudo register to copy data from one memory
> >> >> > >          location to another since it doesn't require a spill.  */
> >> >> > >       emit_move_insn (op0,
> >> >> > >                       force_reg (GET_MODE (op0), op1,
> >> >> > >                                  (UNITS_PER_WORD * BITS_PER_UNIT)));
> >> >> > >       return;
> >> >> > >     }
> >> >> > >
> >> >> > > for vector moves.  RA shouldn't spill it.
> >> >> >
> >> >> > But this is the point: it's a case of hoping that the RA won't spill it,
> >> >> > rather than having a guarantee that it won't.
> >> >> >
> >> >> > Even if the moves start out adjacent, they could be separated by later
> >> >> > RTL optimisations, particularly scheduling.  (I realise pre-RA scheduling
> >> >> > isn't enabled by default for x86, but it can still be enabled explicitly.)
> >> >> > Or if the same data is being copied to two locations, we might reuse
> >> >> > values loaded by the first copy for the second copy as well.
> >> >
> >> > There are cases where pseudo vector registers are created as pure
> >> > temporary registers in the backend and they shouldn't ever be spilled
> >> > to stack.   They will be spilled to stack only if there are other non-temporary
> >> > vector register usage in which case stack will be properly re-aligned.
> >> > Caller of creating pseudo registers with specific alignment guarantees
> >> > that they are used only as pure temporary registers.
> >>
> >> I don't think there's really a distinct category of pure temporary
> >> registers though.  The things I mentioned above can happen for any
> >> kind of pseudo register.
> >
> > I wonder if for the cases HJ thinks of it is appropriate to use hardregs?
> > Do we generally handle those well?  That is, are they again subject
> > to be allocated by RA when no longer live?
>
> Yeah, using hard registers should work.  Of course, any given fixed choice
> of hard register has the potential to be suboptimal in some situation,
> but it should be safe.

I tried hard registers.  The generated code isn't as good as with pseudo
registers.  But I want to avoid aligning the stack when YMM registers are
only used to inline memcpy/memset.  Any suggestions?
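
To make the case concrete, here is a minimal sketch of the kind of source
involved.  It is an illustration only, not taken from the patch or its
tests: the names struct blob and copy_blob are hypothetical, and whether
GCC actually expands the copy with a YMM register depends on the target
options and tuning in effect (for example -O2 with AVX enabled).

```c
#include <string.h>

/* Hypothetical 32-byte object, the width of one YMM register.  */
struct blob
{
  unsigned char bytes[32];
};

/* A fixed-size memory-to-memory copy.  With AVX enabled the compiler may
   expand this memcpy inline as one 256-bit load into a temporary vector
   register followed by one 256-bit store.  That temporary is the kind of
   pseudo register discussed above: it only carries data from one memory
   location to another.  */
void
copy_blob (struct blob *dst, const struct blob *src)
{
  memcpy (dst, src, sizeof (*dst));
}
```

The function has no vector-typed locals of its own, so any stack
realignment here would be needed only in case the temporary gets spilled.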

Thanks.

-- 
H.J.
