On Fri, 2020-09-11 at 12:17 +0200, Ilya Leoshkevich wrote:
> On Fri, 2020-09-11 at 10:46 +0100, Richard Sandiford wrote:
> > Ilya Leoshkevich via Gcc <gcc@gcc.gnu.org> writes:
> > > On Wed, 2020-09-09 at 16:09 -0500, Segher Boessenkool wrote:
> > > > Hi Ilya,
> > > > 
> > > > On Wed, Sep 09, 2020 at 11:50:56AM +0200, Ilya Leoshkevich via Gcc wrote:
> > > > > I have a vector pseudo containing a single 128-bit value (V1TFmode)
> > > > > and I need to access its last 64 bits (DFmode). Which of the two
> > > > > options is better?
> > > > > 
> > > > > (subreg:DF (reg:V1TF) 8)
> > > > > 
> > > > > or
> > > > > 
> > > > > (vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))
> > > > > 
> > > > > If I use the first one, I run into a problem with set_noop_p (): it
> > > > > thinks that
> > > > > 
> > > > > (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
> > > > > 
> > > > > is a no-op, because it doesn't check the mode after stripping the
> > > > > subreg:
> > > > > 
> > > > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616
> > > > > 
> > > > > However, this is not correct, because SET_DEST is the second
> > > > > register in a register pair, and SET_SRC is half of a vector
> > > > > register that overlaps the first register in the corresponding
> > > > > pair. So it looks as if mode needs to be considered there.
> > > > 
> > > > Yes.
> > > > 
> > > > > This helps:
> > > > > 
> > > > > --- a/gcc/rtlanal.c
> > > > > +++ b/gcc/rtlanal.c
> > > > > @@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
> > > > >         return 0;
> > > > >        src = SUBREG_REG (src);
> > > > >        dst = SUBREG_REG (dst);
> > > > > +      if (GET_MODE (src) != GET_MODE (dst))
> > > > > +       return 0;
> > > > >      }
> > > > > 
> > > > > but I'm not sure whether I'm not missing something about subreg
> > > > > semantics in the first place.
> > > > 
> > > > You probably should just see if both modes are the same number of
> > > > hard registers?  HARD_REGNO_NREGS.
> > > 
> > > I've refined my patch as follows:
> > > 
> > > --- a/gcc/rtlanal.c
> > > +++ b/gcc/rtlanal.c
> > > @@ -1619,6 +1619,11 @@ set_noop_p (const_rtx set)
> > >         return 0;
> > >        src = SUBREG_REG (src);
> > >        dst = SUBREG_REG (dst);
> > > +      if (REG_P (src) && HARD_REGISTER_P (src) && REG_P (dst)
> > > +         && HARD_REGISTER_P (dst)
> > > +         && hard_regno_nregs (REGNO (src), GET_MODE (src))
> > > +                != hard_regno_nregs (REGNO (dst), GET_MODE (dst)))
> > > +       return 0;
> > >      }
> > 
> > I think checking the mode would be safer.  Having the same number
> > of registers doesn't mean that the bits are distributed across the
> > registers in the same way.
> 
> Yeah, that's what I was trying to express with this hypothetical
> machine example.  On the other hand, checking mode is too pessimistic.
> E.g. if we talk about s390 GPRs, then considering
> 
> (set (subreg:SI (reg:DI %r0) 4) (subreg:SI (reg:DI %r0) 4))

Sorry, bad example: here the hard register modes actually match.
But it's probably possible to come up with something similar, where
the hard reg is accessed with different modes, but when we add subregs
on top, then we end up selecting the same bits.

> a no-op is fine from my perspective.  So having a more restrictive
> check might be desirable.  Is there a way to ask the backend how the
> subreg bits are distributed?
> 
> > Out of interest, why can't the subregs in the example above get
> > folded down to hard registers?
> 
> I think this is because the offsets are not 0.  I could imagine folding
> (subreg:DF (reg:TF %f0) 8) to (reg:DF %f2) - but there must be a
> backend hook for this.  Does anything like this exist?  Also, can
> (subreg:DF (reg:V1TF %f0) 8) be folded at all? This is simply
> the second doubleword of 128-bit %v0 vector register.
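
To make the layout difference discussed above concrete, here is a small toy
model in C.  It is only a sketch and not GCC code: the names toy_mode,
toy_loc, toy_subreg_df and toy_same_loc are made up for illustration, and
the model assumes only what the thread states, namely that a TFmode value
in %f0 occupies the pair %f0/%f2, that %fN overlaps the first doubleword of
%vN, and that a V1TFmode value occupies both doublewords of one vector
register.

```c
#include <assert.h>

/* Toy model, not GCC code.  All names here are hypothetical.  */
enum toy_mode { TOY_TF, TOY_V1TF };

/* A 64-bit storage slot: vector register number plus doubleword index.  */
struct toy_loc { int vreg; int dword; };

/* Return the slot holding the doubleword at BYTE_OFFSET (0 or 8) of a
   128-bit value that starts at hard register REGNO in mode MODE.  */
static struct toy_loc
toy_subreg_df (enum toy_mode mode, int regno, int byte_offset)
{
  struct toy_loc loc;
  int idx = byte_offset / 8;

  assert (byte_offset == 0 || byte_offset == 8);
  if (mode == TOY_TF)
    {
      /* FPR pair %fN/%f(N+2); each FPR is doubleword 0 of a vector reg.  */
      loc.vreg = regno + 2 * idx;
      loc.dword = 0;
    }
  else
    {
      /* Both doublewords live in the same vector register.  */
      loc.vreg = regno;
      loc.dword = idx;
    }
  return loc;
}

/* Nonzero if A and B name the same 64-bit slot.  */
static int
toy_same_loc (struct toy_loc a, struct toy_loc b)
{
  return a.vreg == b.vreg && a.dword == b.dword;
}
```

In this model, toy_subreg_df (TOY_TF, 0, 8) names doubleword 0 of %v2
(that is, %f2), while toy_subreg_df (TOY_V1TF, 0, 8) names doubleword 1 of
%v0, so toy_same_loc reports that the two offset-8 subregs do not refer to
the same bits even though the inner hard register number is the same.  At
offset 0 the two do coincide.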
