On Tue, Jul 20, 2021 at 8:12 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
> >> <richard.sandif...@arm.com> wrote:
> >>>
> >>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >>> >> > +     {
> >>> >> > +       /* First generate subreg of word mode if the previous mode is
> >>> >> > +          wider than word mode and word mode is wider than MODE.  */
> >>> >> > +       prev_rtx = simplify_gen_subreg (word_mode, prev_rtx,
> >>> >> > +                                       prev_mode, 0);
> >>> >> > +       prev_mode = word_mode;
> >>> >> > +     }
> >>> >> > +      if (prev_rtx != nullptr)
> >>> >> > +     target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0);
> >>> >>
> >>> >> This should be lowpart_subreg, since 0 isn't the right offset for
> >>> >> big-endian targets.  Using lowpart_subreg should also avoid the need
> >>> >> for the word_size “if” above: lowpart_subreg can handle lowpart subword
> >>> >> subregs of multiword values.
> >>> >
> >>> > I tried it.  It didn't work since it caused the LRA failure.   I 
> >>> > replaced
> >>> > simplify_gen_subreg with lowpart_subreg instead.
> >>>
> >>> What specifically went wrong?
> >>
> >> With vector broadcast, for
> >> ---
> >> extern void *ops;
> >>
> >> void
> >> foo (int c)
> >> {
> >>   __builtin_memset (ops, c, 18);
> >> }
> >> ---
> >> we generate HI from V16QI.   With a single lowpart_subreg, I get
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> >> *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 
> >> {*movhi_internal}
> >>      (nil))
> >>
> >> instead of
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> >> *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:DI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
> >>      (nil))
> >>
> >> IRA and LRA fail to reload:
> >>
> >> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
> >>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> >> *)ops.0_1]+16 S2 A8])
> >>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 
> >> {*movhi_internal}
> >>      (nil))
> >>
> >> since ix86_can_change_mode_class has
> >>
> >>   if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
> >>     {
> >>       /* Vector registers do not support QI or HImode loads.  If we don't
> >>          disallow a change to these modes, reload will assume it's ok to
> >>          drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
> >>          the vec_dupv4hi pattern.  */
> >>       if (GET_MODE_SIZE (from) < 4)
> >>         return false;
> >>     }
> >
> > Ah!  OK.  In that case, maybe we should have something like:
> >
> >    if (REG_P (prev_rtx)
> >        && HARD_REGISTER_P (prev_rtx)
> >        && REG_CAN_CHANGE_MODE_P (REGNO (prev_rtx), prev->mode, mode))
>
> Sorry, make that last line:
>
>   && lowpart_subreg_regno (REGNO (prev_rtx), prev->mode, mode) < 0
>
> where lowpart_subreg_regno is a new wrapper around simplify_subreg_regno
> that uses subreg_lowpart_offset (mode, prev->mode) as the offset.

Fixed.  I submitted the v3 patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575670.html

Thanks.

> Thanks,
> Richard
>
> >      prev_rtx = copy_to_reg (prev_rtx);
> >
> > and then just have the single lowpart_subreg after that.
> >
> > Thanks,
> > Richard



-- 
H.J.

Reply via email to