"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> >> > +     {
>> >> > +       /* First generate subreg of word mode if the previous mode is
>> >> > +          wider than word mode and word mode is wider than MODE.  */
>> >> > +       prev_rtx = simplify_gen_subreg (word_mode, prev_rtx,
>> >> > +                                       prev_mode, 0);
>> >> > +       prev_mode = word_mode;
>> >> > +     }
>> >> > +      if (prev_rtx != nullptr)
>> >> > +     target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0);
>> >>
>> >> This should be lowpart_subreg, since 0 isn't the right offset for
>> >> big-endian targets.  Using lowpart_subreg should also avoid the need
>> >> for the word_size “if” above: lowpart_subreg can handle lowpart subword
>> >> subregs of multiword values.
>> >
>> > I tried it.  It didn't work since it caused the LRA failure.   I replaced
>> > simplify_gen_subreg with lowpart_subreg instead.
>>
>> What specifically went wrong?
>
> With vector broadcast, for
> ---
> extern void *ops;
>
> void
> foo (int c)
> {
>   __builtin_memset (ops, c, 18);
> }
> ---
> we generate HI from V16QI.   With a single lowpart_subreg, I get
>
> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> *)ops.0_1]+16 S2 A8])
>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
>      (nil))
>
> instead of
>
> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> *)ops.0_1]+16 S2 A8])
>         (subreg:HI (reg:DI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
>      (nil))
>
> IRA and LRA fail to reload:
>
> (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84)
>                 (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void
> *)ops.0_1]+16 S2 A8])
>         (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal}
>      (nil))
>
> since ix86_can_change_mode_class has
>
>   if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
>     {
>       /* Vector registers do not support QI or HImode loads.  If we don't
>          disallow a change to these modes, reload will assume it's ok to
>          drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
>          the vec_dupv4hi pattern.  */
>       if (GET_MODE_SIZE (from) < 4)
>         return false;
>     }

Ah!  OK.  In that case, maybe we should have something like:

   if (REG_P (prev_rtx)
       && HARD_REGISTER_P (prev_rtx)
       && REG_CAN_CHANGE_MODE_P (REGNO (prev_rtx), prev->mode, mode))
     prev_rtx = copy_to_reg (prev_rtx);

and then just have the single lowpart_subreg after that.

Thanks,
Richard

Reply via email to