"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: > On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: >> >> > + { >> >> > + /* First generate subreg of word mode if the previous mode is >> >> > + wider than word mode and word mode is wider than MODE. */ >> >> > + prev_rtx = simplify_gen_subreg (word_mode, prev_rtx, >> >> > + prev_mode, 0); >> >> > + prev_mode = word_mode; >> >> > + } >> >> > + if (prev_rtx != nullptr) >> >> > + target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0); >> >> >> >> This should be lowpart_subreg, since 0 isn't the right offset for >> >> big-endian targets. Using lowpart_subreg should also avoid the need >> >> for the word_size “if” above: lowpart_subreg can handle lowpart subword >> >> subregs of multiword values. >> > >> > I tried it. It didn't work since it caused the LRA failure. I replaced >> > simplify_gen_subreg with lowpart_subreg instead. >> >> What specifically went wrong? > > With vector broadcast, for > --- > extern void *ops; > > void > foo (int c) > { > __builtin_memset (ops, c, 18); > } > --- > we generate HI from V16QI. With a single lowpart_subreg, I get > > (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84) > (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void > *)ops.0_1]+16 S2 A8]) > (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal} > (nil)) > > instead of > > (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84) > (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void > *)ops.0_1]+16 S2 A8]) > (subreg:HI (reg:DI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal} > (nil)) > > IRA and LRA fail to reload: > > (insn 10 9 0 2 (set (mem:HI (plus:DI (reg/f:DI 84) > (const_int 16 [0x10])) [0 MEM <char[1:18]> [(void > *)ops.0_1]+16 S2 A8]) > (subreg:HI (reg:V16QI 51 xmm15) 0)) "s2a.i":6:3 76 {*movhi_internal} > (nil)) > > since ix86_can_change_mode_class has > > if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass)) > { > /* Vector registers do not support QI or HImode loads. If we don't > disallow a change to these modes, reload will assume it's ok to > drop the subreg from (subreg:SI (reg:HI 100) 0). This affects > the vec_dupv4hi pattern. */ > if (GET_MODE_SIZE (from) < 4) > return false; > }
Ah! OK. In that case, maybe we should have something like: if (REG_P (prev_rtx) && HARD_REGISTER_P (prev_rtx) && REG_CAN_CHANGE_MODE_P (REGNO (prev_rtx), prev->mode, mode)) prev_rtx = copy_to_reg (prev_rtx); and then just have the single lowpart_subreg after that. Thanks, Richard