"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: > On Mon, Jul 26, 2021 at 2:53 PM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: >> > On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford >> > <richard.sandif...@arm.com> wrote: >> >> >> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: >> >> > +to avoid stack realignment when expanding memset. The default is >> >> > +@code{gen_reg_rtx}. >> >> > +@end deftypefn >> >> > + >> >> > @deftypefn {Target Hook} unsigned TARGET_LOOP_UNROLL_ADJUST (unsigned >> >> > @var{nunroll}, class loop *@var{loop}) >> >> > This target hook returns a new value for the number of times @var{loop} >> >> > should be unrolled. The parameter @var{nunroll} is the number of times >> >> > […] >> >> > @@ -1446,7 +1511,10 @@ can_store_by_pieces (unsigned HOST_WIDE_INT len, >> >> > max_size = STORE_MAX_PIECES + 1; >> >> > while (max_size > 1 && l > 0) >> >> > { >> >> > - scalar_int_mode mode = widest_int_mode_for_size (max_size); >> >> > + /* Since this can be called before virtual registers are ready >> >> > + to use, avoid QI vector mode here. */ >> >> > + fixed_size_mode mode >> >> > + = widest_fixed_size_mode_for_size (max_size, false); >> >> >> >> I think I might have asked this before, sorry, but: when is that true >> >> and why does it matter? >> > >> > can_store_by_pieces may be called: >> > >> > value-prof.c: if (!can_store_by_pieces (val, builtin_memset_read_str, >> > value-prof.c: if (!can_store_by_pieces (val, builtin_memset_read_str, >> > >> > before virtual registers can be used. When true is passed to >> > widest_fixed_size_mode_for_size, virtual registers may be used >> > to expand memset to broadcast, which leads to ICE. Since for the >> > purpose of can_store_by_pieces, we don't need to expand memset >> > to broadcast and pass false here can avoid ICE. >> >> Ah, I see, thanks. >> >> That sounds like a problem in the way that the memset const function is >> written though. can_store_by_pieces is just a query function, so I don't >> think it should be trying to create new registers for can_store_by_pieces, >> even if it could. At the same time, can_store_by_pieces should make the >> same choices as the real expander would. >> >> I think this means that: >> >> - gen_memset_broadcast should be inlined into its callers, with the >> builtin_memset_read_str getting the CONST_INT_P case and >> builtin_memset_gen_str getting the variable case. >> >> - builtin_memset_read_str should then stop at and return the >> gen_const_vec_duplicate when the prev argument is null. >> Only when prev is nonnull should it go on to call the hook >> and copy the constant to the register that the hook returns. > > How about keeping gen_memset_broadcast and passing PREV to it: > > rtx target; > if (CONST_INT_P (data)) > { > rtx const_vec = gen_const_vec_duplicate (mode, data); > if (prev == NULL) > /* Return CONST_VECTOR when called by a query function. */ > target = const_vec; > else > { > /* Use the move expander with CONST_VECTOR. */ > target = targetm.gen_memset_scratch_rtx (mode); > emit_move_insn (target, const_vec); > } > } > else > { > target = targetm.gen_memset_scratch_rtx (mode); > class expand_operand ops[2]; > create_output_operand (&ops[0], target, mode); > create_input_operand (&ops[1], data, QImode); > expand_insn (icode, 2, ops); > if (!rtx_equal_p (target, ops[0].value)) > emit_move_insn (target, ops[0].value); > }
TBH I think that complicates the interface too much. The constant and non-constant cases are now very different. Thanks, Richard