"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> On Mon, Jul 26, 2021 at 2:53 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford
>> > <richard.sandif...@arm.com> wrote:
>> >>
>> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> >> > +to avoid stack realignment when expanding memset.  The default is
>> >> > +@code{gen_reg_rtx}.
>> >> > +@end deftypefn
>> >> > +
>> >> >  @deftypefn {Target Hook} unsigned TARGET_LOOP_UNROLL_ADJUST (unsigned 
>> >> > @var{nunroll}, class loop *@var{loop})
>> >> >  This target hook returns a new value for the number of times @var{loop}
>> >> >  should be unrolled. The parameter @var{nunroll} is the number of times
>> >> > […]
>> >> > @@ -1446,7 +1511,10 @@ can_store_by_pieces (unsigned HOST_WIDE_INT len,
>> >> >        max_size = STORE_MAX_PIECES + 1;
>> >> >        while (max_size > 1 && l > 0)
>> >> >       {
>> >> > -       scalar_int_mode mode = widest_int_mode_for_size (max_size);
>> >> > +       /* Since this can be called before virtual registers are ready
>> >> > +          to use, avoid QI vector mode here.  */
>> >> > +       fixed_size_mode mode
>> >> > +         = widest_fixed_size_mode_for_size (max_size, false);
>> >>
>> >> I think I might have asked this before, sorry, but: when is that true
>> >> and why does it matter?
>> >
>> > can_store_by_pieces may be called:
>> >
>> > value-prof.c:      if (!can_store_by_pieces (val, builtin_memset_read_str,
>> > value-prof.c:      if (!can_store_by_pieces (val, builtin_memset_read_str,
>> >
>> > before virtual registers can be used.   When true is passed to
>> > widest_fixed_size_mode_for_size,  virtual registers may be used
>> > to expand memset to broadcast, which leads to ICE.   Since for the
>> > purpose of can_store_by_pieces, we don't need to expand memset
>> > to broadcast and pass false here can avoid ICE.
>>
>> Ah, I see, thanks.
>>
>> That sounds like a problem in the way that the memset const function is
>> written though.  can_store_by_pieces is just a query function, so I don't
>> think it should be trying to create new registers for can_store_by_pieces,
>> even if it could.  At the same time, can_store_by_pieces should make the
>> same choices as the real expander would.
>>
>> I think this means that:
>>
>> - gen_memset_broadcast should be inlined into its callers, with the
>>   builtin_memset_read_str getting the CONST_INT_P case and
>>   builtin_memset_gen_str getting the variable case.
>>
>> - builtin_memset_read_str should then stop at and return the
>>   gen_const_vec_duplicate when the prev argument is null.
>>   Only when prev is nonnull should it go on to call the hook
>>   and copy the constant to the register that the hook returns.
>
> How about keeping gen_memset_broadcast and passing PREV to it:
>
>   rtx target;
>   if (CONST_INT_P (data))
>     {
>       rtx const_vec = gen_const_vec_duplicate (mode, data);
>       if (prev == NULL)
>         /* Return CONST_VECTOR when called by a query function.  */
>         target = const_vec;
>       else
>         {
>           /* Use the move expander with CONST_VECTOR.  */
>           target = targetm.gen_memset_scratch_rtx (mode);
>           emit_move_insn (target, const_vec);
>         }
>     }
>   else
>     {
>       target = targetm.gen_memset_scratch_rtx (mode);
>       class expand_operand ops[2];
>       create_output_operand (&ops[0], target, mode);
>       create_input_operand (&ops[1], data, QImode);
>       expand_insn (icode, 2, ops);
>       if (!rtx_equal_p (target, ops[0].value))
>         emit_move_insn (target, ops[0].value);
>     }

TBH I think that complicates the interface too much.  The constant
and non-constant cases are now very different.

Thanks,
Richard

Reply via email to