Re: [PATCH] Add QI vector mode support to by-pieces for memset

Richard Sandiford via Gcc-patches Mon, 19 Jul 2021 23:38:37 -0700

"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > +     {
>> > +       /* First generate subreg of word mode if the previous mode is
>> > +          wider than word mode and word mode is wider than MODE.  */
>> > +       prev_rtx = simplify_gen_subreg (word_mode, prev_rtx,
>> > +                                       prev_mode, 0);
>> > +       prev_mode = word_mode;
>> > +     }
>> > +      if (prev_rtx != nullptr)
>> > +     target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0);
>>
>> This should be lowpart_subreg, since 0 isn't the right offset for
>> big-endian targets.  Using lowpart_subreg should also avoid the need
>> for the word_size “if” above: lowpart_subreg can handle lowpart subword
>> subregs of multiword values.
>
> I tried it.  It didn't work since it caused the LRA failure.   I replaced
> simplify_gen_subreg with lowpart_subreg instead.


What specifically went wrong?

>> > +/* Callback routine for store_by_pieces.  Read GET_MODE_BITSIZE (MODE)
>> > +   bytes from constant string DATA + OFFSET and return it as target
>> > +   constant.  If PREV isn't nullptr, it has the RTL info from the
>> > +   previous iteration.  */
>> > +
>> > +rtx
>> > +builtin_memset_read_str (void *data, void *prev,
>> > +                      HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
>> > +                      machine_mode mode)
>> > +{
>> > +  rtx target;
>> >    const char *c = (const char *) data;
>> > -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
>> > +  char *p;
>> > +  unsigned int size = GET_MODE_SIZE (mode).to_constant ();
>> > +
>> > +  /* Don't use the previous value if size is 1.  */
>>
>> Why not though?  The existing code does, and that seems like the right
>> thing to do when operating on integers.
>>
>> I can see the check would be a good thing to do if MODE isn't a vector
>> mode and the previous mode was.  Is that the case you're worried about?
>> If so, that check could go in gen_memset_value_from_prev instead.
>
> We are storing one byte.  Doing it directly is faster.

But the first thing being protected here is…

>> > +  if (size != 1)
>> > +    {
>> > +      target = gen_memset_value_from_prev (prev, mode);
>> > +      if (target != nullptr)
>> > +     return target;

…this attempt to use the previous value.  If the target uses, say,
SImode for the first piece and QImode for a final byte, using the QImode
lowpart of the SImode register would avoid having to move the byte value
into a separate QImode register.  Why's that a bad thing to do?  AFAICT
it's what the current code would do, so if we want to change it even for
integer modes, I think it should be a separate patch with a separate
justification.

Like I say, I can understand that using the QImode lowpart of a vector
wouldn't be a good idea.  But if that's specifically what you're trying
to prevent, I think we should test for it.

Thanks,
Richard

Re: [PATCH] Add QI vector mode support to by-pieces for memset

Reply via email to