On Tue, Apr 03, 2018 at 02:30:23PM +0200, Richard Biener wrote:
> On Tue, 3 Apr 2018, Alan Modra wrote:
> 
> > On Tue, Apr 03, 2018 at 01:01:23PM +0200, Richard Biener wrote:
> > > On Fri, 30 Mar 2018, Peter Bergner wrote:
> > > 
> > > > On 3/29/18 9:35 AM, Alan Modra wrote:
> > > > > On Thu, Nov 16, 2017 at 03:27:01PM +0000, Tamar Christina wrote:
> > > > >> --- a/gcc/expr.c
> > > > >> +++ b/gcc/expr.c
> > > > >> @@ -2769,7 +2769,9 @@ copy_blkmode_to_reg (machine_mode mode, tree 
> > > > >> src)
> > > > >>  
> > > > >>    n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
> > > > >>    dst_words = XALLOCAVEC (rtx, n_regs);
> > > > >> -  bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
> > > > >> +  bitsize = BITS_PER_WORD;
> > > > >> +  if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN 
> > > > >> (TREE_TYPE (src))))
> > > > >> +    bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
> > > > >>  
> > > > >>    /* Copy the structure BITSIZE bits at a time.  */
> > > > >>    for (bitpos = 0, xbitpos = padding_correction;
> > > > > 
> > > > > I believe this patch is wrong.  Please revert.  See the PR84762 
> > > > > testcase.
> > > > > 
> > > > > There are two problems.  Firstly, if padding_correction is non-zero,
> > > > > then xbitpos % BITS_PER_WORD is non-zero and in
> > > > > 
> > > > >       store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD,
> > > > >                      0, 0, word_mode,
> > > > >                      extract_bit_field (src_word, bitsize,
> > > > >                                         bitpos % BITS_PER_WORD, 1,
> > > > >                                         NULL_RTX, word_mode, 
> > > > > word_mode,
> > > > >                                         false, NULL),
> > > > >                      false);
> > > > > 
> > > > > the stored bit-field exceeds the destination register size.  You could
> > > > > fix that by making bitsize the gcd of bitsize and padding_correction.
> > > > > 
> > > > > However, that doesn't fix the second problem which is that the
> > > > > extracted bit-field can exceed the source size.  That will result in
> > > > > rubbish being read into a register.
> > > > 
> > > > FYI, I received an automated response saying Tamar is away on vacation
> > > > with no return date specified.  That means he won't be able to revert
> > > > the patch.  What do we do now?
> > > 
> > > The code before the change already looks fishy to me.
> > 
> > Yes, it is fishy so far as the code in the loop relies on alignment
> > determining a small enough bitsize.  Adding
> > 
> >   if (bytes % UNITS_PER_WORD != 0)
> >     bitsize = gcd (bitsize, (bytes % UNITS_PER_WORD) * BITS_PER_UNIT);
> > 
> > after bitsize is calculated from alignment would make the code
> > correct, I believe.  But I think that will fail the testcase Tamar
> > added.
> > 
> > >   x = expand_normal (src);
> > > 
> > > so what do we expect this to expand to in general?  Fortunately
> > > it seems there are exactly two callers so hopefully a
> > > gcc_assert (MEM_P (x)) would work?
> > > 
> > > The fishy part is looking at TYPE_ALIGN (TREE_TYPE (src)).
> > > 
> > > In case x is a MEM we should use MEM_ALIGN and if not then we
> > > shouldn't do anything but just use word_mode here.
> > 
> > No!  You can't use a bitsize of BITS_PER_WORD here.  See the code I
> > quoted above.
> 
> Ah, so it should possibly use TYPE_SIZE instead of TYPE_ALIGN

Not that either.  We already have a byte count..

> otherwise
> it will handle the over-aligned case (align to 8 bytes, size 4 bytes)
> incorrectly as well?

Yes, that case is mishandled too, even before Tamar's patch.  Peter's
pr85123 testcase with vfloat1 aligned(8) is an example.

If we want correct code, then

  if (bytes % UNITS_PER_WORD != 0)
    bitsize = gcd (bitsize, (bytes % UNITS_PER_WORD) * BITS_PER_UNIT);

will fix the algorithm inside the copy_blkmode_to_reg loop.  Otherwise
the loop itself needs changes.

-- 
Alan Modra
Australia Development Lab, IBM

Reply via email to