On Mon, Jun 8, 2020 at 10:37 AM Eric Botcazou <botca...@adacore.com> wrote:
>
> > OK, but all I was saying is that looking at the pointed-to type of
> > the address argument to the memcpy looks fragile to me.  For the
> > case cited above it would be better to look at the actual
> > reference we are looking up and not allowing memcpy handling
> > when that reference contains a storage order barrier?  Like
> > a && !contains_storage_order_barrier_p (vr->operands) on the whole block
> > like we do for the assign from constant/SSA value case?
>
> The memcpy is the storage order barrier though, there is no other, so I'm not
> sure what contains_storage_order_barrier_p should be called on.  Or do you
> mean that an operand should be created for the memcpy with "reverse" set?

I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER
works and what the constraints are.  But if the memcpy is the storage order
barrier then what cases are OK?

 - both source and destination objects are [only] accessed with the same
   storage order
 - both source and destination objects are [only] accessed with
   !TYPE_REVERSE_STORAGE_ORDER "containing" accesses

and what determines the "object state" with respect to storage order
when we see a memcpy call?  I had the impression (from the other
existing cases of contains_storage_order_barrier_p () calls) that
we derive the "object state" from the downstream access (vr->operands),
in this case the memcpy destination access.  But see below for maybe
the easier case (memcpy folding).

> > But the above cited case is the _only_ case we're possibly building an
> > assignment of a variable - so it's enough to assert that in the above case
> > destvar is not TYPE_REVERSE_STORAGE_ORDER?
>
> What about
>
>       /* If we can perform the copy efficiently with first doing all loads
>          and then all stores inline it that way.  Currently efficiently
>          means that we can load all the memory into a single integer
>          register which is what MOVE_MAX gives us.  */
>       src_align = get_pointer_alignment (src);
>
> It's also a scalar copy, so it won't work either.

In my understanding of how TYPE_REVERSE_STORAGE_ORDER works
a memcpy, irrespective of its actual "implementation" should not break
anything since it transfers bytes.  In particular a 8 byte
TYPE_REVERSE_STORAGE_ORDER
object can be transfered ("memcpy"ed) with (ignoring TBAA)

  int tema = *(int *)src;
  *(int *)dst = tema;
  int temb = *((int *)src + 1);
  *((int *)dst + 1) = temb;

and what byteorder the int accesses are performed in is irrelevant
as long as read and write use the same byteorder.

> > The other cases all build an aggregate copy assignment with
> > a non-reverse-storage-order char[size] type which should be OK, no?
>
> Yes, theoritically, but SRA rewrites the access into a scalar access and we're
> back to square one (I tried).

But what's the issue with this?

> , so this can possibly work only if the array type
> is built with TYPE_REVERSE_STORAGE_ORDER too, which implies that the source is
> in reverse order too.  So the only case that could be handled correctly is the
> memcpy between two aggregates types with TYPE_REVERSE_STORAGE_ORDER.

Why?  Is the following an invalid program?

void mymemcpy (void *a, void *b) { memcpy (a, b, 8); }
void foo()
{
  int b[2] __attribute__(reverse_storage);
  int a[2];
  mymemcpy (a, b);
  return a[0];
}

how can GCC infer, inside mymemcpy that b is reverse storage order?

Thus, if all this is a problem then I think we're papering over the issue
by not folding memcpy.

> > Note even for the above we probably would be fine if we'd made sure
> > to use a !TYPE_REVERSE_STORAGE_ORDER "qualified" type
> > for the access.  So the memcpy happens via a native order
> > load plus a native order store, exactly what memcpy does (in fact
> > the actual order does not matter unless the source and destination
> > order "agree")?  Thus maybe amend
> >
> >       /* Make sure we are not copying using a floating-point mode or
> >          a type whose size possibly does not match its precision.  */
> >       if (FLOAT_MODE_P (TYPE_MODE (desttype))
> >
> >           || TREE_CODE (desttype) == BOOLEAN_TYPE
> >           || TREE_CODE (desttype) == ENUMERAL_TYPE)
> >
> >         desttype = bitwise_type_for_mode (TYPE_MODE (desttype));
> >       if (FLOAT_MODE_P (TYPE_MODE (srctype) )
> >
> >           || TREE_CODE (srctype) == BOOLEAN_TYPE
> >           || TREE_CODE (srctype) == ENUMERAL_TYPE)
> >
> >         srctype = bitwise_type_for_mode (TYPE_MODE (srctype));
> >
> > with || TYPE_REVERSE_STORAGE_ORDER (srctype) (same for desttype)?
>
> We don't have TYPE_REVERSE_STORAGE_ORDER qualified types, the flag is only set
> on aggregate types and bitwise_type_for_mode will return an integer type.

Exactly.  When the original src/desttype turns out to be a RECORD_TYPE (IIRC
that can happen) with TYPE_REVERSE_STORAGE_ORDER make sure we're
using a bitwise_type_for_mode.  Because then the copying will happen in
chunks of, say, SImode which should be fine (see above).

Richard.

>
> --
> Eric Botcazou

Reply via email to