On Mon, Jun 8, 2020 at 10:37 AM Eric Botcazou <botca...@adacore.com> wrote: > > > OK, but all I was saying is that looking at the pointed-to type of > > the address argument to the memcpy looks fragile to me. For the > > case cited above it would be better to look at the actual > > reference we are looking up and not allowing memcpy handling > > when that reference contains a storage order barrier? Like > > a && !contains_storage_order_barrier_p (vr->operands) on the whole block > > like we do for the assign from constant/SSA value case? > > The memcpy is the storage order barrier though, there is no other, so I'm not > sure what contains_storage_order_barrier_p should be called on. Or do you > mean that an operand should be created for the memcpy with "reverse" set?
I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER works and what the constraints are. But if the memcpy is the storage order barrier then what cases are OK? - both source and destination objects are [only] accessed with the same storage order - both source and destination objects are [only] accessed with !TYPE_REVERSE_STORAGE_ORDER "containing" accesses and what determines the "object state" with respect to storage order when we see a memcpy call? I had the impression (from the other existing cases of contains_storage_order_barrier_p () calls) that we derive the "object state" from the downstream access (vr->operands), in this case the memcpy destination access. But see below for maybe the easier case (memcpy folding). > > But the above cited case is the _only_ case we're possibly building an > > assignment of a variable - so it's enough to assert that in the above case > > destvar is not TYPE_REVERSE_STORAGE_ORDER? > > What about > > /* If we can perform the copy efficiently with first doing all loads > and then all stores inline it that way. Currently efficiently > means that we can load all the memory into a single integer > register which is what MOVE_MAX gives us. */ > src_align = get_pointer_alignment (src); > > It's also a scalar copy, so it won't work either. In my understanding of how TYPE_REVERSE_STORAGE_ORDER works a memcpy, irrespective of its actual "implementation" should not break anything since it transfers bytes. In particular a 8 byte TYPE_REVERSE_STORAGE_ORDER object can be transfered ("memcpy"ed) with (ignoring TBAA) int tema = *(int *)src; *(int *)dst = tema; int temb = *((int *)src + 1); *((int *)dst + 1) = temb; and what byteorder the int accesses are performed in is irrelevant as long as read and write use the same byteorder. > > The other cases all build an aggregate copy assignment with > > a non-reverse-storage-order char[size] type which should be OK, no? > > Yes, theoritically, but SRA rewrites the access into a scalar access and we're > back to square one (I tried). But what's the issue with this? > , so this can possibly work only if the array type > is built with TYPE_REVERSE_STORAGE_ORDER too, which implies that the source is > in reverse order too. So the only case that could be handled correctly is the > memcpy between two aggregates types with TYPE_REVERSE_STORAGE_ORDER. Why? Is the following an invalid program? void mymemcpy (void *a, void *b) { memcpy (a, b, 8); } void foo() { int b[2] __attribute__(reverse_storage); int a[2]; mymemcpy (a, b); return a[0]; } how can GCC infer, inside mymemcpy that b is reverse storage order? Thus, if all this is a problem then I think we're papering over the issue by not folding memcpy. > > Note even for the above we probably would be fine if we'd made sure > > to use a !TYPE_REVERSE_STORAGE_ORDER "qualified" type > > for the access. So the memcpy happens via a native order > > load plus a native order store, exactly what memcpy does (in fact > > the actual order does not matter unless the source and destination > > order "agree")? Thus maybe amend > > > > /* Make sure we are not copying using a floating-point mode or > > a type whose size possibly does not match its precision. */ > > if (FLOAT_MODE_P (TYPE_MODE (desttype)) > > > > || TREE_CODE (desttype) == BOOLEAN_TYPE > > || TREE_CODE (desttype) == ENUMERAL_TYPE) > > > > desttype = bitwise_type_for_mode (TYPE_MODE (desttype)); > > if (FLOAT_MODE_P (TYPE_MODE (srctype) ) > > > > || TREE_CODE (srctype) == BOOLEAN_TYPE > > || TREE_CODE (srctype) == ENUMERAL_TYPE) > > > > srctype = bitwise_type_for_mode (TYPE_MODE (srctype)); > > > > with || TYPE_REVERSE_STORAGE_ORDER (srctype) (same for desttype)? > > We don't have TYPE_REVERSE_STORAGE_ORDER qualified types, the flag is only set > on aggregate types and bitwise_type_for_mode will return an integer type. Exactly. When the original src/desttype turns out to be a RECORD_TYPE (IIRC that can happen) with TYPE_REVERSE_STORAGE_ORDER make sure we're using a bitwise_type_for_mode. Because then the copying will happen in chunks of, say, SImode which should be fine (see above). Richard. > > -- > Eric Botcazou