On Tue, Oct 2, 2012 at 4:19 PM, Walter Lee <w...@tilera.com> wrote:
>
> On TILE-Gx, I'm observing a degradation in inlined memcpy/memset in
> gcc 4.6 and later versus gcc 4.4.  Though I find the problem on
> TILE-Gx, I think this is a problem for any architecture with
> SLOW_UNALIGNED_ACCESS set to 1.
>
> Consider the following program:
>
> struct foo {
>   int x;
> };
>
> void copy(struct foo* f0, struct foo* f1)
> {
>   memcpy (f0, f1, sizeof(struct foo));
> }
>
> In gcc 4.4, I get the desired inline memcpy:
>
> copy:
>         ld4s    r1, r1
>         st4     r0, r1
>         jrp     lr
>
> In gcc 4.7, however, I get inlined byte-by-byte copies:
>
> copy:
>         ld1u_add r10, r1, 1
>         st1_add  r0, r10, 1
>         ld1u_add r10, r1, 1
>         st1_add  r0, r10, 1
>         ld1u_add r10, r1, 1
>         st1_add  r0, r10, 1
>         ld1u     r10, r1
>         st1      r0, r10
>         jrp      lr
>
> The inlining of memcpy is done in expand_builtin_memcpy in builtins.c.
> Tracing through that, I see that the alignment of src_align and
> dest_align, which is computed by get_pointer_alignment, has degraded:
> in gcc 4.4 they are 32 bits, but in gcc 4.7 they are 8 bits.  This
> causes the loads generated by the inlined memcpy to be per-byte
> instead of per-4-byte.
>
> Looking further, gcc 4.7 uses the "align" field in "struct
> ptr_info_def" to compute the alignment.  This field appears to be
> initialized in get_ptr_info in tree-ssanames.c but it is always
> initialized to 1 byte and does not appear to change.  gcc 4.4 computes
> its alignment information differently.
>
> I get the same byte-copies with gcc 4.8 and gcc 4.6.
>
> I see a couple related open PRs: 50417, 53535, but no suggested fixes
> for them yet.  Can anyone advise on how this can be fixed?  Should I
> file a new bug, or add this info to one of the existing PRs?
There is no way to fix it.  memcpy does not require aligned arguments,
and the mere presence of a typed pointer carries no alignment
information for the middle-end.  If you want to exercise C's notion
of alignment requirements, then do not use memcpy but

 *f0 = *f1;

which works equally well.  ISTR Tom had various patches improving
alignment info for pointers, but we cannot improve this degenerate
case.  Btw, the new behavior even fixed bugs.
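
As a rough sketch of the two options (not checked against TILE-Gx
output, so treat the expected code generation as an assumption): the
first function uses plain struct assignment, and the second keeps
memcpy but passes the alignment promise explicitly through
__builtin_assume_aligned, which is available in gcc 4.7 and later.
The function names copy_assign and copy_assume_aligned are made up for
this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct foo {
  int x;
};

/* Struct assignment: the access is made through an lvalue of type
   struct foo, so the compiler may assume that type's alignment. */
void copy_assign(struct foo *f0, const struct foo *f1)
{
  *f0 = *f1;
}

/* memcpy with an explicit alignment promise.  The promise is the
   caller's responsibility; the asserts only check it in debug builds. */
void copy_assume_aligned(struct foo *f0, const struct foo *f1)
{
  assert((uintptr_t)f0 % __alignof__(struct foo) == 0);
  assert((uintptr_t)f1 % __alignof__(struct foo) == 0);

  void *d = __builtin_assume_aligned(f0, __alignof__(struct foo));
  const void *s = __builtin_assume_aligned(f1, __alignof__(struct foo));
  memcpy(d, s, sizeof(struct foo));
}
```

Both functions copy the same bytes; the difference is only in what
the compiler is allowed to assume about the pointers.  Comparing the
-O2 assembly of the two against the plain memcpy version should show
whether the wider loads and stores come back on a given target.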

Richard.

> Thanks,
>
> Walter
>
