------- Comment #7 from rguenth at gcc dot gnu dot org  2009-09-30 08:59 -------
Hm, on x86_64 with -O3 -funroll-loops I see

f:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L2:
        movdqu  (%rsi,%rax), %xmm7
        movdqu  %xmm7, (%rdi,%rax)
        movdqu  16(%rsi,%rax), %xmm6
        movdqu  %xmm6, 16(%rdi,%rax)
        movdqu  32(%rsi,%rax), %xmm5
        movdqu  %xmm5, 32(%rdi,%rax)
        movdqu  48(%rsi,%rax), %xmm4
        movdqu  %xmm4, 48(%rdi,%rax)
        movdqu  64(%rsi,%rax), %xmm3
        movdqu  %xmm3, 64(%rdi,%rax)
        movdqu  80(%rsi,%rax), %xmm2
        movdqu  %xmm2, 80(%rdi,%rax)
        movdqu  96(%rsi,%rax), %xmm1
        movdqu  %xmm1, 96(%rdi,%rax)
        movdqu  112(%rsi,%rax), %xmm0
        movdqu  %xmm0, 112(%rdi,%rax)
        subq    $-128, %rax
        cmpq    $4096, %rax
        jne     .L2
        rep
        ret

which looks pretty optimal to me ...

If you disable vectorization then restrict doesn't make a difference anymore
because of TARGET_MEM_REFs:

<bb 3>:
  # i_15 = PHI <i_10(3), 0(2)>
  D.2707_9 = MEM[base: s_7(D), index: i_15, step: 4];
  MEM[base: t_4(D), index: i_15, step: 4] = D.2707_9;
  i_10 = i_15 + 1;
  if (i_10 != 1024)
    goto <bb 3>;

The MEM_EXPRs we get from expansion do not have points-to information
anymore.  It's possible to fix that, though it might not be without
losses elsewhere (TMRs suck).
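
For reference, a minimal sketch of the kind of loop being discussed. It is
reconstructed from the dumps above, not copied from the PR's testcase, so the
exact signature is an assumption: the name f and the parameter names t and s
come from the assembly label and the GIMPLE dump, the 1024 trip count from
the "i_10 != 1024" exit test, and the int element type from "step: 4". The
restrict qualifiers and the const on the source pointer are guesses.

```c
#include <stddef.h>

/* Trip count taken from the GIMPLE exit test (i_10 != 1024);
   1024 ints * 4 bytes = 4096, matching "cmpq $4096, %rax". */
#define N 1024

/* Assumed shape of the function under discussion (hypothetical
   signature). restrict promises that t and s do not overlap, which
   lets the compiler reorder the loads and stores freely. */
void f(int *restrict t, const int *restrict s)
{
    for (size_t i = 0; i < N; i++)
        t[i] = s[i];
}
```

Compiling this with "gcc -O3 -funroll-loops -S" should give a vectorized loop
along the lines of the one quoted above, and adding -fno-tree-vectorize
disables the vectorizer for the comparison. The exact output will depend on
the GCC version and target.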

Well, I'll try to have a look here during stage3.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2009-09-29 21:18:52         |2009-09-30 08:59:39
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22031
