On Mon, Aug 8, 2011 at 10:11 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Mon, Aug 8, 2011 at 5:30 PM, Ulrich Weigand <uweig...@de.ibm.com> wrote:
>> Uros Bizjak wrote:
>>
>>> Although, it would be nice for reload to subsequently fix CSE'd
>>> non-offsetable memory by copying address to temporary reg (*as said in
>>> the documentation*), we could simply require an XMM temporary for
>>> TImode reloads to/from integer registers, and this fixes ICE for x32.
>>
>> Moves are special as far as reload is concerned.  If there is already
>> a move instruction present *before* reload, it will get fixed up
>> according to its constraints as any other instruction.
>>
>> However, reload will *introduce* new moves as part of its operation,
>> and those will *not* themselves get reloaded.  Instead, reload simply
>> assumes that every plain move will just succeed without requiring
>> any reload; if this is not true, the target *must* provide a
>> secondary reload for this move.
>>
>> (Note that the secondary reload could also work by reloading the
>> target address into a temporary; that's up to the target to
>> implement.)
>
> Whoa, indeed.
>
> Using attached patch that reloads memory address instead of going
> through XMM register, the code for the testcase improves from:
>
> test:
> .LFB0:
>         .cfi_startproc
>         pushq   %rbx
>         .cfi_def_cfa_offset 16
>         .cfi_offset 3, -16
>         sall    $4, %esi
>         addl    %edi, %esi
>         movdqa  (%esi), %xmm0
>         movdqa  %xmm0, -16(%rsp)
>         movq    -16(%rsp), %rcx
>         movq    -8(%rsp), %rbx
>         addq    $1, %rcx
>         adcq    $0, %rbx
>         movq    %rcx, -16(%rsp)
>         sall    $4, %edx
>         movq    %rbx, -8(%rsp)
>         movdqa  -16(%rsp), %xmm0
>         movdqa  %xmm0, (%esi)
>         pxor    %xmm0, %xmm0
>         movdqa  %xmm0, (%edx,%esi)
>         popq    %rbx
>         .cfi_def_cfa_offset 8
>         ret
>
> to:
>
> test:
> .LFB0:
>         .cfi_startproc
>         sall    $4, %esi
>         pushq   %rbx
>         .cfi_def_cfa_offset 16
>         .cfi_offset 3, -16
>         addl    %edi, %esi
>         pxor    %xmm0, %xmm0
>         mov     %esi, %eax
>         movq    (%rax), %rcx
>         movq    8(%rax), %rbx
>         addq    $1, %rcx
>         adcq    $0, %rbx
>         sall    $4, %edx
>         movq    %rcx, (%rax)
>         movq    %rbx, 8(%rax)
>         movdqa  %xmm0, (%edx,%esi)
>         popq    %rbx
>         .cfi_def_cfa_offset 8
>         ret
>
> H.J., can you please test attached patch? This optimization won't
> trigger on x86_64 anymore.
>
I will test it.  Thanks.

--
H.J.