Hi Ramana,

I tried the trunk version with and without your patch. It still produces
the same code as GCC 4.2.2 does. In auto-inc-dec.c, the comments say

    *a
    ...
    a <- a + c

  becomes

    *(a += c) post

But the problem is that after the Tree-SSA passes there is no

    a <- a + c

but something like

    a_1 <- a + c

Unless auto-inc-dec.c can turn a_1 <- a + c back into a <- a + c, I don't
see how this transformation is applicable in most scenarios.

Any comments?

Cheers,
Bingfeng

-----Original Message-----
From: Ramana Radhakrishnan [mailto:[EMAIL PROTECTED]
Sent: 02 November 2007 12:39
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Tree-SSA and POST_INC address mode inompatible in GCC4?

Hi Bingfeng,

On 11/2/07, Bingfeng Mei <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I looked at the following code to see what the difference is between
> GCC4 and GCC3 in using POST_INC address mode (or other similar modes).
>
> void tst(char * __restrict__ a, char * __restrict__ b){
>   *a++ = *b++;
>   *a++ = *b++;
>   *a++ = *b++;
>   *a++ = *b++;
>   *a++ = *b++;
>   *a++ = *b++;
>   *a = *b;
> }

We have seen this in a number of other ports as well - I had hacked up
a patch to sort this precise problem out, but that was for trunk / 4.3
and is not applicable to 4.2.x since the autoincrement detector was
rewritten post 4.2.

http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01060.html

I haven't yet had time to rework this based on the comments, but it
surely is on my radar of things to do.

cheers
Ramana

>
>
> Using ARM processor as a target, GCC4.2.2 generates the following
> assembly:
> tst:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         mov     r2, r1
>         ldrb    ip, [r2], #1    @ zero_extendqisi2
>         mov     r3, r0
>         strb    ip, [r3], #1
>         ldrb    r1, [r1, #1]    @ zero_extendqisi2
>         strb    r1, [r0, #1]
>         ldrb    r1, [r2, #1]    @ zero_extendqisi2
>         strb    r1, [r3, #1]
>         add     r2, r2, #1
>         ldrb    r1, [r2, #1]    @ zero_extendqisi2
>         add     r3, r3, #1
>         strb    r1, [r3, #1]
>         add     r2, r2, #1
>         ldrb    r1, [r2, #1]    @ zero_extendqisi2
>         add     r3, r3, #1
>         strb    r1, [r3, #1]
>         add     r2, r2, #1
>         ldrb    r1, [r2, #1]    @ zero_extendqisi2
>         add     r3, r3, #1
>         strb    r1, [r3, #1]
>         ldrb    r2, [r2, #2]    @ zero_extendqisi2
>         @ lr needed for prologue
>         strb    r2, [r3, #2]
>         bx      lr
>         .size   tst, .-tst
>         .ident  "GCC: (GNU) 4.2.2"
>
> And GCC3.4.6 generates much better code by using POST_INC address mode
> extensively:
>
> tst:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1], #1    @ zero_extendqisi2
>         strb    r3, [r0], #1
>         ldrb    r3, [r1, #0]    @ zero_extendqisi2
>         @ lr needed for prologue
>         strb    r3, [r0, #0]
>         mov     pc, lr
>         .size   tst, .-tst
>         .ident  "GCC: (GNU) 3.4.6"
>
> I looked at the dumped tst.c.102t.final_cleanup:
> tst (a, b)
> {
>   char * restrict a.54;
>   char * restrict a.53;
>   char * restrict a.52;
>   char * restrict a.51;
>   char * restrict a.50;
>   char * restrict b.48;
>   char * restrict b.47;
>   char * restrict b.46;
>   char * restrict b.45;
>   char * restrict b.44;
>
> <bb 2>:
>   *a = *b;
>   a.50 = a + 1B;
>   b.44 = b + 1B;
>   *a.50 = *b.44;
>   a.51 = a.50 + 1B;
>   b.45 = b.44 + 1B;
>   *a.51 = *b.45;
>   a.52 = a.51 + 1B;
>   b.46 = b.45 + 1B;
>   *a.52 = *b.46;
>   a.53 = a.52 + 1B;
>   b.47 = b.46 + 1B;
>   *a.53 = *b.47;
>   a.54 = a.53 + 1B;
>   b.48 = b.47 + 1B;
>   *a.54 = *b.48;
>   *(a.54 + 1B) = *(b.48 + 1B);
>   return;
>
> }
>
> I believe it is a fundamental issue for the Tree-SSA IR. POST_INC address
> mode requires a pattern in which the same variable is incremented
> (both USE and DEF), while the SSA form produces a different variable for
> each DEF. Therefore, GCC4 cannot efficiently use POST_INC and other
> similar address modes. Is there any solution to overcome this problem?
> Any suggestion is greatly appreciated.
>
>
> Bingfeng Mei
> Broadcom UK
>
>

--
Ramana Radhakrishnan
GNU Tools
Celunite Inc.
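
For readers following the thread, the mismatch discussed above can be sketched in plain C. This is only an illustration, not code from GCC or from either poster: the function names copy7_postinc, copy7_ssa_like, copies_agree and self_check, and the temporaries a1..a5 / b1..b5, are hypothetical. The first function keeps the "a <- a + c" shape, where one pointer is both used and redefined; the second imitates the final_cleanup dump by hand, where every increment defines a fresh name. Both copy the same seven bytes, so the difference is purely one of shape.

```c
#include <assert.h>
#include <string.h>

/* Source-level shape: each pointer is both used and redefined by the
   increment (a <- a + 1), the pattern a POST_INC address matches. */
void copy7_postinc(char *restrict a, const char *restrict b)
{
    *a++ = *b++;
    *a++ = *b++;
    *a++ = *b++;
    *a++ = *b++;
    *a++ = *b++;
    *a++ = *b++;
    *a = *b;
}

/* Hand-written imitation of the SSA-style dump: every increment
   defines a new name (a1 <- a + 1), so no single variable is both
   the base of the access and the target of the add.
   The names a1..a5 and b1..b5 are hypothetical. */
void copy7_ssa_like(char *restrict a, const char *restrict b)
{
    char *a1, *a2, *a3, *a4, *a5;
    const char *b1, *b2, *b3, *b4, *b5;

    *a = *b;
    a1 = a + 1;   b1 = b + 1;   *a1 = *b1;
    a2 = a1 + 1;  b2 = b1 + 1;  *a2 = *b2;
    a3 = a2 + 1;  b3 = b2 + 1;  *a3 = *b3;
    a4 = a3 + 1;  b4 = b3 + 1;  *a4 = *b4;
    a5 = a4 + 1;  b5 = b4 + 1;  *a5 = *b5;
    *(a5 + 1) = *(b5 + 1);
}

/* Returns 1 when both forms copy all seven source bytes. */
int copies_agree(void)
{
    const char src[7] = {'P', 'O', 'S', 'T', 'I', 'N', 'C'};
    char d1[7] = {0};
    char d2[7] = {0};

    copy7_postinc(d1, src);
    copy7_ssa_like(d2, src);
    return memcmp(d1, src, 7) == 0 && memcmp(d2, src, 7) == 0;
}

/* Aborts if the two forms ever disagree. */
void self_check(void)
{
    assert(copies_agree());
}
```

Compiling the two functions for a target with post-indexed loads and stores and comparing the assembly is one way to see whether a given compiler version recovers the post-increment form from the second shape; the result depends on the compiler and is not asserted here.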