On Thu, Dec 8, 2011 at 12:34 PM, Bingfeng Mei <b...@broadcom.com> wrote:
> Hi,
> I experienced a code generation bug with 4.5 (yes, our
> port is still stuck at 4.5.4). Since the concerned code
> is full of our target-specific code, it is not easy
> to demonstrate the error with x86 or ARM.
>
> Here is what happens in expanding process. The following is a
> piece of optimized tree code to be expanded to RTL.
>
>   # ptr_h2_493 = PHI <ptr_h2_310(30), ptr_hf_465(29)>
>   ...
>   D.13598_218 = MEM[base: ptr_h2_493, offset: 8];
>   D.13599_219 = (long int) D.13598_218;
>   ...
>   ptr_h2_310 = ptr_h2_493 + 16;
>   ...
>   D.13634_331 = D.13599_219 * D.13538_179;
>   cor3_332 = D.13635_339 + D.13634_331;
>   ...
>
> When expanding to RTL, the coalescing algorithm will coalesce
> ptr_h2_310 & ptr_h2_493 to one register:
>
> ;; ptr_h2_310 = ptr_h2_493 + 16;
> (insn 364 363 0 (set (reg/v/f:SI 282 [ ptr_h2 ])
>         (plus:SI (reg/v/f:SI 282 [ ptr_h2 ])
>             (const_int 16 [0x10]))) -1 (nil))
>
> GCC 4.5 (fp_gcc 2.3.x) doesn't expand statements one-by-one
> as GCC 4.4 (fp_gcc 2.2.x) does. So when GCC expands the
> following statement,
>
>   cor3_332 = D.13635_339 + D.13634_331;
>
> it then in turn expands each operand by going back to
> expand previous relevant statements.
>
>   D.13598_218 = MEM[base: ptr_h2_493, offset: 8];
>   D.13599_219 = (long int) D.13598_218;
>   ...
>   D.13634_331 = D.13599_219 * D.13538_179;
>
> The problem is that compiler doesn't take account into fact that
> ptr_h2_493|ptr_h2_310 has been modified. Still expand the above
> statement as it is.
>
> (insn 380 379 381 (set (reg:HI 558)
>         (mem:HI (plus:SI (reg/v/f:SI 282 [ ptr_h2 ])
>                 (const_int 8 [0x8])) [0 S2 A8])) -1 (nil))
> ...
> (insn 382 381 383 (set (reg:SI 557)
>         (mult:SI (sign_extend:SI (reg:HI 558))
>             (sign_extend:SI (reg:HI 559)))) -1 (nil))
>
> This seems to me quite a basic issue. I cannot believe testsuites
> and other applications do not expose more errors.
>
> What I am not sure is whether the coalescing algorithm or the expanding
> procedure is wrong here. If ptr_h2_493 and ptr_h2_310 are not coalesced
> to use the same register, it should be correctly compiled. Or expanding
> procedure checks data flow, it should be also OK. Which one should I
> I look at? Or is this a known issue and fixed in 4.6/4.7?

TER (temporary expression replacement) should not happen for

  D.13598_218 = MEM[base: ptr_h2_493, offset: 8];

because forwarding that load to its use conflicts with the coalescing
of ptr_h2_310 and ptr_h2_493. Thus, -fno-tree-ter should fix your
issue. You may look at the -fdump-rtl-expand-details dump to learn
about the coalescing decisions.

I'm not sure we fixed a bug that looks like the above. With 4.5 the
'MEM' is a TARGET_MEM_REF tree. Micha should be most familiar with
how this code has evolved.

Richard.

> Thanks,
> Bingfeng Mei
>
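
For readers who want to see the hazard outside of the dumps, here is a
minimal C model of the two orderings discussed above. It is only a sketch:
the function names are hypothetical, the code is not taken from the port in
question, and it assumes the 16-bit load (mem:HI) and the 8- and 16-byte
offsets shown in the quoted RTL. The first function loads and then bumps the
pointer, as the tree code intends; the second bumps the shared pointer first
and only then performs the deferred load, which is what the bad expansion
amounts to once both SSA names live in one register.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only; all names here are hypothetical. */

/* Intended order: load the 16-bit value at byte offset 8, then advance
   the pointer by 16 bytes (8 int16_t elements). */
static long mac_load_then_bump(const int16_t **pp, long coef, long acc)
{
    const int16_t *p = *pp;
    long v = (long) p[4];        /* MEM[base: ptr, offset: 8] */
    *pp = p + 8;                 /* ptr = ptr + 16 */
    return acc + v * coef;
}

/* Order in the bad expansion: the coalesced pointer register has already
   been bumped when the deferred load is finally emitted. */
static long mac_bump_then_load(const int16_t **pp, long coef, long acc)
{
    const int16_t *p = *pp + 8;  /* ptr = ptr + 16 happens first */
    *pp = p;
    long v = (long) p[4];        /* reads byte offset 24 from the old base */
    return acc + v * coef;
}

/* Returns 1 when the two orderings disagree on the same input. */
int ter_hazard_demo(void)
{
    int16_t data[16];
    for (int i = 0; i < 16; i++)
        data[i] = (int16_t) (i + 1);

    const int16_t *a = data;
    const int16_t *b = data;
    long good = mac_load_then_bump(&a, 3, 0);
    long bad = mac_bump_then_load(&b, 3, 0);

    assert(a == b);              /* both leave the pointer at data + 8 */
    return good != bad;
}
```

Calling ter_hazard_demo() runs both orderings on the same buffer; the pointer
ends up in the same place either way, but the loaded element differs, which
is why the replacement is only safe if nothing redefines the coalesced
register between the load and its use.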