http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
William J. Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #37 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2011-11-17 15:15:49 UTC ---
Using Pat's reduced test case, I found that the problem occurs when the back arc along 7->7 (the innermost loop) is split by the following:

Inserting a partition copy on edge BB7->BB7 :PART.153 = PART.45

Coalescing previously determined the following for these partitions:

Partition 45 (prephitmp.35_113 - 113 222 225 343 )
Partition 153 (prephitmp.35_322 - 322 )

Below, I've cut down the tree dump at the time of coalescing to show just the statements involving prephitmp.35 and control flow, except for block 7 which I've reproduced in its entirety.

The problem can be seen in block 7, where there are two PHIs with the identical RHS:

  # prephitmp.35_322 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>
  # prephitmp.35_225 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>

225 is coalesced with 113, but 322 is not. From what I can tell, 322 should be equally as good a candidate for coalescing as 225. If the duplicates had been removed, it seems the existing coalescing algorithm would have avoided inserting the partition copy that created the extra block.

I'm thinking these kinds of duplicate phis should be cleaned up before we get to expand. Is that already the intent, and this is just a bug, or is that something that needs to be implemented somewhere in the late tree phases? Alternatively, should coalesce have done better here?

For what it's worth, here are the origins of the duplicate PHIs in block 7.
The first PHI is introduced in 094t.pre:

  # prephitmp.35_322 = PHI <zlvj.5_59(9), cikve.14_113(12)>

This is changed as follows in 099t.copyprop4:

  # prephitmp.35_322 = PHI <zlvj.5_59(6), cikve.14_113(8)>
  # cikve_lsm.48_225 = PHI <zlvj.5_59(6), cikve.14_113(8)>

Finally, these are renamed in 138t.copyrename4:

  # prephitmp.35_322 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>
  # prephitmp.35_225 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>

======================================================================

thin6d (integer(kind=4) & restrict nthinerr)
{
  ...

  # BLOCK 5 freq:2800
  # PRED: 4 [100.0%]  (fallthru,exec) 9 [86.0%]  (dfs_back,false,exec)
  # prephitmp.35_221 = PHI <prephitmp.35_284(4), prephitmp.35_228(9)>
  ...
  prephitmp.35_32 = D.2028_14 * pretmp.34_287 + D.2038_31;
  ...
  prephitmp.35_59 = D.2036_27 * pretmp.34_287 + D.2189_18;
  D.2050_70 = prephitmp.35_32 * pretmp.37_297 + pretmp.37_295;
  prephitmp.42_77 = prephitmp.35_59 * pretmp.37_299 + D.2050_70;
  D.2190_76 = -prephitmp.35_59;
  ...
  prephitmp.42_95 = prephitmp.35_32 * pretmp.37_299 + D.2057_88;
  if (nmz.1_3 > 2)
    goto <bb 6>;
  else
    goto <bb 9>;
  # SUCC: 6 [50.0%]  (true,exec) 9 [50.0%]  (false,exec)

  # BLOCK 6 freq:1400
  # PRED: 5 [50.0%]  (true,exec)
  ...
  # SUCC: 7 [100.0%]  (fallthru,exec)

  # BLOCK 7 freq:10000
  # PRED: 6 [100.0%]  (fallthru,exec) 7 [86.0%]  (dfs_back,false,exec)
  # prephitmp.35_321 = PHI <prephitmp.35_32(6), prephitmp.35_106(7)>
  # prephitmp.35_322 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>
  # prephitmp.42_325 = PHI <prephitmp.42_77(6), prephitmp.42_133(7)>
  # prephitmp.42_326 = PHI <prephitmp.42_95(6), prephitmp.42_152(7)>
  # prephitmp.35_229 = PHI <prephitmp.35_32(6), prephitmp.35_203(7)>
  # prephitmp.35_225 = PHI <prephitmp.35_59(6), prephitmp.35_113(7)>
  # ivtmp.56_220 = PHI <ivtmp.56_219(6), ivtmp.56_227(7)>
  # ivtmp.62_218 = PHI <ivtmp.62_290(6), ivtmp.62_217(7)>
  D.2067_105 = prephitmp.35_59 * prephitmp.35_322;
  D.2191_94 = -D.2067_105;
  prephitmp.35_106 = prephitmp.35_32 * prephitmp.35_321 + D.2191_94;
  D.2070_112 = prephitmp.35_32 * prephitmp.35_225;
  prephitmp.35_113 = prephitmp.35_59 * prephitmp.35_229 + D.2070_112;
  ivtmp.56_227 = ivtmp.56_220 + 8;
  D.2155_289 = (void *) ivtmp.56_227;
  D.2075_120 = MEM[base: D.2155_289, offset: 0B];
  D.2078_124 = prephitmp.35_106 * D.2075_120 + prephitmp.42_325;
  ivtmp.62_217 = ivtmp.62_218 + 8;
  D.2156_283 = (void *) ivtmp.62_217;
  D.2079_130 = MEM[base: D.2156_283, offset: 0B];
  prephitmp.42_133 = prephitmp.35_113 * D.2079_130 + D.2078_124;
  D.2192_132 = -prephitmp.35_113;
  D.2084_143 = D.2192_132 * D.2075_120 + prephitmp.42_326;
  prephitmp.42_152 = prephitmp.35_106 * D.2079_130 + D.2084_143;
  prephitmp.35_203 = prephitmp.35_106;
  if (ivtmp.56_227 == D.2161_30)
    goto <bb 8>;
  else
    goto <bb 7>;
  # SUCC: 8 [14.0%]  (true,exec) 7 [86.0%]  (dfs_back,false,exec)

  # BLOCK 8 freq:1400
  # PRED: 7 [14.0%]  (true,exec)
  ...
  # SUCC: 9 [100.0%]  (fallthru,exec)

  # BLOCK 9 freq:2800
  # PRED: 5 [50.0%]  (false,exec) 8 [100.0%]  (fallthru,exec)
  ...
  # prephitmp.35_223 = PHI <prephitmp.35_32(5), prephitmp.35_106(8)>
  # prephitmp.35_222 = PHI <prephitmp.35_59(5), prephitmp.35_113(8)>
  ...
  # prephitmp.35_228 = PHI <prephitmp.35_221(5), prephitmp.35_106(8)>
  ...
  if (ivtmp.74_156 == D.2188_100)
    goto <bb 10>;
  else
    goto <bb 5>;
  # SUCC: 10 [14.0%]  (true,exec) 5 [86.0%]  (dfs_back,false,exec)

  # BLOCK 10 freq:392
  # PRED: 9 [14.0%]  (true,exec)
  # prephitmp.35_343 = PHI <prephitmp.35_222(9)>
  # prephitmp.35_344 = PHI <prephitmp.35_223(9)>
  ...
  # prephitmp.35_347 = PHI <prephitmp.35_228(9)>
  ...
  # prephitmp.35_350 = PHI <prephitmp.35_59(9)>
  # prephitmp.35_351 = PHI <prephitmp.35_32(9)>
  ...
  xlvj = prephitmp.35_351;
  zlvj = prephitmp.35_350;
  ...
  crkve = prephitmp.35_344;
  cikve = prephitmp.35_343;
  ...
  crkveuk = prephitmp.35_347;
  goto <bb 3>;
  # SUCC: 3 [100.0%]  (fallthru,exec)

}

======================================================================
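
To make the suggested cleanup concrete, here is a minimal sketch of detecting duplicate PHIs in a single block, written in Python rather than against GCC internals. The function name find_duplicate_phis and the (result, [(argument, predecessor block), ...]) representation are hypothetical and chosen only for illustration; this is not how any existing GCC pass is implemented. It treats two PHIs as duplicates only when they have the same argument on every incoming edge.

```python
def find_duplicate_phis(phis):
    """Map each redundant PHI result to the first PHI with identical arguments.

    phis: list of (result, [(arg, pred_block), ...]) for one basic block.
    Hypothetical representation, used only for this sketch.
    """
    seen = {}          # normalized argument tuple -> first result seen
    replacements = {}  # redundant result -> result it could be replaced by
    for result, args in phis:
        # Order arguments by predecessor block so edge order does not matter.
        key = tuple(sorted(args, key=lambda a: a[1]))
        if key in seen:
            replacements[result] = seen[key]
        else:
            seen[key] = result
    return replacements


# The prephitmp PHIs of block 7, copied from the dump above.
block7_phis = [
    ("prephitmp.35_321", [("prephitmp.35_32", 6), ("prephitmp.35_106", 7)]),
    ("prephitmp.35_322", [("prephitmp.35_59", 6), ("prephitmp.35_113", 7)]),
    ("prephitmp.42_325", [("prephitmp.42_77", 6), ("prephitmp.42_133", 7)]),
    ("prephitmp.42_326", [("prephitmp.42_95", 6), ("prephitmp.42_152", 7)]),
    ("prephitmp.35_229", [("prephitmp.35_32", 6), ("prephitmp.35_203", 7)]),
    ("prephitmp.35_225", [("prephitmp.35_59", 6), ("prephitmp.35_113", 7)]),
]

dups = find_duplicate_phis(block7_phis)
for redundant, kept in dups.items():
    print(redundant, "duplicates", kept)
```

On the block 7 data this reports only prephitmp.35_225 as a duplicate of prephitmp.35_322. The pair 321/229 is not reported, because 229 takes prephitmp.35_203 on the back edge and a purely syntactic comparison does not look through the copy prephitmp.35_203 = prephitmp.35_106.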