------- Comment #33 from rguenth at gcc dot gnu dot org 2009-10-18 13:22 -------
It looks like basic-block frequencies are completely off. The BB in question is

  # BLOCK 7 freq:3
  # PRED: 6 [100.0%]  (fallthru,exec) 7 [99.0%]  (false,exec)
  # ivtmp.65_38 = PHI <ivtmp.65_113(6), ivtmp.65_129(7)>
  # ivtmp.68_147 = PHI <ivtmp.68_151(6), ivtmp.68_148(7)>
  D.1360_26 = MEM[index: ivtmp.65_38];
  D.1404_30 = pow (D.1360_26, 1.5e+0);
  MEM[index: ivtmp.68_147] = D.1404_30;
  ivtmp.65_129 = ivtmp.65_38 + 1200;
  ivtmp.68_148 = ivtmp.68_147 + 1200;
  if (ivtmp.77_32 == ivtmp.65_129)
    goto <bb 8>;
  else
    goto <bb 7>;
  # SUCC: 8 [1.0%]  (true,exec) 7 [99.0%]  (false,exec)

And 3 is lower than 11, the minimum frequency at which a BB is considered
not cold.

Predictions for bb 7
  DS theory heuristics (ignored): 0.1%
  first match heuristics: 1.0%
  combined heuristics: 1.0%
  opcode values nonequal (on trees) heuristics (ignored): 28.0%
  loop branch heuristics (ignored): 14.0%
  guessed loop iterations heuristics: 1.0%

But I see that most blocks do not have a frequency at all, and I also see

  # BLOCK 17 freq:10000
  # PRED: 16 [100.0%]  (fallthru,exec) 17 [99.0%]  (false,exec)
  # ivtmp.16_116 = PHI <ivtmp.16_125(16), ivtmp.16_115(17)>
  MEM[index: ivtmp.16_116] = dtd_56(D);
  ivtmp.16_115 = ivtmp.16_116 + 1200;
  if (ivtmp.27_12 == ivtmp.16_115)
    goto <bb 18>;
  else
    goto <bb 17>;
  # SUCC: 18 [1.0%]  (true,exec) 17 [99.0%]  (false,exec)

which is the block with the highest frequency (the innermost loop of the
2nd nest).

I can imagine that a lot of inlining, which exposes very deeply nested loops
alongside very hot not-so-deep loops, can cause them to become artificially
cold.

Interestingly, the outermost loop blocks do not have any frequency assigned
(that probably means zero).


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-10-18 13:22:22
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40106
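
As a rough illustration of the "artificially cold" effect described in the comment above, here is a toy model in Python. It is not GCC's frequency estimator. It only assumes that every loop back edge is taken with the 99% probability shown in the dumps, that frequencies are scaled so the hottest block gets 10000 (the value seen for block 17), and that 11 is the not-cold threshold quoted in the comment. The function names and the nest depths 2 and 4 are hypothetical, chosen only for the example.

```python
FREQ_MAX = 10000     # frequency of the hottest block in the dump (block 17)
NOT_COLD_MIN = 11    # threshold quoted in the comment; taken as a given here


def expected_iterations(back_edge_percent):
    """Expected trips through a loop body per loop entry when the back edge
    is taken with the given probability (in percent)."""
    return 100 / (100 - back_edge_percent)


def nest_weight(depth, back_edge_percent=99):
    """Unscaled weight of the innermost block of a nest `depth` loops deep."""
    return expected_iterations(back_edge_percent) ** depth


def scaled_freq(weight, hottest_weight):
    """Scale a weight so that the hottest block ends up at FREQ_MAX."""
    return round(weight * FREQ_MAX / hottest_weight)


def is_cold(freq):
    """Apply the threshold from the comment (assumption, not GCC's code)."""
    return freq < NOT_COLD_MIN


shallow = nest_weight(2)   # hypothetical 2-deep nest
deep = nest_weight(4)      # hypothetical 4-deep nest
hottest = max(shallow, deep)

for name, weight in (("shallow", shallow), ("deep", deep)):
    freq = scaled_freq(weight, hottest)
    print(name, freq, "cold" if is_cold(freq) else "not cold")
```

Under these assumptions the innermost block of the 2-deep nest is scaled down to a frequency of 1, even though it would run 10000 times per function entry, simply because the 4-deep nest claims the top of the scale.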