The branch has regressed in terms of memory and compile time. However, I think the regressions are localized.

I compiled tuples and [EMAIL PROTECTED] with --disable-checking --enable-gather-detailed-mem-stats. I ran cc1 and cc1plus over the preprocessed code of cc1-i-files, FF3D, SPEC2000, TRAMP3D, PR 12850, PR 15855 and PR 30089. I had to disable PRE and FRE on both trunk and tuples because trunk wouldn't bootstrap if I only disabled the same bits that are disabled in tuples.

The attached is a comparison using the compile and memory figures emitted by -ftime-report. The "Before" column corresponds to [EMAIL PROTECTED], the "After" column corresponds to [EMAIL PROTECTED]. The times and memory figures are aggregated over ALL the files in the collection I described above.

Some preliminary notes:

- The branch is 25 secs slower than mainline (2%). Out of those 25 seconds, 15.31 seconds are in tree DSE, which is puzzling. The other 10 seconds seem to be going to RTL DSE1 (3 secs), tree CCP (2 secs) and various others more evenly distributed. In fact, if you take away tree DSE, the tree passes are faster on tuples.

- Memory utilization is tricky to check. The difference is almost totally going to the gimplifier and RTL expansion (about 900Kb of the 1.3Mb difference). These two have trees and gimple coexisting for a short period of time, particularly in the gimplifier. However, the trees that are used in RTL expansion are built and destroyed on the spot, so that should not affect peak memory use, but the report doesn't let me distinguish that.

- The rest of the memory utilization difference is mostly in inlining (240Kb) and SSA update (50Kb).

I think the main focus points should be DSE and trying to get a good way of measuring the memory utilization differences. Jan, any suggestion?

Thanks. Diego.

                              Time                        Memory
Phase               Before    After  % change     Before      After  % change
garbage collectio    54.90    56.01     2.02%          0          0  -100.00%
callgraph constru     7.04     7.58     7.67%     320099     205154   -35.91%
callgraph optimiz    12.01     9.98   -16.90%     274447     269807    -1.69%
ipa reference         1.46     1.14   -21.92%       1159       1514    30.63%
cfg cleanup          19.91    23.85    19.79%      49290      49184    -0.22%
trivially dead co     5.53     4.97   -10.13%          0          0  -100.00%
df reaching defs     11.13    11.74     5.48%          0          0  -100.00%
df live regs         25.94    26.95     3.89%          0          0  -100.00%
df reg dead/unuse     9.11     9.31     2.20%     152248     153839     1.05%
alias analysis        9.08     8.70    -4.19%     229345     229018    -0.14%
register scan         1.86     1.87     0.54%       1266       2053    62.16%
preprocessing        37.94    37.10    -2.21%     777544     776711    -0.11%
parser              182.56   184.84     1.25%   15463178   15807728     2.23%
name lookup          37.38    37.96     1.55%    1535795    1562059     1.71%
inline heuristics     6.43     4.12   -35.93%      98582      99750     1.18%
integration          31.66    33.25     5.02%    2585635    2824731     9.25%
tree gimplify        35.01    36.41     4.00%    1984557    2770492    39.60%
tree eh               1.77     1.89     6.78%      98729     130155    31.83%
tree CFG construc     6.57     4.31   -34.40%     720404     376314   -47.76%
tree CFG cleanup     17.99    15.96   -11.28%     120599     126975     5.29%
tree VRP             24.91    25.19     1.12%     830490     808712    -2.62%
tree copy propaga     7.41     7.71     4.05%      72143      72407     0.37%
tree find ref. va     1.82     1.58   -13.19%      72720      86277    18.64%
tree PTA              9.56     9.44    -1.26%      99931      97924    -2.01%
tree alias analys     4.47     4.53     1.34%      68751      71085     3.39%
tree call clobber     0.42     0.59    40.48%        887       1084    22.21%
tree flow sensiti     0.26     0.34    30.77%       4119       4397     6.75%
tree memory parti     1.46     1.43    -2.05%        612        538   -12.09%
tree PHI insertio     0.72     0.87    20.83%      11050      14721    33.22%
tree SSA rewrite      8.68     8.78     1.15%     512512     562206     9.70%
tree SSA other        1.86     2.15    15.59%          0       1008   100.00%
tree SSA incremen    15.31    16.84     9.99%     241694     295323    22.19%
tree operand scan    17.85    17.92     0.39%    1191734    1181479    -0.86%
dominator optimiz    23.55    22.96    -2.51%     640181     539452   -15.73%
tree SRA              2.55     2.57     0.78%      61375      66672     8.63%
tree STORE-CCP        2.64     2.32   -12.12%      13339      13020    -2.39%
tree CCP              6.60     8.47    28.33%      65710      66783     1.63%
tree PHI const/co     0.57     0.94    64.91%        305       4796   100.00%
tree split crit e     2.11     2.00    -5.21%     197589     196116    -0.75%
tree reassociatio     4.18     1.98   -52.63%      27810      26479    -4.79%
tree code sinking     1.40     1.33    -5.00%      12805      11777    -8.03%
tree linearize ph     0.59     0.53   -10.17%       1116       1305    16.94%
tree forward prop     2.79     2.94     5.38%      75293     108939    44.69%
tree phiprop          0.19     0.13   -31.58%       3161       3244     2.63%
tree conservative     3.84     3.07   -20.05%        540       2686   100.00%
tree aggressive D     1.05     0.96    -8.57%         10         52   100.00%
tree DSE             47.94    63.25    31.94%       8894       8348    -6.14%
PHI merge             0.25     0.23    -8.00%      11257      11741     4.30%
tree loop bounds      1.25     1.24    -0.80%      19002      18986    -0.08%
loop invariant mo     1.69     2.12    25.44%       9010       9482     5.24%
tree canonical iv     0.73     0.62   -15.07%      13110      13182     0.55%
scev constant pro     1.20     1.11    -7.50%      27378      26964    -1.51%
complete unrollin     4.44     4.34    -2.25%     205782     207906     1.03%
tree iv optimizat     8.34     8.61     3.24%     358903     345112    -3.84%
tree loop init        1.20     1.35    12.50%      24900      26327     5.73%
tree copy headers     2.31     2.23    -3.46%     121190     125617     3.65%
tree SSA to norma     7.66     7.45    -2.74%     233190     288802    23.85%
tree NRV optimiza     0.05     0.06    20.00%         15         43   100.00%
tree rename SSA c     1.24     1.90    53.23%         44         51    15.91%
dominance computa    13.92    13.18    -5.32%          0          0  -100.00%
expand               76.00    75.85    -0.20%    1785602    1980584    10.92%
varconst              4.83     4.76    -1.45%     158261     163395     3.24%
jump                  1.18     0.78   -33.90%      47174      46641    -1.13%
forward prop          6.06     5.84    -3.63%     141222     138674    -1.80%
CSE                  36.16    36.98     2.27%      83407      83559     0.18%
dead store elim1     30.02    33.79    12.56%      84570      85554     1.16%
dead store elim2      9.24     9.13    -1.19%      86389      88339     2.26%
loop analysis         4.21     4.15    -1.43%      40865      40411    -1.11%
CPROP 1               3.19     3.25     1.88%      42484      42457    -0.06%
PRE                  11.49    12.28     6.88%      53341      53737     0.74%
CPROP 2               7.36     7.95     8.02%      64351      63455    -1.39%
bypass jumps          6.29     6.70     6.52%      54337      55215     1.62%
CSE 2                12.44    13.43     7.96%      28327      28417     0.32%
branch prediction     5.45     5.43    -0.37%     101423      97966    -3.41%
combiner             33.79    33.95     0.47%     433896     430690    -0.74%
if-conversion         5.74     5.65    -1.57%      57406      57517     0.19%
regmove               7.19     7.13    -0.83%       7210       7239     0.40%
scheduling           25.83    26.31     1.86%      28054      28653     2.14%
local alloc          19.05    19.31     1.36%     127262     129019     1.38%
global alloc         38.84    38.31    -1.36%     171131     172764     0.95%
reload CSE regs      22.70    22.91     0.93%     250330     250100    -0.09%
thread pro- & epi     3.65     3.84     5.21%      51413      51675     0.51%
if-conversion 2       1.42     1.29    -9.15%      10773      11110     3.13%
peephole 2            2.62     2.60    -0.76%      21257      20995    -1.23%
rename registers      6.02     5.81    -3.49%       1854       1882     1.51%
scheduling 2         25.83    26.31     1.86%      28054      28653     2.14%
machine dep reorg     4.84     4.86     0.41%      10800      10859     0.55%
reorder blocks        5.36     5.57     3.92%     122503     125159     2.17%
reg stack             0.07     0.06   -14.29%        950        952     0.21%
final                 9.47     9.32    -1.58%      15900      16153     1.59%
symout                1.04     0.92   -11.54%      15518      16064     3.52%
tree if-combine       0.12     0.14    16.67%         16         35   100.00%
tree |^ dominator   281.28   295.08     4.91%    7909617    8469092     7.07%
TOTAL              1229.19  1254.67     2.07%   36827494   38214142     3.77%
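
The figures quoted in the preliminary notes can be recomputed from a handful of rows in the table. The following is only a minimal sketch in Python: the ROWS dictionary and the helper names pct_change and deltas are made up for illustration and are not part of whatever script produced the attachment. The numbers are copied from the table, with memory left in the units the report used. Note that the sketch computes a plain relative change, whereas the table appears to cap large increases at 100.00% and to report 0 -> 0 as -100.00%; that is an inference from the rows, not something stated in the message.

```python
# Rows copied from the table above:
# (time before, time after, memory before, memory after).
ROWS = {
    "tree DSE":          (47.94, 63.25, 8894, 8348),
    "dead store elim1":  (30.02, 33.79, 84570, 85554),
    "tree CCP":          (6.60, 8.47, 65710, 66783),
    "tree gimplify":     (35.01, 36.41, 1984557, 2770492),
    "expand":            (76.00, 75.85, 1785602, 1980584),
    "integration":       (31.66, 33.25, 2585635, 2824731),
    "tree SSA incremen": (15.31, 16.84, 241694, 295323),
    "TOTAL":             (1229.19, 1254.67, 36827494, 38214142),
}


def pct_change(before, after):
    """Plain relative change in percent; assumes 'before' is non-zero."""
    return (after - before) / before * 100.0


def deltas(name):
    """Return (time delta in seconds, memory delta) for one phase."""
    t_before, t_after, m_before, m_after = ROWS[name]
    return round(t_after - t_before, 2), m_after - m_before


if __name__ == "__main__":
    for name in ROWS:
        dt, dm = deltas(name)
        print(f"{name:<18}{dt:>8.2f}{dm:>10}")
```

Run as a script, this prints the time and memory delta for each selected phase, which makes it easy to see that tree DSE alone accounts for most of the total slowdown and that the gimplifier plus expand account for most of the memory growth.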