The branch has regressed in terms of memory and compile time.
However, I think the regressions are localized.

I compiled tuples and [EMAIL PROTECTED] with --disable-checking
--enable-gather-detailed-mem-stats.  I ran cc1 and cc1plus over the
preprocessed code of cc1-i-files, FF3D, SPEC2000, TRAMP3D, PR 12850,
PR 15855 and PR 30089.

I had to disable PRE and FRE on both trunk and tuples because trunk
wouldn't bootstrap if I only disabled the same bits that are disabled
in tuples.

Attached is a comparison using the compile time and memory figures
emitted by -ftime-report.  The "Before" column corresponds to
[EMAIL PROTECTED], the "After" column corresponds to [EMAIL PROTECTED].

The times and memory figures are aggregated over ALL the files in the
collection I described above.

Some preliminary notes:

- The branch is 25 seconds slower than mainline (2%).  Out of those
25 seconds, 15.31 seconds are in tree DSE, which is puzzling.  The
other 10 seconds seem to be going to RTL DSE1 (3 seconds), tree CCP
(2 seconds) and various others more evenly distributed.  In fact, if
you take away tree DSE, the tree passes are faster on tuples.

- Memory utilization is tricky to check.  Almost all of the difference
goes to the gimplifier and RTL expansion (about 900Kb of the 1.3Mb
difference).  These two have trees and gimple coexisting for a short
period of time, particularly in the gimplifier.  However, the trees
that are used in RTL expansion are built and destroyed on the spot,
so they should not affect peak memory use, but the report doesn't
let me distinguish that.

- The rest of the memory utilization difference is mostly in inlining
(240Kb) and SSA update (50Kb).
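
The per-pass deltas quoted in the notes above can be re-derived from
the table with a few lines of Python.  This is only a sketch: the
TIMES dictionary holds a handful of (Before, After) time pairs copied
by hand from the attached table, and the helper names (deltas,
worst_regression) are made up for this illustration.  It is not the
script that produced the attachment.

```python
# Illustrative sketch only.  (before, after) times in seconds, copied by
# hand from a few rows of the attached table; names are as printed there.
TIMES = {
    "tree DSE": (47.94, 63.25),
    "dead store elim1": (30.02, 33.79),
    "tree CCP": (6.60, 8.47),
    "cfg cleanup": (19.91, 23.85),
    "jump": (1.18, 0.78),
    "TOTAL": (1229.19, 1254.67),
}


def deltas(times):
    """Return (name, delta_seconds, percent_change) sorted by delta, largest first."""
    rows = []
    for name, (before, after) in times.items():
        delta = round(after - before, 2)
        # Percent change relative to the "Before" figure; undefined if it is zero.
        pct = round(100.0 * (after - before) / before, 2) if before else None
        rows.append((name, delta, pct))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def worst_regression(times):
    """Return the single pass with the largest absolute slowdown, ignoring TOTAL."""
    passes = {name: pair for name, pair in times.items() if name != "TOTAL"}
    return deltas(passes)[0]


if __name__ == "__main__":
    for name, delta, pct in deltas(TIMES):
        print(f"{name:20s} {delta:+8.2f}s {pct:+8.2f}%")
```

Sorting by absolute delta rather than by percentage keeps tiny passes
with large relative swings from hiding the passes that account for
most of the 25 seconds.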

I think the main focus points should be DSE and finding a good way
of measuring the memory utilization differences.  Jan, any
suggestions?


Thanks.  Diego.
                            Time                    Memory
Phase                  Before   After % change  Before   After % change
 ^ garbage collectio   54.90   56.01    2.02%       0       0 -100.00%
 ^ callgraph constru    7.04    7.58    7.67%  320099  205154  -35.91%
 ^ callgraph optimiz   12.01    9.98  -16.90%  274447  269807   -1.69%
 ^ ipa reference        1.46    1.14  -21.92%    1159    1514   30.63%
 ^ cfg cleanup         19.91   23.85   19.79%   49290   49184   -0.22%
 ^ trivially dead co    5.53    4.97  -10.13%       0       0 -100.00%
 ^ df reaching defs    11.13   11.74    5.48%       0       0 -100.00%
 ^ df live regs        25.94   26.95    3.89%       0       0 -100.00%
 ^ df reg dead/unuse    9.11    9.31    2.20%  152248  153839    1.05%
 ^ alias analysis       9.08    8.70   -4.19%  229345  229018   -0.14%
 ^ register scan        1.86    1.87    0.54%    1266    2053   62.16%
 ^ preprocessing       37.94   37.10   -2.21%  777544  776711   -0.11%
 ^ parser             182.56  184.84    1.25% 15463178 15807728    2.23%
 ^ name lookup         37.38   37.96    1.55% 1535795 1562059    1.71%
 ^ inline heuristics    6.43    4.12  -35.93%   98582   99750    1.18%
 ^ integration         31.66   33.25    5.02% 2585635 2824731    9.25%
 ^ tree gimplify       35.01   36.41    4.00% 1984557 2770492   39.60%
 ^ tree eh              1.77    1.89    6.78%   98729  130155   31.83%
 ^ tree CFG construc    6.57    4.31  -34.40%  720404  376314  -47.76%
 ^ tree CFG cleanup    17.99   15.96  -11.28%  120599  126975    5.29%
 ^ tree VRP            24.91   25.19    1.12%  830490  808712   -2.62%
 ^ tree copy propaga    7.41    7.71    4.05%   72143   72407    0.37%
 ^ tree find ref. va    1.82    1.58  -13.19%   72720   86277   18.64%
 ^ tree PTA             9.56    9.44   -1.26%   99931   97924   -2.01%
 ^ tree alias analys    4.47    4.53    1.34%   68751   71085    3.39%
 ^ tree call clobber    0.42    0.59   40.48%     887    1084   22.21%
 ^ tree flow sensiti    0.26    0.34   30.77%    4119    4397    6.75%
 ^ tree memory parti    1.46    1.43   -2.05%     612     538  -12.09%
 ^ tree PHI insertio    0.72    0.87   20.83%   11050   14721   33.22%
 ^ tree SSA rewrite     8.68    8.78    1.15%  512512  562206    9.70%
 ^ tree SSA other       1.86    2.15   15.59%       0    1008  100.00%
 ^ tree SSA incremen   15.31   16.84    9.99%  241694  295323   22.19%
 ^ tree operand scan   17.85   17.92    0.39% 1191734 1181479   -0.86%
 ^ dominator optimiz   23.55   22.96   -2.51%  640181  539452  -15.73%
 ^ tree SRA             2.55    2.57    0.78%   61375   66672    8.63%
 ^ tree STORE-CCP       2.64    2.32  -12.12%   13339   13020   -2.39%
 ^ tree CCP             6.60    8.47   28.33%   65710   66783    1.63%
 ^ tree PHI const/co    0.57    0.94   64.91%     305    4796  100.00%
 ^ tree split crit e    2.11    2.00   -5.21%  197589  196116   -0.75%
 ^ tree reassociatio    4.18    1.98  -52.63%   27810   26479   -4.79%
 ^ tree code sinking    1.40    1.33   -5.00%   12805   11777   -8.03%
 ^ tree linearize ph    0.59    0.53  -10.17%    1116    1305   16.94%
 ^ tree forward prop    2.79    2.94    5.38%   75293  108939   44.69%
 ^ tree phiprop         0.19    0.13  -31.58%    3161    3244    2.63%
 ^ tree conservative    3.84    3.07  -20.05%     540    2686  100.00%
 ^ tree aggressive D    1.05    0.96   -8.57%      10      52  100.00%
 ^ tree DSE            47.94   63.25   31.94%    8894    8348   -6.14%
 ^ PHI merge            0.25    0.23   -8.00%   11257   11741    4.30%
 ^ tree loop bounds     1.25    1.24   -0.80%   19002   18986   -0.08%
 ^ loop invariant mo    1.69    2.12   25.44%    9010    9482    5.24%
 ^ tree canonical iv    0.73    0.62  -15.07%   13110   13182    0.55%
 ^ scev constant pro    1.20    1.11   -7.50%   27378   26964   -1.51%
 ^ complete unrollin    4.44    4.34   -2.25%  205782  207906    1.03%
 ^ tree iv optimizat    8.34    8.61    3.24%  358903  345112   -3.84%
 ^ tree loop init       1.20    1.35   12.50%   24900   26327    5.73%
 ^ tree copy headers    2.31    2.23   -3.46%  121190  125617    3.65%
 ^ tree SSA to norma    7.66    7.45   -2.74%  233190  288802   23.85%
 ^ tree NRV optimiza    0.05    0.06   20.00%      15      43  100.00%
 ^ tree rename SSA c    1.24    1.90   53.23%      44      51   15.91%
 ^ dominance computa   13.92   13.18   -5.32%       0       0 -100.00%
 ^ expand              76.00   75.85   -0.20% 1785602 1980584   10.92%
 ^ varconst             4.83    4.76   -1.45%  158261  163395    3.24%
 ^ jump                 1.18    0.78  -33.90%   47174   46641   -1.13%
 ^ forward prop         6.06    5.84   -3.63%  141222  138674   -1.80%
 ^ CSE                 36.16   36.98    2.27%   83407   83559    0.18%
 ^ dead store elim1    30.02   33.79   12.56%   84570   85554    1.16%
 ^ dead store elim2     9.24    9.13   -1.19%   86389   88339    2.26%
 ^ loop analysis        4.21    4.15   -1.43%   40865   40411   -1.11%
 ^ CPROP 1              3.19    3.25    1.88%   42484   42457   -0.06%
 ^ PRE                 11.49   12.28    6.88%   53341   53737    0.74%
 ^ CPROP 2              7.36    7.95    8.02%   64351   63455   -1.39%
 ^ bypass jumps         6.29    6.70    6.52%   54337   55215    1.62%
 ^ CSE 2               12.44   13.43    7.96%   28327   28417    0.32%
 ^ branch prediction    5.45    5.43   -0.37%  101423   97966   -3.41%
 ^ combiner            33.79   33.95    0.47%  433896  430690   -0.74%
 ^ if-conversion        5.74    5.65   -1.57%   57406   57517    0.19%
 ^ regmove              7.19    7.13   -0.83%    7210    7239    0.40%
 ^ scheduling          25.83   26.31    1.86%   28054   28653    2.14%
 ^ local alloc         19.05   19.31    1.36%  127262  129019    1.38%
 ^ global alloc        38.84   38.31   -1.36%  171131  172764    0.95%
 ^ reload CSE regs     22.70   22.91    0.93%  250330  250100   -0.09%
 ^ thread pro- & epi    3.65    3.84    5.21%   51413   51675    0.51%
 ^ if-conversion 2      1.42    1.29   -9.15%   10773   11110    3.13%
 ^ peephole 2           2.62    2.60   -0.76%   21257   20995   -1.23%
 ^ rename registers     6.02    5.81   -3.49%    1854    1882    1.51%
 ^ scheduling 2        25.83   26.31    1.86%   28054   28653    2.14%
 ^ machine dep reorg    4.84    4.86    0.41%   10800   10859    0.55%
 ^ reorder blocks       5.36    5.57    3.92%  122503  125159    2.17%
 ^ reg stack            0.07    0.06  -14.29%     950     952    0.21%
 ^ final                9.47    9.32   -1.58%   15900   16153    1.59%
 ^ symout               1.04    0.92  -11.54%   15518   16064    3.52%
 ^ tree if-combine      0.12    0.14   16.67%      16      35  100.00%
 ^ tree |^ dominator  281.28  295.08    4.91% 7909617 8469092    7.07%
 ^ TOTAL             1229.19 1254.67    2.07% 36827494 38214142    3.77%
