More performance data:

-O2 -funroll-all-loops vs -O2: geomean +1.1%

               O2     O2 unroll-all-loops
164.gzip       1324   1336     0.94%
175.vpr        1694   1670    -1.44%
176.gcc        2293   2353     2.60%
181.mcf        1772   1793     1.20%
186.crafty     2320   2300    -0.86%
197.parser     1166   1171     0.39%
252.eon        2443   2515     2.93%
253.perlbmk    2410   2250    -6.66%
254.gap        1987   2041     2.68%
255.vortex     2392   2411     0.78%
256.bzip2      1719   1806     5.08%
300.twolf      2288   2436     6.44%

-O3 -flto -fwhole-program vs -O2: geomean +6% (-fwhole-program adds ~1%)

               O2     O3 flto fwhole-program
164.gzip       1324   1318    -0.45%
175.vpr        1694   1717     1.34%
176.gcc        2293   2359     2.88%
181.mcf        1772   1772     0.02%
186.crafty     2320   2526     8.86%
197.parser     1166   1248     7.04%
252.eon        2443   2898    18.59%
253.perlbmk    2410   2323    -3.62%
254.gap        1987   2039     2.58%
255.vortex     2392   2918    21.99%
256.bzip2      1719   1946    13.19%
300.twolf      2288   2342     2.34%

-O2 -flto -fwhole-program vs -O2: geomean +3.4%, mainly from three
programs: vortex, eon and bzip2.

               O2     O2 flto fwhole-program
164.gzip       1324   1313    -0.82%
175.vpr        1694   1659    -2.05%
176.gcc        2293   2300     0.30%
181.mcf        1772   1781     0.52%
186.crafty     2320   2327     0.30%
197.parser     1166   1188     1.92%
252.eon        2443   2664     9.00%
253.perlbmk    2410   2470     2.47%
254.gap        1987   1987    -0.02%
255.vortex     2392   2883    20.53%
256.bzip2      1719   1839     7.00%
300.twolf      2288   2365     3.34%

Thanks,

David

On Mon, Nov 15, 2010 at 5:50 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On Mon, Nov 15, 2010 at 5:39 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> >> > Fortunately linker plugin solves the problem here and this is why I want to
>> >> > have it by default. GCC then can do effectively -fwhole-program for binaries
>> >> > (since linker knows what will be bound elsewhere) and take advantage of
>> >> > visibility((hidden)) hints for shared libraries same way. Most of important
>> >> > shared libraries gets visibility ((hidden)) right.
>> >> >
>> >> > It is sad that LTO w/o linker plugin doesn't give that much benefits.
>> >> > Ideas are welcome here.
>> >>
>> >> Linker feedback will be limited here -- mostly global variable
>> >> aliasing (as I remember only 2/3 spec programs benefit from it), it
>> >> helps. You don't get whole program points-to, whole program mod-ref
>> >> (with context sensitivity), whole program structure layout. The latter
>> >> are the real kickers (in terms of SPEC performance), but promoting LTO
>> >> with those numbers can be misleading as many programs won't get it.
>> >
>> > Well, I am speaking of our linker plugin here. What it does is to pass GCC
>> > resolution information so it knows what symbols are bound externally. Since
>> > typically you link LTO alone or with small non-LTO part, most of symbols are
>> > not bound and thus effectively you get -fwhole-program (-fwhole-program just
>> > declare everything static except for main ())
>> >
>> > We don't really do whole program points-to or structure layout.
>>
>> gcc will eventually, right?
>
> Sure hope so ;)
> We really need to solve scalability with our IPA points-to and make it
> compatible with WHOPR.
>>
>> > Mod-ref is just
>> > simple ipa-reference code. How you get context sensitivity on mod/ref?
>>
>> mod-ref relies on points-to. With context sensitive points-to, you can
>> also get CS mod-ref -- basically mod-ref info per callsite.
>
> Ah sure, I was too focused on our current "mod/ref" :)
>
> Honza
>
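
For anyone who wants to re-check the geomean figures at the top of this message, here is a minimal sketch that recomputes them from the rounded scores in the tables. The list names and the helper geomean_gain are made up for illustration; the message does not say how the original numbers were produced, so this is only one plausible way to derive them.

```python
from math import exp, log

# Scores copied from the tables in this message, in benchmark order
# (164.gzip ... 300.twolf). Higher is better.
O2        = [1324, 1694, 2293, 1772, 2320, 1166, 2443, 2410, 1987, 2392, 1719, 2288]
O2_UNROLL = [1336, 1670, 2353, 1793, 2300, 1171, 2515, 2250, 2041, 2411, 1806, 2436]
O3_LTO_WP = [1318, 1717, 2359, 1772, 2526, 1248, 2898, 2323, 2039, 2918, 1946, 2342]
O2_LTO_WP = [1313, 1659, 2300, 1781, 2327, 1188, 2664, 2470, 1987, 2883, 1839, 2365]


def geomean_gain(base, new):
    """Geometric mean of the new/base score ratios, as a percentage gain.

    Hypothetical helper, not taken from any benchmark tooling.
    """
    ratios = [n / b for b, n in zip(base, new)]
    # Average the logs instead of multiplying the ratios directly.
    mean_log = sum(log(r) for r in ratios) / len(ratios)
    return (exp(mean_log) - 1.0) * 100.0


if __name__ == "__main__":
    for label, scores in [
        ("-O2 -funroll-all-loops", O2_UNROLL),
        ("-O3 -flto -fwhole-program", O3_LTO_WP),
        ("-O2 -flto -fwhole-program", O2_LTO_WP),
    ]:
        print(f"{label} vs -O2: geomean {geomean_gain(O2, scores):+.1f}%")
```

The results should land close to the +1.1%, +6% and +3.4% quoted above. The scores in the tables are rounded, so individual per-benchmark percentages will not always reproduce exactly (181.mcf, for example, shows 1772 vs 1772 with a 0.02% change).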