More performance data:

-O2 -funroll-all-loops vs -O2: geomean +1.1%

Benchmark     -O2    -O2 -funroll-all-loops   Change
164.gzip      1324   1336                     0.94%
175.vpr       1694   1670                     -1.44%
176.gcc       2293   2353                     2.60%
181.mcf       1772   1793                     1.20%
186.crafty    2320   2300                     -0.86%
197.parser    1166   1171                     0.39%
252.eon       2443   2515                     2.93%
253.perlbmk   2410   2250                     -6.66%
254.gap       1987   2041                     2.68%
255.vortex    2392   2411                     0.78%
256.bzip2     1719   1806                     5.08%
300.twolf     2288   2436                     6.44%

-O3 -flto -fwhole-program vs -O2: geomean +6% (-fwhole-program adds ~1%)

Benchmark     -O2    -O3 -flto -fwhole-program   Change
164.gzip      1324   1318                        -0.45%
175.vpr       1694   1717                        1.34%
176.gcc       2293   2359                        2.88%
181.mcf       1772   1772                        0.02%
186.crafty    2320   2526                        8.86%
197.parser    1166   1248                        7.04%
252.eon       2443   2898                        18.59%
253.perlbmk   2410   2323                        -3.62%
254.gap       1987   2039                        2.58%
255.vortex    2392   2918                        21.99%
256.bzip2     1719   1946                        13.19%
300.twolf     2288   2342                        2.34%

-O2 -flto -fwhole-program vs -O2: geomean +3.4%, mainly from three
programs: vortex, eon and bzip2.

Benchmark     -O2    -O2 -flto -fwhole-program   Change
164.gzip      1324   1313                        -0.82%
175.vpr       1694   1659                        -2.05%
176.gcc       2293   2300                        0.30%
181.mcf       1772   1781                        0.52%
186.crafty    2320   2327                        0.30%
197.parser    1166   1188                        1.92%
252.eon       2443   2664                        9.00%
253.perlbmk   2410   2470                        2.47%
254.gap       1987   1987                        -0.02%
255.vortex    2392   2883                        20.53%
256.bzip2     1719   1839                        7.00%
300.twolf     2288   2365                        3.34%
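
The geomean figures can be rechecked from the per-benchmark scores. Here is a minimal sketch in C, assuming the geomean is taken over the per-benchmark ratio of the two score columns. The function and array names are illustrative only, and the n-th root uses a small Newton iteration purely so the file builds without -lm:

```c
#include <stddef.h>

/* Integer power by repeated multiplication (fine for small exponents). */
static double ipow(double x, size_t n)
{
    double r = 1.0;
    while (n-- > 0)
        r *= x;
    return r;
}

/* n-th root of p > 0 (n >= 1) by Newton's method. Starting from 1.0 is an
 * assumption that suits speedup ratios, which sit close to 1. */
static double nth_root(double p, size_t n)
{
    double x = 1.0;
    for (int i = 0; i < 200; i++) {
        double next = ((double)(n - 1) * x + p / ipow(x, n - 1)) / (double)n;
        double diff = next - x;
        x = next;
        if (diff < 1e-15 && diff > -1e-15)
            break;
    }
    return x;
}

/* Geometric-mean change, in percent, of test[] relative to base[]. */
double geomean_gain_percent(const int *base, const int *test, size_t n)
{
    double product = 1.0;
    for (size_t i = 0; i < n; i++)
        product *= (double)test[i] / (double)base[i];
    return (nth_root(product, n) - 1.0) * 100.0;
}

/* Scores copied from the -O2 vs -O2 -funroll-all-loops table. */
static const int o2_scores[12] = {
    1324, 1694, 2293, 1772, 2320, 1166, 2443, 2410, 1987, 2392, 1719, 2288
};
static const int unroll_scores[12] = {
    1336, 1670, 2353, 1793, 2300, 1171, 2515, 2250, 2041, 2411, 1806, 2436
};
```

Calling geomean_gain_percent(o2_scores, unroll_scores, 12) gives a value close to the +1.1% quoted for -funroll-all-loops. The scores in the tables are rounded, so small differences from the per-row percentages are expected.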
Thanks,
David
On Mon, Nov 15, 2010 at 5:50 PM, Jan Hubicka <[email protected]> wrote:
>> On Mon, Nov 15, 2010 at 5:39 PM, Jan Hubicka <[email protected]> wrote:
>> >> > Fortunately the linker plugin solves the problem here, and this is
>> >> > why I want to have it by default. GCC can then effectively do
>> >> > -fwhole-program for binaries (since the linker knows what will be
>> >> > bound elsewhere) and take advantage of visibility((hidden)) hints
>> >> > for shared libraries in the same way. Most important shared
>> >> > libraries get visibility((hidden)) right.
>> >> >
>> >> > It is sad that LTO w/o the linker plugin doesn't give that much
>> >> > benefit. Ideas are welcome here.
>> >>
>> >> Linker feedback will be limited here: it mostly helps global variable
>> >> aliasing (as I remember only 2/3 SPEC programs benefit from it). You
>> >> don't get whole program points-to, whole program mod-ref (with
>> >> context sensitivity), or whole program structure layout. The latter
>> >> are the real kickers (in terms of SPEC performance), but promoting LTO
>> >> with those numbers can be misleading as many programs won't get it.
>> >
>> > Well, I am speaking of our linker plugin here. What it does is pass GCC
>> > resolution information so it knows what symbols are bound externally.
>> > Since typically you link LTO alone or with a small non-LTO part, most
>> > symbols are not bound and thus you effectively get -fwhole-program
>> > (-fwhole-program just declares everything static except for main ()).
>> >
>> > We don't really do whole program points-to or structure layout.
>>
>> gcc will eventually, right?
>
> Sure hope so ;)
> We really need to solve scalability with our IPA points-to and make it
> compatible with WHOPR.
>>
>> > Mod-ref is just
>> > simple ipa-reference code. How do you get context sensitivity on mod/ref?
>>
>> mod-ref relies on points-to. With context sensitive points-to, you can
>> also get CS mod-ref -- basically mod-ref info per callsite.
>
> Ah sure, I was too focused on our current "mod/ref" :)
>
> Honza
>
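
The remark in the quoted thread that -fwhole-program "just declares everything static except for main ()", and the point about visibility((hidden)) for shared libraries, can be illustrated with a small C sketch. All function names below are hypothetical, the hidden-visibility attribute is GCC-specific and assumes an ELF target, and the entry point is written as an ordinary function standing in for main():

```c
/* Sketch only: every name in this file is hypothetical. */

/* External linkage: seen in isolation, the compiler has to assume another
 * object file might call this, so it must keep an out-of-line copy. */
int scale(int x)
{
    return 3 * x;
}

/* Internal linkage: roughly how -fwhole-program treats every function
 * other than main(). The optimizer is free to inline the function and
 * drop the standalone copy. */
static int scale_static(int x)
{
    return 3 * x;
}

/* Shared-library case: hidden visibility keeps the symbol out of the
 * library's dynamic symbol table, so code outside the library cannot
 * bind to it. */
__attribute__((visibility("hidden"))) int scale_hidden(int x)
{
    return 3 * x;
}

/* Stand-in for the program's entry point. In a real program this would be
 * main(), which -fwhole-program keeps externally visible. */
int program_entry(int x)
{
    return scale(x) + scale_static(x) + scale_hidden(x);
}
```

With linker-plugin resolution information, GCC can learn that nothing outside the LTO unit binds to scale() and treat it like scale_static() without any source change, which is the effect described in the thread.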