More performance data:

-O2 -funroll-all-loops vs -O2: geomean +1.1%

           benchmark             -O2  -O2 -funroll-all-loops     change
            164.gzip                1324                1336      0.94%
             175.vpr                1694                1670     -1.44%
             176.gcc                2293                2353      2.60%
             181.mcf                1772                1793      1.20%
          186.crafty                2320                2300     -0.86%
          197.parser                1166                1171      0.39%
             252.eon                2443                2515      2.93%
         253.perlbmk                2410                2250     -6.66%
             254.gap                1987                2041      2.68%
          255.vortex                2392                2411      0.78%
           256.bzip2                1719                1806      5.08%
           300.twolf                2288                2436      6.44%


-O3 -flto -fwhole-program vs -O2: geomean +6% (-fwhole-program adds ~1%)

            164.gzip                1324                1318     -0.45%
             175.vpr                1694                1717      1.34%
             176.gcc                2293                2359      2.88%
             181.mcf                1772                1772      0.02%
          186.crafty                2320                2526      8.86%
          197.parser                1166                1248      7.04%
             252.eon                2443                2898     18.59%
         253.perlbmk                2410                2323     -3.62%
             254.gap                1987                2039      2.58%
          255.vortex                2392                2918     21.99%
           256.bzip2                1719                1946     13.19%
           300.twolf                2288                2342      2.34%


-O2 -flto -fwhole-program vs -O2: geomean +3.4%, mainly from three
programs: vortex, eon and bzip2.

            164.gzip                1324                1313     -0.82%
             175.vpr                1694                1659     -2.05%
             176.gcc                2293                2300      0.30%
             181.mcf                1772                1781      0.52%
          186.crafty                2320                2327      0.30%
          197.parser                1166                1188      1.92%
             252.eon                2443                2664      9.00%
         253.perlbmk                2410                2470      2.47%
             254.gap                1987                1987     -0.02%
          255.vortex                2392                2883     20.53%
           256.bzip2                1719                1839      7.00%
           300.twolf                2288                2365      3.34%
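
For anyone who wants to re-check the geomean figures, here is a minimal
sketch of the arithmetic, using the scores from the first table (-O2 vs
-O2 -funroll-all-loops). The function and array names are made up for
illustration. The n-th root is taken with a small Newton iteration only so
the sketch has no libm dependency; log()/exp() from <math.h> is the more
usual way to do it.

```c
#include <assert.h>
#include <stddef.h>

/* Positive n-th root of x > 0 by Newton's method (illustrative helper). */
static double nth_root(double x, unsigned n)
{
    assert(x > 0.0 && n > 0);
    double r = x > 1.0 ? x : 1.0;      /* start at or above the root */
    for (int iter = 0; iter < 200; iter++) {
        double p = 1.0;                /* p = r^(n-1) */
        for (unsigned k = 1; k < n; k++)
            p *= r;
        double next = ((n - 1) * r + x / p) / n;
        if (next == r)
            break;                     /* converged to machine precision */
        r = next;
    }
    return r;
}

/* Geometric mean of the per-benchmark ratios test[i] / base[i]. */
static double geomean_ratio(const double *base, const double *test, size_t n)
{
    assert(n > 0);
    double product = 1.0;              /* fine for a dozen ratios near 1 */
    for (size_t i = 0; i < n; i++) {
        assert(base[i] > 0.0 && test[i] > 0.0);
        product *= test[i] / base[i];
    }
    return nth_root(product, (unsigned)n);
}

/* Scores copied from the first table, in order gzip .. twolf. */
static const double o2_scores[12] = {
    1324, 1694, 2293, 1772, 2320, 1166, 2443, 2410, 1987, 2392, 1719, 2288
};
static const double unroll_scores[12] = {
    1336, 1670, 2353, 1793, 2300, 1171, 2515, 2250, 2041, 2411, 1806, 2436
};

/* Geomean improvement of the unrolled build over plain -O2, in percent. */
static double unroll_gain_percent(void)
{
    return (geomean_ratio(o2_scores, unroll_scores, 12) - 1.0) * 100.0;
}
```

Calling unroll_gain_percent() should give a value close to the +1.1%
quoted for that table. The per-benchmark percentages in the tables were
presumably computed from unrounded scores, so they can differ slightly
from what the rounded scores give.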


Thanks,

David


On Mon, Nov 15, 2010 at 5:50 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On Mon, Nov 15, 2010 at 5:39 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> >> > Fortunately the linker plugin solves the problem here, and this is why I
>> >> > want to have it by default.  GCC can then effectively do -fwhole-program
>> >> > for binaries (since the linker knows what will be bound elsewhere) and
>> >> > take advantage of visibility((hidden)) hints for shared libraries in the
>> >> > same way.  Most of the important shared libraries get
>> >> > visibility((hidden)) right.
>> >> >
>> >> > It is sad that LTO w/o linker plugin doesn't give that much benefit.
>> >> > Ideas are welcome here.
>> >>
>> >> Linker feedback will be limited here -- mostly global variable
>> >> aliasing (as I remember, only 2/3 SPEC programs benefit from it). It
>> >> helps. You don't get whole program points-to, whole program mod-ref
>> >> (with context sensitivity), whole program structure layout. The latter
>> >> are the real kickers (in terms of SPEC performance), but promoting LTO
>> >> with those numbers can be misleading as many programs won't get it.
>> >
>> > Well, I am speaking of our linker plugin here.  What it does is pass GCC
>> > resolution information so it knows which symbols are bound externally.  Since
>> > typically you link LTO alone or with a small non-LTO part, most symbols are
>> > not bound and thus you effectively get -fwhole-program (-fwhole-program just
>> > declares everything static except for main ()).
>> >
>> > We don't really do whole program points-to or structure layout.
>>
>> gcc will eventually, right?
>
> Sure hope so ;)
> We really need to solve scalability with our IPA points-to and make it
> compatible with WHOPR.
>>
>> > Mod-ref is just
>> > simple ipa-reference code. How do you get context sensitivity on mod/ref?
>>
>> mod-ref relies on points-to. With context sensitive points-to, you can
>> also get CS mod-ref -- basically mod-ref info per callsite.
>
> Ah sure, I was too focused on our current "mod/ref" :)
>
> Honza
>
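
To make the -fwhole-program / visibility((hidden)) point in the quoted
discussion concrete, here is a minimal sketch of the three cases as they
might appear in a shared library source file. The function names are
hypothetical and the comments describe what the compiler is allowed to
assume, not what any particular GCC version actually does with the code.

```c
/* Hypothetical names, for illustration only. */

/* File-local: no other translation unit can reference it, so the compiler
 * is free to inline it, clone it, or remove it entirely. */
static int scale(int x)
{
    return 2 * x;
}

/* Hidden visibility: usable from other objects linked into the same shared
 * library, but not exported from it.  With resolution information from the
 * linker, LTO can treat it much like a static function. */
__attribute__((visibility("hidden")))
int lib_helper(int x)
{
    return scale(x) + 1;
}

/* Default visibility: part of the library's exported interface, so it has
 * to stay callable from outside with its normal ABI. */
__attribute__((visibility("default")))
int lib_entry(int x)
{
    return lib_helper(x) * 3;
}
```

For an executable, -fwhole-program roughly corresponds to treating
everything except main () like scale() above; for a shared library, the
hidden attribute is what tells the compiler which symbols it may handle
that way.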
