2010/11/15 Jan Hubicka <hubi...@ucw.cz>:
>> For peak, FDO is the most effective option. It can boost performance
>> by 7-10% depending on the program. The options you suggested probably
>> won't make too big a dent.  -funroll-loops can hurt performance
>> without profiling.  More aggressive inlining, ipa-cp, unswitching etc
>
> -funroll-loops was a 2.2% win overall on SPECint, and -funroll-all-loops 2.5%,
> the last time I noted down the SPECint results for this (that was in 2003, heh :)
> http://www.ucw.cz/~hubicka/papers/amd64/node4.html
>
>> enabled by O3 may help a little, if at all. -ffast-math won't
>> help for integer benchmarks other than eon.  Traditionally, O3 helps
>> FP performance because of the loop transformations it enables, but this
>> won't be the case for gcc for now.
>
> Function inlining definitely helps. -O3 also implies vectorization and other
> stuff.

Indeed.  You can look at the various testers at gcc.opensuse.org, which compare
-O2 vs. -O3 but also -O3 vs. -O3 -funroll-loops (and other things), to get an
idea of what helps and what does not.

Richard.

> Honza
>>
>> Thanks,
>>
>> David
>>
>> On Mon, Nov 15, 2010 at 4:29 AM, Andrey Belevantsev <a...@ispras.ru> wrote:
>> > Hello,
>> >
>> > On 14.11.2010 0:08, Xinliang David Li wrote:
>> >>
>> >> I re-measured the performance difference using trunk gcc and trunk
>> >> clang/llvm on a core-2 box.  -fno-strict-aliasing is added to gcc
>> >> because clang/llvm's type based aliasing is incomplete and not
>> >> enabled by default. I also added -fomit-frame-pointer to clang/llvm as
>> >> this is gcc's default. The base option is -O2.
>> >
>> > It would be very interesting to also compare peak numbers, i.e. with LTO and
>> > strict aliasing enabled, as well as -O3 and -ffast-math/-funroll-loops,
>> > similar to Vlad's or OpenSUSE's options.  Can you try to measure these?
>> > Maybe you can also run SPEC2k6, if there are enough machine resources, but
>> > that's probably asking too much...
>> >
>> > Andrey
>> >
>> >
>
