Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

Richard Biener Wed, 25 Jun 2014 03:33:57 -0700

On Wed, Jun 25, 2014 at 11:53 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Wed, Jun 25, 2014 at 5:47 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Wed, Jun 25, 2014 at 5:26 PM, Bingfeng Mei <b...@broadcom.com> wrote:
>>> Thanks for the nice benchmarks, Vladimir.
>>>
>>> Why is GCC code size so much bigger than LLVM? Does -Ofast have more
>>> unrolling
>> On the contrary, I don't think RTL unrolling is enabled by default in
>> GCC at -O3/-Ofast. There is no unroll dump file at all unless
>> -funroll-loops/-funroll-all-loops is explicitly specified.
> To clarify, I did see cases in which GCC's RTL unroller is more
> aggressive than LLVM's once it is enabled.


At -O3 you get more aggressive complete peeling from the GIMPLE
cunroll pass.
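
As a rough illustration (not taken from the SPEC sources; the function
names and the trip count are made up), the first loop below has a small
compile-time trip count and is the kind of candidate complete peeling
looks at, while the second has a runtime trip count. Building with
-O3 -fdump-tree-cunroll should show what the pass decided; whether a
given loop is actually peeled depends on the GCC version and its cost
limits.

```c
/* Hypothetical example: names and sizes are illustrative only. */
enum { N = 4 };               /* small, compile-time trip count */

/* Constant trip count: a candidate for complete peeling at -O3. */
int sum_fixed(const int *a)
{
    int s = 0;
    for (int i = 0; i < N; i++)
        s += a[i];
    return s;
}

/* Runtime trip count: cannot be completely peeled in general;
   RTL unrolling of such loops needs -funroll-loops. */
int sum_var(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

Comparing the object size of the two functions at -O2 and -O3 is one
way to see how much of a size difference comes from this.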

Richard.

>> Thanks,
>> bin
>>
>>> on GCC? Increasing code size doesn't seem to help performance (164.gzip &
>>> 197.parser).
>>> Are there comparisons for -O2? I guess that is more useful for typical
>>> mobile/embedded programmers.
>>>
>>> Bingfeng
>>>
>>>> -----Original Message-----
>>>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
>>>> Vladimir Makarov
>>>> Sent: 24 June 2014 16:07
>>>> To: Ramana Radhakrishnan; gcc.gcc.gnu.org
>>>> Subject: Re: Comparison of GCC-4.9 and LLVM-3.4 performance on
>>>> SPECInt2000 for x86-64 and ARM
>>>>
>>>> On 06/24/2014 10:57 AM, Ramana Radhakrishnan wrote:
>>>> >
>>>> > The ball-park number you have probably won't change much.
>>>> >
>>>> >>>
>>>> >> Unfortunately, that is the configuration I can use on my system
>>>> >> because of lack of libraries for other configurations.
>>>> >
>>>> > Using --with-fpu={neon / neon-vfpv4} shouldn't cause you ABI issues
>>>> > with libraries for any other configurations. neon / neon-vfpv4 enable
>>>> > use of the neon unit in a manner that is ABI compatible with the rest
>>>> > of the system.
>>>> >
>>>> > For more on command line options for AArch32 and how they map to
>>>> > various CPUs, you might find this blog interesting.
>>>> >
>>>> > http://community.arm.com/groups/tools/blog/2013/04/15/arm-cortex-a-processors-and-gcc-command-lines
>>>> >
>>>> >
>>>> >>
>>>> >> I don't think Neon can improve the SPECInt2000 score significantly,
>>>> >> but maybe I am wrong.
>>>> >
>>>> > It probably won't improve the overall score by a large amount but some
>>>> > individual benchmarks will get some help.
>>>> >
>>>> There are a few benchmarks that benefit from autovectorization (eon
>>>> in particular).
>>>> >>> Did you add any other architecture specific options to your SPEC2k
>>>> >>> runs ?
>>>> >>>
>>>> >>>
>>>> >> No.  The only option I used is -Ofast.
>>>> >>
>>>> >> Could you recommend the best options to use for this
>>>> >> processor?
>>>> >>
>>>> >
>>>> > I would personally use --with-cpu=cortex-a15 --with-fpu=neon-vfpv4
>>>> > --with-float=hard on this processor as that maps to the processor
>>>> > available on that particular piece of silicon.
>>>> Thanks, Ramana.  Next time, I'll try these options.
>>>> >
>>>> > Also, given it's a big.LITTLE system, probably with kernel switching,
>>>> > it may be better to also make sure that you are always running on the
>>>> > big core.
>>>> >
>>>> The results are pretty stable.  Also this version of Fedora does not
>>>> implement switching from Big to Little processors.
>>>
