On Sat, Aug 20, 2011 at 7:13 AM, Zach Pfeffer <zach.pfef...@linaro.org> wrote:
> Thanks Bero. Sending this extremely useful information out to a wider 
> audience.
>
> Alex,
>
> I think you'll probably be very interested in this for your Mozilla work.
>
>>>   -O3
>>>      * What it is, does, available on
>>
>> -O3 enables several additional compiler optimizations such as tree
>> vectorizing and loop unswitching, and optimizes for speed over code
>> size somewhat more aggressively than -O2, e.g. by inlining all calls
>> to small static functions.
>> It is available on any platform supported by gcc.
>>
>>>   OpenMP
>>>      * What it is, does, available on
>>
>> OpenMP is a simple API that makes it easier for a programmer to make
>> use of multi-core or multi-processor systems, e.g. by automatically
>> splitting marked loops into several threads.
>> Example:
>>
>> #pragma omp parallel for
>> for(int i=0; i<100; i++)
>>    do_something(i);
>>
>> This spreads the 100 iterations across a team of threads (typically
>> one per available CPU core), so at most 100 threads do useful work.
>>
>>
>> It is available on platforms supported by gcc that can use libgomp,
>> gcc's OpenMP library. This includes most platforms that support POSIX
>> threads - but -- initially -- not Android.
>>
>>
>>>   Loop parallelization
>>>      * What it is, does, available on
>>
>> Loop parallelization takes OpenMP a step further by automatically
>> determining which loops are suitable for "#pragma omp parallel for"
>> and similar constructs. This allows code that was written without
>> multiprocessing in mind (such as most code written specifically for
>> ARM platforms - multicore/SMP ARM systems are quite new) to take
>> advantage of multicore/SMP systems (to some extent) without having to
>> modify the code.
>>
>> Compiler flag: -ftree-parallelize-loops=X (where X is the number of
>> threads to be optimized for - typically the number of CPU cores in the
>> target system)
>>
>> Available on anything supported by gcc that has both libgomp and
>> graphite (incl. CLooG, PPL or ISL) - the original Android toolchain
>> has neither of those.
>>
>>> ...and any other optimizations that you've done.
>>
>> None of the following is enabled yet (but the support in the toolchain
>> is there now), but I'm planning to enable them step by step once we
>> have systems built w/ the new toolchain that actually boot:
>>
>> binutils: --hash-style=gnu
>>    By default, ld creates SysV style hash tables for function tables
>> in shared libraries. With --hash-style=gnu, we switch to GNU style
>> hashes, making symbol lookup a lot faster. (details:
>> http://sourceware.org/ml/binutils/2006-10/msg00377.html)

Sorry, silly question, but does Android use the glibc dynamic linker?
If not, does its linker support other hash styles?

>> binutils: -Bsymbolic-functions
>>    Speed up the dynamic linker by binding references to global
>> functions in shared libraries where it is known that this doesn't
>> break things (it's safe for libraries that don't have any users trying
>> to override their symbols - it's probably safe to assume e.g. skia and
>> opengl could benefit).
>> (details: 
>> http://www.fkf.mpg.de/edv/docs/intel_composer/Documentation/en_US/compiler_f/main_for/copts/common_options/option_bsymbolic_functions.htm)
>>
>> binutils/gcc: -flto, -fwhole-program
>>    Link-Time Optimization - causes code to be optimized again at link
>> time, when the compiler knows what functions are called from what
>> parts of the code, what functions are only called with constant
>> parameters, etc.
>>
>> gcc: -mtune=cortex-a9 (or whatever the actual target CPU is)
>>    The Android build system uses -march=armv7-a, which is good -- but
>> it doesn't do any tuning for the specific CPU type (e.g. cortex-a8 vs.
>> cortex-a9).

Good.  Using -march=armv7-a -mtune=cortex-a9 enables the Cortex-A8
fixups.  Using -mcpu=cortex-a9 disables them, which means your build
may not run on an A8.

>> gcc: -fvisibility-inlines-hidden
>>    Don't export C++ inline methods in shared libraries. Makes the
>> symbol table smaller, improving startup time and diskspace efficiency
>>
>> gcc: -fstrict-aliasing -Werror=strict-aliasing
>>    Currently, Android uses -fno-strict-aliasing unconditionally for
>> thumb code, to work around some pieces of code that violate strict
>> aliasing rules. Using -Werror=strict-aliasing, we can determine what
>> pieces of code are affected, and fix them, or limit the use of
>> -fno-strict-aliasing to the specific files that need it - enabling the
>> rather useful strict-aliasing optimization for the rest of the build
>>
>> gcc: Investigate Graphite optimizations that aren't even enabled at -O3:
>>   -fgraphite-identity -floop-block -floop-interchange
>> -floop-strip-mine -ftree-loop-distribution -ftree-loop-linear

Looks good.  I'd add SMS to the list as well:  first -fmodulo-sched,
then -fmodulo-sched -fmodulo-sched-allow-regmoves.

_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
