On Sat, Aug 20, 2011 at 7:13 AM, Zach Pfeffer <zach.pfef...@linaro.org> wrote:
> Thanks Bero. Sending this extremely useful information out to a wider
> audience.
>
> Alex,
>
> I think you'll probably be very interested in this for your Mozilla work.
>
>>> -O3
>>> * What it is, does, available on
>>
>> -O3 enables several additional compiler optimizations such as tree
>> vectorizing and loop unswitching, and optimizes for speed over code
>> size somewhat more aggressively than -O2, e.g. by inlining all calls
>> to small static functions.
>> It is available on any platform supported by gcc.
>>
>>> OpenMP
>>> * What it is, does, available on
>>
>> OpenMP is a simple API that makes it easier for a programmer to make
>> use of multi-core or multi-processor systems, e.g. by automatically
>> splitting marked loops into several threads.
>> Example:
>>
>> #pragma omp parallel for
>> for(int i=0; i<100; i++)
>>     do_something(i);
>>
>> This splits the 100 iterations across a team of threads (typically one
>> per CPU core by default), so it can use up to 100 threads to do its job.
>>
>>
>> It is available on platforms supported by gcc that can use libgomp,
>> gcc's OpenMP library. This includes most platforms that support POSIX
>> threads, but (initially) not Android.
>>
>>
>>> Loop parallelization
>>> * What it is, does, available on
>>
>> Loop parallelization takes OpenMP a step further by automatically
>> determining which loops are suitable for "#pragma omp parallel for"
>> and similar constructs. This allows code that was written without
>> multiprocessing in mind (such as most code written specifically for
>> ARM platforms - multicore/SMP ARM systems are quite new) to take
>> advantage of multicore/SMP systems (to some extent) without having to
>> modify the code.
>>
>> Compiler flag: -ftree-parallelize-loops=X (where X is the number of
>> threads to be optimized for - typically the number of CPU cores in the
>> target system)
>>
>> Available on anything supported by gcc that has both libgomp and
>> graphite (incl.
>> CLooG, PPL or ISL) - the original Android toolchain has neither of
>> those.
>>
>>> ...and any other optimizations that you've done.
>>
>> None of the following is enabled yet (but the support in the toolchain
>> is there now), but I'm planning to enable them step by step once we
>> have systems built w/ the new toolchain that actually boot:
>>
>> binutils: --hash-style=gnu
>>    By default, ld creates SysV style hash tables for function tables
>>    in shared libraries. With --hash-style=gnu, we switch to GNU style
>>    hashes, making symbol lookup a lot faster. (details:
>>    http://sourceware.org/ml/binutils/2006-10/msg00377.html)
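
For reference, a rough sketch of the two symbol-name hash functions
behind the SysV and GNU hash styles, as they are commonly documented.
The names sysv_hash and gnu_hash are made up for this illustration and
are not binutils or dynamic-linker API. Most of the lookup speedup is
usually attributed to the Bloom filter in the GNU-style table, which
this sketch does not cover.

```c
#include <stdint.h>

/* Classic SysV ELF symbol hash, used by SysV-style hash tables.
 * (Illustrative name; not a real binutils function.) */
uint32_t sysv_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;   /* top four bits */
        if (g)
            h ^= g >> 24;      /* fold them back into the low bits */
        h &= ~g;               /* and clear them */
    }
    return h;
}

/* Hash used by GNU-style hash tables: h = h * 33 + c, seeded with 5381.
 * (Illustrative name; not a real binutils function.) */
uint32_t gnu_hash(const char *name)
{
    uint32_t h = 5381;

    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h;
}
```

Calling gnu_hash("exit") and sysv_hash("exit") side by side shows that
the same symbol name lands in unrelated buckets under the two schemes,
which is why a dynamic linker has to understand whichever table style
the library was linked with.
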

Sorry, silly question, but does Android use the glibc dynamic linker?
If not, does its linker support other hash styles?

>> binutils: -Bsymbolic-functions
>>    Speed up the dynamic linker by binding references to global
>>    functions in shared libraries where it is known that this doesn't
>>    break things (it's safe for libraries that don't have any users trying
>>    to override their symbols - it's probably safe to assume e.g. skia and
>>    opengl could benefit).
>>    (details:
>>    http://www.fkf.mpg.de/edv/docs/intel_composer/Documentation/en_US/compiler_f/main_for/copts/common_options/option_bsymbolic_functions.htm)
>>
>> binutils/gcc: -flto, -fwhole-program
>>    Link-Time Optimization - causes code to be optimized again at link
>>    time, when the compiler knows which functions are called from which
>>    parts of the code, which functions are only called with constant
>>    parameters, etc.
>>
>> gcc: -mtune=cortex-a9 (or whatever the actual target CPU is)
>>    The Android build system uses -march=armv7-a, which is good, but
>>    it doesn't do any tuning for the specific CPU type (e.g. cortex-a8 vs.
>>    cortex-a9).

Good. Using -march=armv7-a -mtune=cortex-a9 enables the Cortex-A8
fixups. Using -mcpu=cortex-a9 disables them, which means your build
may not run on an A8.

>> gcc: -fvisibility-inlines-hidden
>>    Don't export C++ inline methods in shared libraries. Makes the
>>    symbol table smaller, improving startup time and disk space
>>    efficiency.
>>
>> gcc: -fstrict-aliasing -Werror=strict-aliasing
>>    Currently, Android uses -fno-strict-aliasing unconditionally for
>>    thumb code, to work around some pieces of code that violate strict
>>    aliasing rules.
>>    Using -Werror=strict-aliasing, we can determine which
>>    pieces of code are affected, and fix them, or limit the use of
>>    -fno-strict-aliasing to the specific files that need it - enabling the
>>    rather useful strict-aliasing optimization for the rest of the build.
>>
>> gcc: Investigate Graphite optimizations that aren't even enabled at -O3:
>>    -fgraphite-identity -floop-block -floop-interchange
>>    -floop-strip-mine -ftree-loop-distribution -ftree-loop-linear

Looks good. I'd add SMS (swing modulo scheduling) to the list as well:
first -fmodulo-sched, then -fmodulo-sched -fmodulo-sched-allow-regmoves.

_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev