+Mark who has done size optimization tuning with FDO.

On Thu, Aug 4, 2011 at 7:05 AM, Mike Hommey <mhom...@mozilla.com> wrote:
> Hi,
>
> We (Mozilla) are trying to get the best of the ARM toolchain for our
> Android build. I recently built an Android Native-code Development Kit
> with GCC 4.6.1 and binutils 2.21.53, instead of GCC 4.4.3 and binutils
> 2.19 that come with the default NDK.
>
> LTO doesn't work at all, I'm getting an ICE that looks like the one from
> bug 41159.
>
> FDO however, works, but sadly, the resulting build is not only quite
> bigger,

Is this true for both GCC 4.6 and 4.4? There is a bug in 4.6 that prevents
cold functions from being optimized for size with FDO. The bug was fixed in
trunk recently.

> it's also slower on some tests (the Sunspider javascript
> benchmark). While we have seen improvements on other tests (most
> notably, the V8 benchmark is much faster) by switching to GCC 4.6 (that
> is, without FDO), FDO doesn't seem to bring anything on the table. It
> even seems to bring performance regression.

ARM-specific performance tuning (with FDO) seems to be needed. More
parameters (e.g., inliner-related ones) may need to be made target
dependent.

>
> Note that we do our normal builds with -Os and use -O3 for FDO. As for
> architecture specific flags, we use -march=armv7-a -mthumb -mfloat-abi=softfp
> -mfpu=vfp. I've attempted a -O2 build in the past with GCC 4.4 but it
> was both bigger and slower than the -Os builds.
>
> So, it pretty much looks like current aggressive optimizations hit
> current hardware limitations and are slower than builds optimized for
> size.

Yes, this is very likely. Hardware profiling will be very useful to help
identify the root cause.

>
> Has there been significant changes to the ARM backend that would justify
> that I try some more with current GCC HEAD? Should I maybe try some more
> with the linaro GCC branch? Are there things we can do to help getting
> better ARM performance?

It does not hurt to try it :)

Thanks,

David

>
> Cheers,
>
> Mike
>
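
P.S. For readers unfamiliar with the hot/cold distinction mentioned above:
with FDO, GCC derives it from profile data, but it can be approximated by
hand with the `hot` and `cold` function attributes. The sketch below is
only an illustration under that assumption. The function names are
hypothetical (not taken from the Mozilla code base), and it does not
reproduce or work around the 4.6 bug.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, illustrative functions; not from any real code base. */

/* Rarely executed path. The 'cold' attribute tells GCC the function is
   unlikely to run, so it is optimized for size rather than speed. */
__attribute__((cold, noinline))
int report_error(int code)
{
    fprintf(stderr, "error %d\n", code);
    return -1;
}

/* Frequently executed path. The 'hot' attribute asks GCC to optimize it
   more aggressively. With -fprofile-use, GCC detects hot functions from
   the profile instead and ignores this attribute. */
__attribute__((hot))
long sum_squares(const int *v, int n)
{
    assert(v != NULL && n >= 0);

    long total = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0)
            return report_error(v[i]);  /* cold branch */
        total += (long)v[i] * v[i];
    }
    return total;
}
```

Calling `sum_squares` on `{1, 2, 3, 4}` returns 30; any negative element
takes the cold path and returns -1. Comparing the size of the two
functions in the generated object code, with and without the attributes,
is one way to check whether cold code is really being optimized for size.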