> Hi,
>
> On 28 January 2014 13:10, Ramana Radhakrishnan
> <ramana....@googlemail.com> wrote:
> > On Fri, Jan 24, 2014 at 5:16 PM, Ian Bolton <ian.bol...@arm.com> wrote:
> >> Hi there!
> >>
> >> An existing optimisation for Thumb-2 converts t32 encodings to
> >> t16 encodings to reduce codesize, at the expense of causing
> >> redundant flag setting for ADD, AND, etc. This redundant flag
> >> setting can have negative performance impact on cortex-a15.
> >>
> >> This patch introduces two new tuning options so that the conversion
> >> from t32 to t16, which takes place in thumb2_reorg, can be suppressed
> >> for cortex-a15.
> >>
> >> To maintain some of the original benefit (reduced codesize), the
> >> suppression is only done where the enclosing basic block is deemed
> >> worthy of optimising for speed.
> >>
> >> This tested with no regressions and performance has improved for
> >> the workloads tested on cortex-a15. (It might be beneficial to
> >> other processors too, but that has not been investigated yet.)
> >>
> >> OK for stage 1?
> >
> > This is OK for stage1.
> >
> > Ramana
> >
> >>
> >> Cheers,
> >> Ian
> >>
> >>
> >> 2014-01-24 Ian Bolton <ian.bol...@arm.com>
> >>
> >> gcc/
> >> * config/arm/arm-protos.h (tune_params): New struct members.
> >> * config/arm/arm.c: Initialise tune_params per processor.
> >> (thumb2_reorg): Suppress conversion from t32 to t16 when
> >> optimizing for speed, based on new tune_params.
>
> This causes
> gcc.target/arm/negdi-1.c
> gcc.target/arm/negdi-2.c
> to FAIL when GCC is configured as:
> --with-mode=ar
> --with-cpu=cortex-a15
> --with-fpu=neon-vfpv4
>
> both tests used to PASS.
> (see http://cbuild.validation.linaro.org/build/cross-validation/gcc/209561/report-build-info.html)

Hi Christophe,

I don't recall the failure when I did the work, but I see now that the
test is looking for negs when my patch is specifically trying to avoid
flag-setting operations.

So we are now getting an rsb instead of a negs, as intended, and the
test needs fixing!

Open question: Should I look for either rsb or negs in a single
scan-assembler or look for different ones dependent on the cpu in
question or just not run the test for cortex-a15?

Cheers,
Ian
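
To illustrate the first option in the open question, here is a minimal sketch of a test that accepts either instruction in a single scan-assembler directive. It is not the actual contents of negdi-1.c: the function name negate_and_extend, the dg-options, and the exact pattern are assumptions made for illustration only.

```c
/* Hypothetical test sketch, not the real gcc.target/arm/negdi-1.c.  */
/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Negate a 32-bit value and sign-extend the result to 64 bits.
   The function name is illustrative only.  */
signed long long
negate_and_extend (signed int x)
{
  return -x;
}

/* Accept either the flag-setting form (negs) or the non-flag-setting
   form (rsb), so the check does not depend on the CPU tuning.  */
/* { dg-final { scan-assembler "(negs|rsb)" } } */
```

The trade-off is that a pattern like this no longer verifies which form a given CPU tuning produces; the per-CPU alternatives mentioned in the question would keep that check.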