Hi, Recently we've put a lot of effort into improving ifcvt to use CSEL on AArch64. In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html James determined the best value for AArch64 code generation. Although this setting is used when explicitly targeting Cortex cores, it is not otherwise used. This means by default GCC will not use (F)CSEL in many common cases. Most code is built without -mcpu= and thus doesn't use CSEL like this example from GLIBC:
strtok: stp x29, x30, [sp, -48]! add x29, sp, 0 stp x21, x22, [sp, 32] mov x21, x1 stp x19, x20, [sp, 16] adrp x22, .LANCHOR0 mov x19, x0 cbz x0, .L12 .L2: ldrb w0, [x19] .L12: ldr x19, [x22, #:lo12:.LANCHOR0] b .L2 With -mcpu=cortex-a57 GCC generates: stp x29, x30, [sp, -48]! cmp x0, 0 add x29, sp, 0 stp x21, x22, [sp, 32] adrp x21, .LANCHOR0 stp x19, x20, [sp, 16] mov x19, x0 ldr x0, [x21, #:lo12:.LANCHOR0] csel x19, x0, x19, eq ldrb w0, [x19] This is generally faster and smaller. On one benchmark the new setting fixes a regression since GCC6 and improves performance by 49%. So I propose to change generic_branch_cost to be the same as cortexa57_branch_cost so that all supported cores benefit equally from CSEL. Are there any objections to this? Wilco ChangeLog: 2017-03-09 Wilco Dijkstra <wdijk...@arm.com> * config/aarch64/aarch64.c (generic_branch_cost): Copy cortexa57_branch_cost. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5870b5e5d7e8e48cf925b3a62030346f041a7fd6..ea16074af86087a6200d9895583e05acf43d90e2 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -377,8 +377,8 @@ static const struct cpu_vector_cost xgene1_vector_cost = /* Generic costs for branch instructions. */ static const struct cpu_branch_cost generic_branch_cost = { - 2, /* Predictable. */ - 2 /* Unpredictable. */ + 1, /* Predictable. */ + 3 /* Unpredictable. */ }; /* Branch costs for Cortex-A57. */