https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71016
            Bug ID: 71016
           Summary: [6/7 Regression] Redundant sign extension with
                    conditional __builtin_clzl
           Product: gcc
           Version: 6.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Consider:

long int
foo (long i)
{
  return (i == 0) ? 17 : __builtin_clzl (i);
}

On aarch64 with GCC 5 at -O2 we generate:
foo:
        mov     x2, 17
        clz     x1, x0
        cmp     x0, xzr
        csel    x0, x1, x2, ne
        ret

whereas with GCC 6 and trunk we generate:
foo:
        mov     w1, 17
        clz     x2, x0
        cmp     x0, 0
        csel    w0, w1, w2, eq
        sxtw    x0, w0  //redundant sign-extend
        ret

In GCC 5 the tree structure being expanded is:
  i.1_3 = (long unsigned int) i_2(D);
  _4 = __builtin_clzl (i.1_3);
  iftmp.0_5 = (long int) _4;
;;   basic block 4, loop depth 0
;;    pred:       3
;;                2
  # iftmp.0_1 = PHI <iftmp.0_5(3), 17(2)>
  return iftmp.0_1;

so the RTL optimisers see the DI mode clz followed by a subreg and a
sign-extend and optimise it away.

However trunk now moves the sign-extend out of the conditional:
  i.1_3 = (long unsigned int) i_2(D);
  _4 = __builtin_clzl (i.1_3);
;;   basic block 4, loop depth 0
;;    pred:       3
;;                2
  # _7 = PHI <_4(3), 17(2)>
  prephitmp_8 = (long int) _7;
  return prephitmp_8;

So it's very hard for RTL ifcvt or combine or ree to catch it.
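
The two tree shapes quoted above differ only in where the int-to-long
widening sits relative to the PHI. As a rough source-level sketch of that
difference (the function names foo_widen_after and foo_widen_inside are
hypothetical and not part of the report), the following pair may be useful
for comparing -O2 assembly across compiler versions. This is an
assumption-laden illustration: the middle end is free to move the
conversion in either direction, so neither variant is guaranteed to
reproduce or avoid the sxtw.

```c
/* Hypothetical comparison pair; names are illustrative only. */

/* Widening happens once, after the selection. This mirrors the shape
   trunk expands: an int-typed PHI followed by a single (long int) cast. */
long int
foo_widen_after (long i)
{
  int t = (i == 0) ? 17 : __builtin_clzl ((unsigned long) i);
  return (long int) t;
}

/* Widening happens inside each arm. This mirrors the shape GCC 5
   expanded: the clz result is cast to long before the PHI. */
long int
foo_widen_inside (long i)
{
  return (i == 0) ? 17L : (long int) __builtin_clzl ((unsigned long) i);
}
```

Both functions return the same value for every input, and that value is
never negative (either 17 or a leading-zero count), which is why a sign
extension of the selected result does no useful work. Compiling the file
with "gcc -O2 -S" on an aarch64 target and diffing the two function bodies
is one way to see whether a given compiler version emits the extra sxtw.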