https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71016

            Bug ID: 71016
           Summary: [6/7 Regression] Redundant sign extension with
                    conditional __builtin_clzl
           Product: gcc
           Version: 6.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Consider:
long int foo (long i)
{
  return (i == 0) ? 17 : __builtin_clzl (i);
}

On aarch64 with GCC 5 at -O2 we generate:
foo:
        mov     x2, 17
        clz     x1, x0
        cmp     x0, xzr
        csel    x0, x1, x2, ne
        ret

whereas with GCC 6 and trunk we generate:
foo:
        mov     w1, 17
        clz     x2, x0
        cmp     x0, 0
        csel    w0, w1, w2, eq
        sxtw    x0, w0  //redundant sign-extend
        ret
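
As a sanity check on why the sxtw is redundant, here is a minimal sketch.
It assumes a target where long is 64 bits (as on aarch64 LP64).
extension_is_redundant is a hypothetical helper, not part of the original
test case. Both arms of the conditional produce a non-negative int (17,
or a leading-zero count between 0 and 63), so sign-extending and
zero-extending the 32-bit result give the same 64-bit value:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The function from the report, unchanged.  */
long int foo (long i)
{
  return (i == 0) ? 17 : __builtin_clzl (i);
}

/* Hypothetical helper: returns 1 when sign-extension and zero-extension
   of the int-typed result agree, i.e. when the explicit sign-extend
   adds nothing.  The i == 0 guard avoids the undefined clz of zero.  */
int extension_is_redundant (long i)
{
  int narrow = (i == 0) ? 17 : __builtin_clzl ((unsigned long) i);
  int64_t sign_extended = (int64_t) narrow;
  uint64_t zero_extended = (uint64_t) (uint32_t) narrow;
  return narrow >= 0 && (uint64_t) sign_extended == zero_extended;
}
```

This only demonstrates the value-range argument at the source level; it
says nothing about which pass ought to exploit it.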


In GCC 5 the tree structure being expanded is:

  i.1_3 = (long unsigned int) i_2(D);
  _4 = __builtin_clzl (i.1_3);
  iftmp.0_5 = (long int) _4;

;;   basic block 4, loop depth 0
;;    pred:       3
;;                2
  # iftmp.0_1 = PHI <iftmp.0_5(3), 17(2)>
  return iftmp.0_1;

so the RTL optimisers see the DImode clz immediately followed by a
subreg and a sign-extend, and optimise the extension away.

However trunk now moves the sign-extend out of the conditional:
  i.1_3 = (long unsigned int) i_2(D);
  _4 = __builtin_clzl (i.1_3);

;;   basic block 4, loop depth 0
;;    pred:       3
;;                2
  # _7 = PHI <_4(3), 17(2)>
  prephitmp_8 = (long int) _7;
  return prephitmp_8;

The sign-extend is now separated from the clz by the PHI, so it's very
hard for RTL ifcvt, combine or ree to catch it.
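
For illustration, the two GIMPLE shapes correspond roughly to the
following hand-written C. The function names are hypothetical, and the
compiler is free to canonicalise both to the same form, so this is only
a sketch of where the widening conversion sits relative to the merge
point, not a guaranteed reproducer for either code sequence:

```c
#include <assert.h>
#include <limits.h>

/* GCC 5 shape: the clz result is widened to long inside its own arm,
   so the conversion sits next to the clz.  */
long foo_widen_in_arm (long i)
{
  long result;
  if (i == 0)
    result = 17;
  else
    {
      int count = __builtin_clzl ((unsigned long) i);
      result = (long) count;
    }
  return result;
}

/* GCC 6 / trunk shape: both arms are merged as int (the PHI), and a
   single widening conversion follows the merge.  */
long foo_widen_after_merge (long i)
{
  int merged;
  if (i == 0)
    merged = 17;
  else
    merged = __builtin_clzl ((unsigned long) i);
  return (long) merged;
}
```

Both functions return the same value for every input; only the position
of the int-to-long conversion differs.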
