https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108874

            Bug ID: 108874
           Summary: [10/11/12/13 Regression] Missing bswap detection
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

If we look at the arm testcases in gcc.target/arm/rev16.c:
typedef unsigned int __u32;

__u32
__rev16_32_alt (__u32 x)
{
  return (((__u32)(x) & (__u32)0xff00ff00UL) >> 8)
         | (((__u32)(x) & (__u32)0x00ff00ffUL) << 8);
}

__u32
__rev16_32 (__u32 x)
{
  return (((__u32)(x) & (__u32)0x00ff00ffUL) << 8)
         | (((__u32)(x) & (__u32)0xff00ff00UL) >> 8);
}

we should be able to generate rev16 instructions for aarch64 (and arm), i.e.
recognise what is essentially a __builtin_bswap16 of each 16-bit half.
GCC fails to do so and generates:
__rev16_32_alt:
        lsr     w1, w0, 8
        lsl     w0, w0, 8
        and     w1, w1, 16711935
        and     w0, w0, -16711936
        orr     w0, w1, w0
        ret
__rev16_32:
        lsl     w1, w0, 8
        lsr     w0, w0, 8
        and     w1, w1, -16711936
        and     w0, w0, 16711935
        orr     w0, w1, w0
        ret

whereas clang recognises both functions and emits:
__rev16_32_alt:                         // @__rev16_32_alt
        rev16   w0, w0
        ret
__rev16_32:                             // @__rev16_32
        rev16   w0, w0
        ret

Does the bswap pass need some tweaking, perhaps?

Looks like this worked fine with GCC 5 but broke in the GCC 6 timeframe, so
marking as a regression.
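
For reference, a small sketch of the equivalence the optimizer would need to
see. It is only an illustration, not a description of how the bswap pass works
internally. The helper names (rev16_mask_then_shift, rev16_shift_then_mask,
rev16_two_bswap16, rev16_bswap32_rot, rev16_self_check) are made up for this
example. The shift-then-mask variant mirrors the instruction sequence GCC emits
above (16711935 is 0x00ff00ff and -16711936 is 0xff00ff00).

```c
#include <assert.h>
#include <stdint.h>

/* Mask-then-shift form, as written in the testcase. */
static uint32_t rev16_mask_then_shift(uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* Shift-then-mask form, matching the lsr/lsl/and/and/orr sequence above. */
static uint32_t rev16_shift_then_mask(uint32_t x)
{
  return ((x >> 8) & 0x00ff00ffu) | ((x << 8) & 0xff00ff00u);
}

/* The same operation spelled as one 16-bit byte swap per halfword. */
static uint32_t rev16_two_bswap16(uint32_t x)
{
  uint32_t lo = __builtin_bswap16((uint16_t)x);
  uint32_t hi = __builtin_bswap16((uint16_t)(x >> 16));
  return (hi << 16) | lo;
}

/* Alternative spelling: full 32-bit byte swap, then swap the two halfwords. */
static uint32_t rev16_bswap32_rot(uint32_t x)
{
  uint32_t y = __builtin_bswap32(x);
  return (y >> 16) | (y << 16);
}

/* Check that all four spellings agree on a few sample values. */
static void rev16_self_check(void)
{
  static const uint32_t samples[] = {
    0x00000000u, 0xffffffffu, 0x11223344u, 0xdeadbeefu, 0x000000ffu
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
    uint32_t want = rev16_mask_then_shift(samples[i]);
    assert(rev16_shift_then_mask(samples[i]) == want);
    assert(rev16_two_bswap16(samples[i]) == want);
    assert(rev16_bswap32_rot(samples[i]) == want);
  }
}
```

For example, 0x11223344 maps to 0x22114433 under each of the four functions.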
