https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116560

            Bug ID: 116560
           Summary: RISC-V : rv32 code optimization , big code difference
                    between 8/9.x and 10.x
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jmic...@georgiatech-metz.fr
  Target Milestone: ---

C code (essentially a big-endian load of two bytes into a uint16_t variable,
plus a test against zero):

#include <stdint.h>

uint8_t test_swap(uint8_t *ptr, uint32_t *res)
{
    uint16_t temp=(ptr[0]<<8)|ptr[1];
    *res=temp;
    if (temp==0) return 1;
    return 0;
}

Up to 9.x, with -O2, the generated code is great, with a single OR:

test_swap:
        lbu     a5,0(a0)
        slli    a5,a5,8
        lbu     a0,1(a0)
        or      a0,a5,a0
        sw      a0,0(a1)
        seqz    a0,a0
        ret

From 10.x onward (tested on godbolt.org - https://godbolt.org/z/f38EaosxM ),
with the same -O2, the code contains useless slli/srli pairs and 2 ORs:

test_swap:
        lbu     a5,1(a0)
        lbu     a4,0(a0)
        slli    a0,a5,8
        or      a0,a0,a4  // ptr[1]<<8 | ptr[0] ??? why ? want the opposite
        slli    a5,a0,8   // ptr[1]<<16 | ptr[0]<<8 | 0
        srli    a0,a0,8   // ptr[1]
        or      a0,a5,a0  // ptr[1]<<16 | ptr[0]<<8 | ptr[1]
        slli    a0,a0,16  // and 0xFFFF
        srli    a0,a0,16
        sw      a0,0(a1)  // ptr[0]<<8 | ptr[1]
        seqz    a0,a0
        ret
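
To double-check the comments above, here is a minimal C sketch that replays the 10.x sequence step by step and compares it with the plain big-endian load. It assumes the rv32 registers can be modelled as uint32_t values; the function names (load_be16, gcc10_sequence, count_mismatches) are made up for this illustration and do not come from GCC. Read this way, the sequence looks like a little-endian halfword assembly followed by a 16-bit byte swap, but that is only an interpretation of the listing.

```c
#include <stdint.h>

/* What the C source asks for: (ptr[0] << 8) | ptr[1], truncated to 16 bits. */
uint32_t load_be16(uint8_t b0, uint8_t b1)
{
    return (uint16_t)((b0 << 8) | b1);
}

/* Step-by-step model of the 10.x -O2 sequence; each variable stands for
 * the rv32 register of the same name (assumption: 32-bit registers). */
uint32_t gcc10_sequence(uint8_t b0, uint8_t b1)
{
    uint32_t a5 = b1;       /* lbu  a5,1(a0)                              */
    uint32_t a4 = b0;       /* lbu  a4,0(a0)                              */
    uint32_t a0 = a5 << 8;  /* slli a0,a5,8                               */
    a0 |= a4;               /* or   a0,a0,a4  -> b1<<8 | b0               */
    a5 = a0 << 8;           /* slli a5,a0,8   -> b1<<16 | b0<<8           */
    a0 >>= 8;               /* srli a0,a0,8   -> b1                       */
    a0 |= a5;               /* or   a0,a5,a0  -> b1<<16 | b0<<8 | b1      */
    a0 <<= 16;              /* slli a0,a0,16                              */
    a0 >>= 16;              /* srli a0,a0,16  -> b0<<8 | b1               */
    return a0;
}

/* Number of (b0, b1) byte pairs for which the two functions disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned b0 = 0; b0 < 256; b0++)
        for (unsigned b1 = 0; b1 < 256; b1++)
            if (load_be16((uint8_t)b0, (uint8_t)b1) !=
                gcc10_sequence((uint8_t)b0, (uint8_t)b1))
                mismatches++;
    return mismatches;
}
```

count_mismatches() returns 0 over all 65536 byte pairs, so the longer sequence computes the same value as the 4-instruction version; the extra shifts and the second OR only add work.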

With -O1, the code stays the same as with 8.x/9.x up to 12.x.
Starting with 13.x, -O1 adds a useless slli/srli pair (i.e. an AND with
0xFFFF):

test_swap:
        lbu     a5,0(a0)
        slli    a5,a5,8
        lbu     a4,1(a0)
        or      a0,a5,a4
        slli    a0,a0,16  // not necessary
        srli    a0,a0,16  // not necessary
        sw      a0,0(a1)
        seqz    a0,a0
        ret
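
Why the pair is not necessary: both inputs come from lbu, so each is at most 0xFF, and (b0 << 8) | b1 can never exceed 0xFFFF. A small C sketch of that argument follows, again modelling a0 as a uint32_t; zext16_via_shifts and count_changed_by_zext are hypothetical names used only here.

```c
#include <stdint.h>

/* Model of "slli a0,a0,16 ; srli a0,a0,16" on a 32-bit register:
 * it clears the upper 16 bits, i.e. a0 & 0xFFFF. */
uint32_t zext16_via_shifts(uint32_t a0)
{
    a0 <<= 16;
    a0 >>= 16;
    return a0;
}

/* Number of byte pairs for which the shift pair changes (b0 << 8) | b1. */
unsigned count_changed_by_zext(void)
{
    unsigned changed = 0;
    for (uint32_t b0 = 0; b0 < 256; b0++) {
        for (uint32_t b1 = 0; b1 < 256; b1++) {
            uint32_t v = (b0 << 8) | b1;   /* value in a0 after the OR */
            if (zext16_via_shifts(v) != v)
                changed++;
        }
    }
    return changed;
}
```

count_changed_by_zext() returns 0: for every value that two zero-extended bytes can produce, the shift pair leaves a0 unchanged.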
