https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117421

            Bug ID: 117421
           Summary: [RISCV] Use byte comparison instead of word comparison
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wojciech_mula at poczta dot onet.pl
  Target Milestone: ---

Consider this simple function:

---
#include <string_view>

bool ext_is_gzip(std::string_view ext) {
    return ext == "gzip";
}
---

For the x86 target, GCC nicely inlines the compile-time constant and produces
code like this (from GCC 15, with `-O3 -march=icelake-server`):

---
ext_is_gzip(std::basic_string_view<char, std::char_traits<char> >):
        xorl    %eax, %eax
        cmpq    $4, %rdi
        je      .L5
        ret
.L5:
        cmpl    $1885960807, (%rsi)
        sete    %al
        ret
---
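For illustration, the x86 output above corresponds roughly to the following
hand-written C++. This is only a sketch: the name `ext_is_gzip_word` is made up
for this example and is not anything GCC generates. The expected word is built
with `memcpy` from the literal, so it does not depend on byte order; on a
little-endian target it equals 1885960807 (0x70697a67), the immediate in the
`cmpl` above. The sketch says nothing about whether a possibly misaligned
32-bit load is cheap on a given RISC-V core.

```cpp
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical hand-written equivalent of the x86 code: one length check,
// then a single 32-bit comparison instead of four byte comparisons.
bool ext_is_gzip_word(std::string_view ext) {
    if (ext.size() != 4) {
        return false;
    }
    std::uint32_t actual;
    std::uint32_t expected;
    // memcpy sidesteps alignment and aliasing issues; a fixed 4-byte copy
    // is typically lowered to a plain load where the target allows it.
    std::memcpy(&actual, ext.data(), sizeof actual);
    std::memcpy(&expected, "gzip", sizeof expected);
    return actual == expected;
}
```
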

However, for the RISC-V target, GCC emits a plain byte-by-byte comparison
(riscv64-unknown-linux-gnu-g++ (crosstool-NG UNKNOWN) 15.0.0 20241031
(experimental), with `-O3 -march=rv64gcv`):

---
ext_is_gzip(std::basic_string_view<char, std::char_traits<char> >):
        addi    sp,sp,-16
        sd      a0,0(sp)
        sd      a1,8(sp)
        li      a5,4
        beq     a0,a5,.L9
        li      a0,0
        addi    sp,sp,16
        jr      ra
.L9:
        lbu     a4,0(a1)
        li      a5,103
        beq     a4,a5,.L10
.L3:
        li      a0,1
.L4:
        xori    a0,a0,1
        addi    sp,sp,16
        jr      ra
.L10:
        lbu     a4,1(a1)
        li      a5,122
        bne     a4,a5,.L3
        lbu     a4,2(a1)
        li      a5,105
        bne     a4,a5,.L3
        lbu     a4,3(a1)
        li      a5,112
        li      a0,0
        beq     a4,a5,.L4
        li      a0,1
        j       .L4
---

My wild guess is that GCC by default assigns a high cost to materializing large
compile-time constants on RISC-V. However, when I checked what is emitted for
"gzip" and "pizg" given as u32 values, each constant takes only two
instructions:

---
   0:   677a7537                lui     a0,0x677a7
   4:   9705051b                addiw   a0,a0,-1680 # 677a6970 

   8:   70698537                lui     a0,0x70698
   c:   a675051b                addiw   a0,a0,-1433 # 70697a67
---
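The arithmetic behind these two instruction pairs can be checked with a small
model. This is a sketch only: `lui_addiw` is a name invented here, and it
models just the low 32 bits of the result (which is all that matters for these
constants, since bit 31 is clear in both). `lui` places a 20-bit immediate in
bits 31..12, and `addiw` then adds a sign-extended 12-bit immediate.

```cpp
#include <cstdint>

// Models the low 32 bits produced by `lui rd, hi20` followed by
// `addiw rd, rd, lo12`. Unsigned arithmetic wraps modulo 2^32, which
// matches the 32-bit addition performed by addiw.
constexpr std::uint32_t lui_addiw(std::uint32_t hi20, std::int32_t lo12) {
    return (hi20 << 12) + static_cast<std::uint32_t>(lo12);
}

// Values taken from the disassembly above.
static_assert(lui_addiw(0x677a7, -1680) == 0x677a6970u);
static_assert(lui_addiw(0x70698, -1433) == 0x70697a67u);
```

The second constant, 0x70697a67, is the same value as the immediate
1885960807 in the x86 `cmpl`.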

A godbolt link for convenience: https://godbolt.org/z/e16bP369n.
