https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921

            Bug ID: 115921
           Summary: Missed optimization: and->ashift might be cheaper than
                    ashift->and on typical RISC targets
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lis8215 at gmail dot com
  Target Milestone: ---

At the moment GCC prefers the 'ashift first' form of this pattern, i.e. it
rewrites (x & C) << N as (x << N) & (C << N).

However, this can end up emitting an expensive constant for the subsequent AND
operation.
It might be cheaper to do the AND first, since there is then a chance to match
the variant of the AND instruction that accepts an immediate.
Example:

target_wide_uint_t test_ashift_and (target_wide_uint_t x)
{
  return (x & 0x3f) << 12;
}
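
To make the cost argument concrete, here is a minimal sketch (not GCC code) of
the two equivalent forms and of why only the small mask fits an immediate. The
helper names are hypothetical, uint64_t stands in for target_wide_uint_t, and
the immediate ranges are my reading of the ISAs: 12-bit sign-extended for
RISC-V andi, 16-bit zero-extended for MIPS andi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AND first: the mask is the small constant 0x3f. */
uint64_t and_then_shift(uint64_t x) { return (x & 0x3f) << 12; }

/* Shift first: the mask becomes 0x3f << 12 = 0x3f000 (258048). */
uint64_t shift_then_and(uint64_t x) { return (x << 12) & 0x3f000; }

/* Assumed range of the RISC-V andi immediate (12-bit, sign-extended). */
bool fits_riscv_andi(int64_t c) { return c >= -2048 && c <= 2047; }

/* Assumed range of the MIPS andi immediate (16-bit, zero-extended). */
bool fits_mips_andi(int64_t c) { return c >= 0 && c <= 0xffff; }

/* Both forms compute the same value; spot-check a range of inputs. */
void check_equivalence(void)
{
  for (uint64_t x = 0; x < 4096; x++)
    assert(and_then_shift(x) == shift_then_and(x));
}
```

With these assumptions 0x3f fits both immediate fields while 0x3f000 fits
neither, so the shift-first form needs an extra instruction (or a literal
pool load) to materialize the mask.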

godbolt results are the following:

[Xtensa ESP32-S3 gcc 12.2.0 (-O3)]
test_ashift_and:
        entry   sp, 32
        l32r    a8, .LC0
        slli    a2, a2, 12
        and     a2, a2, a8
        retw.n
        ; (the .LC0 literal is not shown in the output)

[SPARC gcc 14.1.0 (-O3)]
test_ashift_and:
        sethi   %hi(258048), %g1
        sll     %o0, 12, %o0
        jmp     %o7+8
         and    %o0, %g1, %o0

[sh gcc 14.1.0 (-O3)]
_test_ashift_and:
        mov     r4,r0
        shll2   r0
        extu.b  r0,r0
        shll8   r0
        rts     
        shll2   r0

[s390x gcc 14.1.0 (-O3)]
test_ashift_and:
        larl    %r5,.L4
        sllg    %r2,%r2,12
        ng      %r2,.L5-.L4(%r5)
        br      %r14
.L4:
.L5:
        .quad   258048

[RISC-V (64-bit) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      a5,258048
        slli    a0,a0,12
        and     a0,a0,a5
        ret

[mips (el) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      $2,196608             # 0x30000
        sll     $4,$4,12
        ori     $2,$2,0xf000
        jr      $31
        and     $2,$4,$2

[mips64 (el) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      $2,196608             # 0x30000
        dsll    $4,$4,12
        ori     $2,$2,0xf000
        jr      $31
        and     $2,$4,$2

[loongarch64 gcc 14.1.0 (-O3)]
test_ashift_and:
        lu12i.w $r12,258048>>12             # 0x3f000
        slli.d  $r4,$r4,12
        and     $r4,$r4,$r12
        jr      $r1

However, with a shift amount of 33 the AND is done first:
[mips64 (el) gcc 14.1.0 (-O3, ashift to 33)]
test_ashift_and:
        andi    $2,$4,0x3f
        jr      $31
        dsll    $2,$2,33

[SPARC64 gcc 14.1.0 (-O3, ashift to 33)]
test_ashift_and:
        and     %o0, 63, %o0
        jmp     %o7+8
         sllx   %o0, 33, %o0


It seems that RISC-V (32-bit) already handles this on trunk (14.1.0 does not):
[RISC-V (32-bit) gcc (trunk) (-O3)]
test_ashift_and:
        andi    a0,a0,63
        slli    a0,a0,12
        ret

while RV64 does not get the same improvement.

While this situation is rare in general, it appears 85 times in the pcre2
matching routine, which is ~2% of that routine's code size (on mips32).

Also, it might be profitable to match any bitwise operator here, e.g. OR and
XOR in addition to AND.
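
As a sanity check on that generalization, a left shift distributes over OR and
XOR the same way it does over AND, so the same choice between a small and a
shifted constant arises. The sketch below is only an illustration: the function
names are hypothetical, and the constant 0x3f and shift count 12 are reused
from the example above.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise op first: the constant stays small (0x3f). */
uint32_t or_then_shift(uint32_t x)  { return (x | 0x3f) << 12; }
uint32_t xor_then_shift(uint32_t x) { return (x ^ 0x3f) << 12; }

/* Shift first: the constant becomes 0x3f << 12 = 0x3f000. */
uint32_t shift_then_or(uint32_t x)  { return (x << 12) | 0x3f000u; }
uint32_t shift_then_xor(uint32_t x) { return (x << 12) ^ 0x3f000u; }

/* Both orderings give the same result for any unsigned input. */
void check_or_xor(uint32_t x)
{
  assert(or_then_shift(x) == shift_then_or(x));
  assert(xor_then_shift(x) == shift_then_xor(x));
}
```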
