https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921
            Bug ID: 115921
           Summary: Missed optimization: and->ashift might be cheaper than
                    ashift->and on typical RISC targets
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lis8215 at gmail dot com
  Target Milestone: ---

At the moment GCC prefers the 'ashift first' form of this pattern. However,
that can end up emitting an expensive constant for the subsequent AND
operation. It might be cheaper to do the AND first, since the unshifted mask
has a better chance of matching a variant of the AND instruction that accepts
an immediate.

Example:

target_wide_uint_t test_ashift_and (target_wide_uint_t x)
{
  return (x & 0x3f) << 12;
}

The godbolt results are the following:

[Xtensa ESP32-S3 gcc 12.2.0 (-O3)]
test_ashift_and:
        entry   sp, 32
        l32r    a8, .LC0
        slli    a2, a2, 12
        and     a2, a2, a8
        retw.n
; the .LC0 constant is missing from the output

[SPARC gcc 14.1.0 (-O3)]
test_ashift_and:
        sethi   %hi(258048), %g1
        sll     %o0, 12, %o0
        jmp     %o7+8
         and    %o0, %g1, %o0

[sh gcc 14.1.0 (-O3)]
_test_ashift_and:
        mov     r4,r0
        shll2   r0
        extu.b  r0,r0
        shll8   r0
        rts
        shll2   r0

[s390x gcc 14.1.0 (-O3)]
test_ashift_and:
        larl    %r5,.L4
        sllg    %r2,%r2,12
        ng      %r2,.L5-.L4(%r5)
        br      %r14
.L4:
.L5:
        .quad   258048

[RISC-V (64-bit) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      a5,258048
        slli    a0,a0,12
        and     a0,a0,a5
        ret

[mips (el) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      $2,196608                   # 0x30000
        sll     $4,$4,12
        ori     $2,$2,0xf000
        jr      $31
        and     $2,$4,$2

[mips64 (el) gcc 14.1.0 (-O3)]
test_ashift_and:
        li      $2,196608                   # 0x30000
        dsll    $4,$4,12
        ori     $2,$2,0xf000
        jr      $31
        and     $2,$4,$2

[loongarch64 gcc 14.1.0 (-O3)]
test_ashift_and:
        lu12i.w $r12,258048>>12             # 0x3f000
        slli.d  $r4,$r4,12
        and     $r4,$r4,$r12
        jr      $r1

However, changing the shift amount to 33 gives:

[mips64 (el) gcc 14.1.0 (-O3, ashift to 33)]
test_ashift_and:
        andi    $2,$4,0x3f
        jr      $31
        dsll    $2,$2,33

[SPARC64 gcc 14.1.0 (-O3, ashift to 33)]
test_ashift_and:
        and     %o0, 63, %o0
        jmp     %o7+8
         sllx   %o0, 33, %o0

RISC-V (32-bit) seems to be aware of this on trunk (14.1.0 is not):

[RISC-V (32-bit) gcc (trunk) (-O3)]
test_ashift_and:
        andi    a0,a0,63
        slli    a0,a0,12
        ret

RV64, however, still emits the shift-first sequence.

Although this situation is rare in general, it appears 85 times in the pcre2
matching routine, where it accounts for ~2% of the routine's code size (on
mips32).

It might also be profitable to match any bitwise operator here, e.g. OR and
XOR in addition to AND.
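
For illustration, here is a minimal C sketch of why the two orderings are
interchangeable and why the AND-first form tends to be cheaper. It is not
taken from GCC; the helper names (shift_then_and, and_then_shift, fits_uimm)
are hypothetical, and uint32_t stands in for target_wide_uint_t. The 16-bit
width used in the comments is the zero-extended immediate field of MIPS andi;
other targets have different immediate ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift-first form: the mask has to be pre-shifted to 0x3f000 (258048),
   which is the constant materialized in the listings above. */
static inline uint32_t shift_then_and(uint32_t x)
{
    return (x << 12) & (UINT32_C(0x3f) << 12);
}

/* AND-first form: the mask stays at 0x3f, small enough for an
   AND-immediate instruction on many RISC targets. */
static inline uint32_t and_then_shift(uint32_t x)
{
    return (x & UINT32_C(0x3f)) << 12;
}

/* Hypothetical helper: true if `value` fits in an unsigned immediate
   field of `bits` bits (e.g. bits == 16 for MIPS andi). */
static inline bool fits_uimm(uint32_t value, unsigned bits)
{
    return bits >= 32 || (value >> bits) == 0;
}
```

Both functions return the same value for every input, because shifting left
distributes over bitwise AND. Under the 16-bit assumption,
fits_uimm(0x3f, 16) holds while fits_uimm(0x3f000, 16) does not, which
matches the extra li/ori pair in the mips listings.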