https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115529
Bug ID: 115529 Summary: Optimization with "bit mask and compare equality" ((x & C1) == C2), ((x | C3) == C4) Product: gcc Version: 14.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: Explorer09 at gmail dot com Target Milestone: --- I thought this pattern is commonly seen, but I didn't find any issue report in GCC mentioning this. ```c #include <stdbool.h> bool pred_bitor(unsigned int x) { return (x | 0x1F) == 0x381F; } bool pred_bitand(unsigned int x) { return (x & ~(uint32_t)0x1F) == 0x3800; } bool pred_shift(unsigned int x) { return (x >> 5) == (0x3800 >> 5); } ``` The three predicates are equivalent, but GCC doesn't seem to recognize one can be converted to another. Clang does recognize that, however. My optimization request is this: For the patterns ((x & C1) == C2) and ((x | C3) == C4), where C1 to C4 are all compile time constants, try converting one code pattern to another, and figure out which can generate faster or smaller code (or maybe both). * If any constant happens to be already loaded into one register, then reusing that constant can save code size by not needing to load the constant into register for compare op. In other words, which pattern is the best depends a lot on the surrounding code, as well is CPU instruction sets. * My personal testing says that pred_bitand() can win in more cases than the other two: ** In 32-bit ARM, the constant 0x3800 can fit into an immediate operand but 0x381F cannot, which means there would be an additional "mov" instruction for 0x381F. ** In RISC-V, the "andi" has a 16-bit encoding (with the "C"/Compressed instruction extension) that works here, but there is no 16-bit encoding for "ori" instruction. * The pred_shift() case can work only if the mask constant is in the `((1 << N) - 1)` form, or `((unsigned)-1 << M)` form. It doesn't work with all mask constants but I think you knew that already.