https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118012
Bug ID: 118012
Summary: [avr] Expensive code (bit extract + extend + neg + and) instead of bit test
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---

Created attachment 59841
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59841&action=edit
gf2.c: C99 test case

The test case gf2.c has a conditional xor:

    if (b & 1)
        c ^= a;

$ avr-gcc-15 gf2.c -S -Os -save-temps -mmcu=atmega128 -dp

compiles that to:

* Extraction of b.0 (2 instructions)
* Extend that to HImode (1 instruction)
* Negate that value (3 instructions)
* AND that value with "a" (2 instructions)

    mov r22,r20          ;  95 [c=4 l=2]  *andqi3/3
    andi r22,1<<0
    ldi r23,0            ;  96 [c=4 l=1]  movqi_insn/0
    neg r23              ;  60 [c=12 l=3]  *neghi2/0
    neg r22
    sbc r23,__zero_reg__
    and r22,r18          ;  64 [c=4 l=1]  *andqi3/0
    and r23,r19          ;  65 [c=4 l=1]  *andqi3/0
    eor r24,r22          ;  69 [c=4 l=1]  *xorqi3
    eor r25,r23          ;  70 [c=4 l=1]  *xorqi3

So we have 8 instructions / cycles in front of the xor where a simple bit test would do (costs 2 instructions and 3 cycles at most). Apart from that, the "smart" version imposes a higher register pressure of at least 2 additional GPRs.

What I can see is that the neghi2 is generated from a multiplication in expr.cc::expand_expr_real_2, case MULT_EXPR.

For comparison, here is the code when the test is for bit 1 instead of bit 0:

    if (b & 2)
        c ^= a;

    sbrs r20,1           ;  75 [c=4 l=2]  *sbrx_branchhi
    rjmp .L3
    eor r24,r18          ;  73 [c=4 l=1]  *xorqi3
    eor r25,r19          ;  74 [c=4 l=1]  *xorqi3
.L3:

FYI, GCC v8 also performs a bit test for "if (b & 1)".

Target: avr
Configured with: --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --with-long-double=64 --disable-libcc1 --disable-shared --enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241207 (experimental) (GCC)
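
For reference, the two forms discussed in this report can be written side by side in C. This is only a sketch and not the attached gf2.c: the function names are hypothetical, and the 8-bit type of b is an assumption (the 16-bit type of a and c is inferred from the two-byte eor in the listings). The second function spells out the sequence that the generated code appears to follow: extract bit 0, extend it to 16 bits, negate it to obtain a mask of 0x0000 or 0xffff, AND it with a, then XOR into c.

```c
#include <stdint.h>

/* Branching form, as written in the report: conditional xor on bit 0 of b.
   On AVR this can be done with a skip-on-bit test (sbrs/sbrc) and two eor.  */
static uint16_t xor_if_bit0_branch (uint16_t c, uint16_t a, uint8_t b)
{
    if (b & 1)
        c ^= a;
    return c;
}

/* Branchless form mirroring the expensive sequence (assumed equivalent):
   bit extract, extend to 16 bits, negate, AND with a, XOR into c.  */
static uint16_t xor_if_bit0_mask (uint16_t c, uint16_t a, uint8_t b)
{
    uint16_t bit  = (uint16_t) (b & 1);   /* 0 or 1 */
    uint16_t mask = (uint16_t) -bit;      /* 0x0000 or 0xffff */
    return c ^ (a & mask);
}
```

Both functions compute the same result for every input; the difference is only in cost on a target like AVR, where the branching form needs no mask and no extra registers.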