On Fri, Jan 9, 2015 at 12:40 PM, Jeff Law <l...@redhat.com> wrote:
> On 01/09/15 06:39, Jiong Wang wrote:
>>
>> as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
>>
>> given the following test:
>>
>>   unsigned char byte = 0;
>>
>>   void set_bit(unsigned int bit, unsigned char value) {
>>     unsigned char mask = (unsigned char)(1 << (bit & 7));
>>     if (!value) {
>>       byte &= (unsigned char)~mask;
>>     } else {
>>       byte |= mask;
>>     }
>>   }
>>
>> we should generate something like:
>>
>>   set_bit:
>>     and  w0, w0, 7
>>     mov  w2, 1
>>     lsl  w2, w2, w0
>>
>> while we are generating:
>>
>>     mov  w2, 1
>>     lsl  w2, w2, w0
>>
>> the necessary "and w0, w0, 7" is wrongly deleted.
>>
>> that is because
>>
>> (insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
>>         (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
>>      (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>         (nil)))
>> (insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
>>         (and:SI (reg/v:SI 82 [ bit ])
>>             (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
>>      (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
>>         (nil)))
>> (insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
>>         (ashift:SI (reg:SI 86)
>>             (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539 {*aarch64_ashl_sisd_or_int_si3}
>>      (expr_list:REG_DEAD (reg:SI 86)
>>         (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
>>             (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
>>                     (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
>>                 (nil)))))
>>
>> are wrongly combined into
>>
>> (insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
>>         (ashift:QI (subreg:QI (reg:SI 86) 0)
>>             (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
>>      (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>         (expr_list:REG_DEAD (reg:SI 86)
>>             (nil))))
>>
>> and thus the generated assembly lacks the necessary "and w0, w0, 7".
>>
>> the root cause is that at one place in the combine pass we pass the
>> wrong bitmask to force_to_mode.
>>
>> in this particular case, for QI mode, we should pass (1 << 8) - 1,
>> while we are passing (1 << 3) - 1, so the combiner thinks only the
>> lower 3 bits are needed and that the X & 7 is unnecessary.  For QI
>> mode we want the lower 8 bits, so we should remove the exact_log2
>> ("exp") operator there.
>>
>> this seems to be a historical bug in the combine pass??  it is only
>> triggered on targets where SHIFT_COUNT_TRUNCATED is true, and it has
>> been hiding for a long time mostly because x86/arm do not trigger
>> this part of the code.
>>
>> bootstrap on x86 and gcc check OK.
>> bootstrap on aarch64 and bare-metal regression OK.
>> ok for trunk?
>>
>> gcc/
>>   PR64304
>>   * combine.c (combine_simplify_rtx): Correct the bitmask passed to
>>   force_to_mode.
>>
>> gcc/testsuite/
>>   PR64304
>>   * gcc.target/aarch64/pr64304.c: New testcase.
>
> I don't think this is correct.
>
> When I put a breakpoint on the code in question I see the following RTL
> prior to the call to DO_SUBST:
>
> (ashift:QI (const_int 1 [0x1])
>     (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
>             (const_int 7 [0x7])) 0))
>
> Note carefully the QImode for the ASHIFT.  That clearly indicates that
> just the low 8 bits are meaningful, and on a SHIFT_COUNT_TRUNCATED
> target the masking of the count with 0x7 is redundant, as the only
> valid shift counts are 0-7 (again because of the QImode for the
> ASHIFT).
> Thus that's equivalent to:
>
> (ashift:QI (const_int 1 [0x1])
>     (reg:QI 0 x0 [ bit ]))
>
> Similarly for the case:
>
> (ashift:QI (subreg:QI (reg:SI 85) 0)
>     (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
>             (const_int 7 [0x7])) 0))
>
> Again, QImode ASHIFT, so the masking of the shift count is redundant,
> resulting in:
>
> (ashift:QI (subreg:QI (reg:SI 85) 0)
>     (reg:QI 0 x0 [ bit ]))
>
> I think you need to do some further analysis.  Is it perhaps the case
> that SHIFT_COUNT_TRUNCATED is nonzero when in fact it should be zero?
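As an aside, the equivalence Jeff describes can be modelled in plain C.
This is a minimal sketch, not GCC code, and the helper name is made up:
it assumes the hardware really reduces a QImode shift count modulo 8,
which is exactly what SHIFT_COUNT_TRUNCATED promises.  Under that
assumption, masking the count with 7 beforehand cannot change the
result, so dropping the AND is a correct simplification.

  /* Minimal standalone sketch; models a QImode shift on a target where
     the shift count is truncated to the operand width (8 bits), i.e.
     where only the low 3 bits of the count matter.  */
  #include <stdio.h>

  static unsigned char
  qi_shift_truncating (unsigned char x, unsigned int count)
  {
    return (unsigned char) (x << (count & 7));
  }

  int
  main (void)
  {
    /* Under the truncation assumption, pre-masking the count with 7 is
       redundant: this never prints anything.  */
    for (unsigned int bit = 0; bit < 64; bit++)
      if (qi_shift_truncating (1, bit & 7) != qi_shift_truncating (1, bit))
        printf ("masking mattered for count %u\n", bit);
    return 0;
  }

The question Jeff raises is whether aarch64 actually satisfies that
assumption for QImode shifts.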
Jeff is correct here.  SHIFT_COUNT_TRUNCATED cannot be true if the
aarch64 back-end has shift patterns for modes smaller than 32/64 bits
while the aarch64 target only has shift instructions for 32 and 64
bits.  The middle-end is doing the correct thing: with
SHIFT_COUNT_TRUNCATED true and a pattern for a QImode shift, it may
assume the shift count does not need to be truncated before use.

Thanks,
Andrew Pinski

>
> Jeff
>
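To make that concrete, here is a minimal sketch (again not GCC code;
the helper name is made up) of what happens when a QImode shift is
carried out by a 32-bit W-register shift, as in the "lsl w2, w2, w0"
from the original report: the count is truncated modulo 32, not modulo
8, so the promise behind SHIFT_COUNT_TRUNCATED does not hold and the
"and w0, w0, 7" is needed.

  /* Models an aarch64 "lsl wD, wN, wM": a 32-bit shift whose count is
     taken modulo 32 by the hardware.  */
  #include <stdio.h>

  static unsigned char
  qi_shift_via_w_reg (unsigned char x, unsigned int count)
  {
    return (unsigned char) ((unsigned int) x << (count & 31));
  }

  int
  main (void)
  {
    /* set_bit (9, 1) from the testcase: with the "and w0, w0, 7" kept,
       the mask is 1 << 1 = 2; with the AND removed, the W-register
       shift computes 1 << 9 = 0x200, which truncates to 0 in QImode.  */
    printf ("with AND:    %u\n", (unsigned) qi_shift_via_w_reg (1, 9 & 7));
    printf ("without AND: %u\n", (unsigned) qi_shift_via_w_reg (1, 9));
    return 0;
  }

Which matches the report: without the AND, the compiled set_bit
computes the wrong mask for any bit argument of 8 or larger.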