2015-01-09 21:29 GMT+00:00 Andrew Pinski <pins...@gmail.com>:
> On Fri, Jan 9, 2015 at 12:40 PM, Jeff Law <l...@redhat.com> wrote:
>> On 01/09/15 06:39, Jiong Wang wrote:
>>>
>>> As reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
>>>
>>> given the following test:
>>>
>>> unsigned char byte = 0;
>>>
>>> void set_bit(unsigned int bit, unsigned char value) {
>>>   unsigned char mask = (unsigned char)(1 << (bit & 7));
>>>   if (!value) {
>>>     byte &= (unsigned char)~mask;
>>>   } else {
>>>     byte |= mask;
>>>   }
>>> }
>>>
>>> we should generate something like:
>>>
>>> set_bit:
>>>         and     w0, w0, 7
>>>         mov     w2, 1
>>>         lsl     w2, w2, w0
>>>
>>> while we are generating:
>>>
>>>         mov     w2, 1
>>>         lsl     w2, w2, w0
>>>
>>> The necessary "and w0, w0, 7" is wrongly deleted.
>>>
>>> That is because
>>>
>>> (insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
>>>         (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
>>>      (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>>         (nil)))
>>> (insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
>>>         (and:SI (reg/v:SI 82 [ bit ])
>>>             (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
>>>      (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
>>>         (nil)))
>>> (insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
>>>         (ashift:SI (reg:SI 86)
>>>             (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539 {*aarch64_ashl_sisd_or_int_si3}
>>>      (expr_list:REG_DEAD (reg:SI 86)
>>>         (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
>>>             (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
>>>                     (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
>>>                 (nil)))))
>>>
>>> are wrongly combined into
>>>
>>> (insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
>>>         (ashift:QI (subreg:QI (reg:SI 86) 0)
>>>             (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
>>>      (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>>         (expr_list:REG_DEAD (reg:SI 86)
>>>             (nil))))
>>>
>>> and thus the generated assembly lacks the necessary "and w0, w0, 7".
>>>
>>> The root cause is that at one place in the combine pass we are passing
>>> the wrong bitmask to force_to_mode.
>>>
>>> In this particular case, for QImode we should pass (1 << 8) - 1, while
>>> we are passing (1 << 3) - 1, so the combiner thinks we only need the
>>> low 3 bits, that "x & 7" is unnecessary, and that the AND expression
>>> can be removed, while for QImode we actually want the low 8 bits.
>>>
>>> This may be a historical bug in the combine pass, though it is only
>>> triggered on targets where SHIFT_COUNT_TRUNCATED is true. It has been
>>> hiding for a long time mostly because x86/arm do not reach this part
>>> of the code.
>>>
>>> Bootstrap on x86 and gcc check OK.
>>> Bootstrap on aarch64 and bare-metal regression OK.
>>> OK for trunk?
>>>
>>> gcc/
>>>   PR 64304
>>>   * combine.c (combine_simplify_rtx): Correct the bitmask passed to
>>>   force_to_mode.
>>>
>>> gcc/testsuite/
>>>   PR 64304
>>>   * gcc.target/aarch64/pr64304.c: New testcase.
>>
>> I don't think this is correct.
>>
>> When I put a breakpoint on the code in question I see the following RTL
>> prior to the call to DO_SUBST:
>>
>> (ashift:QI (const_int 1 [0x1])
>>     (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
>>             (const_int 7 [0x7])) 0))
>>
>> Note carefully the QImode for the ASHIFT. That clearly indicates that just
>> the low 8 bits are meaningful, and on a SHIFT_COUNT_TRUNCATED target the
>> masking of the count with 0x7 is redundant, as the only valid shift counts
>> are 0-7 (again because of the QImode for the ASHIFT).
Thanks for the explanation, I had misunderstood some key points here.

> Jeff is correct here. SHIFT_COUNT_TRUNCATED cannot be true if the
> aarch64 back-end has shift patterns for smaller than 32/64-bit, but
> the aarch64 target only has shifts for 32-bit and 64-bit.

Pinski, thanks for pointing this out. Agreed. AArch64 defining shift
patterns for the short types is likely the problem.

Regards,
Jiong

> The middle-end is doing the correct thing: with SHIFT_COUNT_TRUNCATED
> true and a pattern for a QImode shift, the shift count does not need
> to be truncated before use.
>
> Thanks,
> Andrew Pinski
>
>>
>> Jeff
>>
>>