On 01/09/15 06:39, Jiong Wang wrote:
as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
given the following test:
unsigned char byte = 0;
void set_bit(unsigned int bit, unsigned char value) {
unsigned char mask = (unsigned char)(1 << (bit & 7));
if (!value) {
byte &= (unsigned char)~mask;
} else {
byte |= mask;
}
}
we should generate something like:
set_bit:
and w0, w0, 7
mov w2, 1
lsl w2, w2, w0
while we are generating
mov w2, 1
lsl w2, w2, w0
the necessary "and w0, w0, 7" deleted wrongly.
that because
(insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
(reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
(expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
(nil)))
(insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
(and:SI (reg/v:SI 82 [ bit ])
(const_int 7 [0x7]))) bug.c:4 399 {andsi3}
(expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
(nil)))
(insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
(ashift:SI (reg:SI 86)
(subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539
{*aarch64_ashl_sisd_or_int_si3}
(expr_list:REG_DEAD (reg:SI 86)
(expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
(expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
(subreg:QI (reg:SI 84 [ D.1482 ]) 0))
(nil)))))
are wrongly combined into
(insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
(ashift:QI (subreg:QI (reg:SI 86) 0)
(reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
(expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
(expr_list:REG_DEAD (reg:SI 86)
(nil))))
thus, the generated assembly is lack of the necessary "and w0, x0, 7".
the root cause is at one place in combine pass, we are passing wrong
bitmask to force_to_mode.
in this particular case, for QI mode, we should pass (1 << 8 - 1), while
we are passing (1 << 3 - 1),
thus the combiner think we only need the lower 3 bits, that X & 7 is
unnecessary. While for QI mode, we
want the lower 8 bits. we should remove the exp operator.
this should be a historical bug in combine pass?? while it's only
triggered for target
where SHIFT_COUNT_TRUNCATED be true. it's long time hiding mostly
because x86/arm will
not trigger this part of code.
bootstrap on x86 and gcc check OK.
bootstrap on aarch64 and bare-metal regression OK.
ok for trunk?
gcc/
PR64303
* combine.c (combine_simplify_rtx): Correct the bitmask passed to
force_to_mode.
gcc/testsuite/
PR64303
* gcc.target/aarch64/pr64304.c: New testcase.
I don't think this is correct.
When I put a breakpoint on the code in question I see the following RTL
prior to the call to DO_SUBST:
(ashift:QI (const_int 1 [0x1])
(subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
(const_int 7 [0x7])) 0))
Note carefully the QImode for the ASHIFT. That clearly indicates that
just the low 8 bits are meaningful and on a SHIFT_COUNT_TRUNCATED target
the masking of the count with 0x7 is redundant as the only valid shift
counts are 0-7 (again because of the QImode for the ASHIFT). Thus
that's equivalent to:
(ashift:QI (const_int 1 [0x1])
(reg:QI 0 x0 [ bit ]))
Similarly for the case:
(ashift:QI (subreg:QI (reg:SI 85) 0)
(subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
(const_int 7 [0x7])) 0))
Again, QImode ASHIFT, so the masking of the shift count is redundant
resulting in:
(ashift:QI (subreg:QI (reg:SI 85) 0)
(reg:QI 0 x0 [ bit ]))
I think you need to do some further analysis. Is it perhaps the case
that SHIFT_COUNT_TRUNCATED is nonzero when in fact it should be zero?
Jeff