Am 20.11.24 um 11:33 schrieb Richard Biener:
On Wed, Nov 20, 2024 at 11:29 AM Georg-Johann Lay via Gcc
<gcc@gcc.gnu.org> wrote:
Consider the following RTL peephole from avr.md:
(define_peephole2 ; avr.md:5387
[(match_scratch:QI 3 "d")
(parallel [(set (match_operand:ALL4 0 "register_operand" "")
(ashift:ALL4 (match_operand:ALL4 1
"register_operand" "")
(match_operand:QI 2 "const_int_operand"
"")))
(clobber (reg:CC REG_CC))])]
""
[(parallel [(set (match_dup 0)
(ashift:ALL4 (match_dup 1)
(match_dup 2)))
(clobber (match_dup 3))
(clobber (reg:CC REG_CC))])])
Finally, Andrew Pinski pointed out the correct solution:
In order for the scratch from the peephole2 to not overlap
with the destination of the matched insn, a top-level
match_dup is required according to
https://gcc.gnu.org/onlinedocs/gccint/define_005fpeephole2.html
So the fixed peephole2 reads:
(define_peephole2
[(match_scratch:QI 3 "d")
(parallel [(set (match_operand:ALL4 0 "register_operand")
(ashift:ALL4 (match_operand:ALL4 1 "register_operand")
(match_operand:QI 2 "const_int_operand")))
(clobber (reg:CC REG_CC))])
(match_dup 3)]
This is required due to my changes which turned the shift insn from a
2-operand insn to a 3-operand one. The 2-operand case was not an
issue because AFAIU peephole2 makes sure that the scratch won't
overlap with any input.
Johann
As far as I understand, its purpose is to provide a QImode
scratch register provided such a scratch is available.
However, in the .peephole2 RTL dump with -da I see the following:
Splitting with gen_peephole2_100 (avr.md:5387)
...
(insn 24 8 15 2 (parallel [
(set (reg:SI 22 r22 [orig:47 _3 ] [47])
(ashift:SI (reg:SI 20 r20 [orig:48 x ] [48])
(const_int 7 [0x7])))
(clobber (reg:QI 24 r24))
(clobber (reg:CC 36 cc))
])
(nil))
That is, the scratch r24:QI is overlapping the output in
r22:SI. All hard registers are 8-bit regs and hence r22:SI
extends from r22...r25.
A scratch that overlaps the operands is pretty much useless
or even plain wrong. recog.cc::peep2_find_free_register()
has this comment: /* Don't use registers set or clobbered by the insn. */
from = peep2_buf_position (peep2_current + from);
to = peep2_buf_position (peep2_current + to);
gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
REG_SET_TO_HARD_REG_SET (live, peep2_insn_data[from].live_before);
while (from != to)
{
gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
/* Don't use registers set or clobbered by the insn. */
FOR_EACH_INSN_DEF (def, peep2_insn_data[from].insn)
SET_HARD_REG_BIT (live, DF_REF_REGNO (def));
from = peep2_buf_position (from + 1);
}
So it this bogus in that it assumes all registers extend only
over one hard reg?
Yes, looks like a bug to me.
FYI, the purpose is to provide a scratch without increasing the register
pressure (which "match_scratch" would do). Therefore, the RTL peephole
is used instead of forcing reload to come up with a scratch.
More specifically, I see this with
$ avr-gcc bogus-peep2.c -S -Os -da
long ashl32_7 (int i, long x)
{
return x << 7;
}
with the attached WIP patch atop trunk b222ee10045d.
Johann
Target: avr
Configured with: ../../source/gcc-master/configure --target=avr
--disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241119 (experimental) (GCC)