On 3/31/23 03:11, Vineet Gupta wrote:
Hi Jeff, Kito,
I need some ideas on how to proceed with PR/109279, both for the
longer-term direction and for a short-term fix.
First the executive summary:
long long f(void)
{
return 0x0101010101010101ull;
}
Up until gcc 12, this generated a const pool access:
lui a5,%hi(.LANCHOR0)
ld a0,%lo(.LANCHOR0)(a5)
ret
.LC0:
.dword 0x101010101010101
After commit 2e886eef7f2b ("RISC-V: Produce better code with complex
constants [PR95632] [PR106602]"), it gets synthesized to the following:
li a0,0x01010000
addi a0,a0,0x0101
slli a0,a0,16
addi a0,a0,0x0101
slli a0,a0,16
addi a0,a0,0x0101
ret
Granted, the const pool may or may not be preferred by a specific uarch.
Will the long-term approach be to have a cost model for const pool vs.
synthesizing?
The second aspect is to improve the horror above. Per chat on IRC,
pinskia suggested we relax the in_splitter constraint in
riscv_move_integer, as the combine issue holding it back is now fixed -
after commit 61bee6aed26eb30.
That brings it down to something somewhat reasonable:
li a5,0x01010000
addi a5,a5,0x0101
mv a0,a5
slli a5,a5,32
add a0,a5,a0
ret
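For reference, here is a small C model of the two sequences above, just to check that both produce the same 64-bit value. The function names are made up for illustration and are not part of GCC. Each statement mirrors one instruction, and register width is modelled with uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the six-instruction shift/add chain. */
uint64_t synth_shift_add_chain(void)
{
    uint64_t a0 = 0x01010000;   /* li   a0,0x01010000 */
    a0 += 0x0101;               /* addi a0,a0,0x0101  */
    a0 <<= 16;                  /* slli a0,a0,16      */
    a0 += 0x0101;               /* addi a0,a0,0x0101  */
    a0 <<= 16;                  /* slli a0,a0,16      */
    a0 += 0x0101;               /* addi a0,a0,0x0101  */
    return a0;
}

/* Hypothetical model of the five-instruction sequence that builds the
   low 32 bits once and then duplicates them into the high half. */
uint64_t synth_dup_halves(void)
{
    uint64_t a5 = 0x01010000;   /* li   a5,0x01010000 */
    a5 += 0x0101;               /* addi a5,a5,0x0101  */
    uint64_t a0 = a5;           /* mv   a0,a5         */
    a5 <<= 32;                  /* slli a5,a5,32      */
    a0 = a5 + a0;               /* add  a0,a5,a0      */
    return a0;
}

/* Both models should yield the constant returned by f(). */
void check_sequences(void)
{
    assert(synth_shift_add_chain() == UINT64_C(0x0101010101010101));
    assert(synth_dup_halves() == UINT64_C(0x0101010101010101));
}
```

Calling check_sequences() from a test harness is enough to confirm that the two sequences agree.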
I can spin a minimal patch. Will that be acceptable for gcc 13.1 if it
is testsuite clean?
It would seem to be a gcc-14 thing to me.
It seems like we probably should adjust the basic constant synthesis
code to handle this class of cases so that the initial RTL is good
rather than waiting on combine to fix it up. It looks like we need the
destination register as well as a temporary and a 5 instruction sequence.
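As a rough sketch of what "this class of cases" might look like, here is a C model of a check for constants whose upper and lower 32-bit halves are identical, plus the destination-and-temporary construction. The names are hypothetical and this is not the riscv_move_integer code. The sketch also conservatively assumes bit 31 of the low half is clear, because on RV64 a 32-bit value with bit 31 set is sign-extended when loaded, and the final add would then give the wrong upper half.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true when both 32-bit halves match and the
   low half can be loaded without sign extension changing the result. */
bool const_has_repeated_halves(uint64_t value)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);
    return hi == lo && (lo & UINT32_C(0x80000000)) == 0;
}

/* Hypothetical model of the sequence: load the low half into a
   temporary, copy it to the destination, shift the temporary up by 32
   and add the two together. */
uint64_t build_from_low_half(uint32_t lo)
{
    uint64_t tmp = lo;      /* li/addi into the temporary */
    uint64_t dest = tmp;    /* mv   dest,tmp              */
    tmp <<= 32;             /* slli tmp,tmp,32            */
    dest = tmp + dest;      /* add  dest,tmp,dest         */
    return dest;
}
```

For the constant in the PR, const_has_repeated_halves(0x0101010101010101) holds and build_from_low_half(0x01010101) reproduces the full value. How a real implementation would weigh this against other synthesis strategies is left open here.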
I'm aware of uarch plans that would handle this kind of sequence
entirely in the front-end and pass off a single uop to the execution
units. We'd planned to dig into constant synthesis in support of that
effort. So I'm happy to help shepherd this improvement once gcc-14
development opens.
jeff