On 3/31/23 03:11, Vineet Gupta wrote:
Hi Jeff, Kito,

I need some ideas on how to proceed with PR/109279, regarding both the longer-term direction and a short-term fix.

First the executive summary:

long long f(void)
{
   return 0x0101010101010101ull;
}

Up until gcc 12 this used to generate const pool type access.

     lui    a5,%hi(.LANCHOR0)
     ld    a0,%lo(.LANCHOR0)(a5)
     ret
.LC0:
     .dword    0x101010101010101

After commit 2e886eef7f2b ("RISC-V: Produce better code with complex constants [PR95632] [PR106602]") it gets synthesized to the following:

     li    a0,0x01010000
     addi    a0,a0,0x0101
     slli    a0,a0,16
     addi    a0,a0,0x0101
     slli    a0,a0,16
     addi    a0,a0,0x0101
     ret

Granted, the const pool may or may not be preferred by a specific uarch. Will the long-term approach be to have a cost model for the const pool vs. synthesizing?

The second aspect is to improve the horror above. Per chat on IRC, pinskia suggested we relax the in_splitter constraint in riscv_move_integer, as the combine issue holding it back is now fixed - after commit 61bee6aed26eb30.

That brings it down to something somewhat reasonable:

     li    a5,0x01010000
     addi    a5,a5,0x0101
     mv    a0,a5
     slli    a5,a5,32
     add    a0,a5,a0
     ret
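
As a sanity check on the two sequences above, here is a minimal C sketch that models each instruction as plain 64-bit arithmetic. The function names are made up for illustration, and the model ignores everything about the real ISA except the arithmetic (0x0101 fits in addi's 12-bit signed immediate, so no sign-extension effects arise here).

```c
#include <assert.h>
#include <stdint.h>

/* Models the longer shift-and-add sequence, one statement per instruction. */
uint64_t synth_shift_add(void)
{
    uint64_t a0 = 0x01010000;  /* li   a0,0x01010000 */
    a0 += 0x0101;              /* addi a0,a0,0x0101  */
    a0 <<= 16;                 /* slli a0,a0,16      */
    a0 += 0x0101;              /* addi a0,a0,0x0101  */
    a0 <<= 16;                 /* slli a0,a0,16      */
    a0 += 0x0101;              /* addi a0,a0,0x0101  */
    return a0;
}

/* Models the shorter sequence that builds the low half once and reuses it. */
uint64_t synth_replicate(void)
{
    uint64_t a5 = 0x01010000;  /* li   a5,0x01010000 */
    a5 += 0x0101;              /* addi a5,a5,0x0101  */
    uint64_t a0 = a5;          /* mv   a0,a5         */
    a5 <<= 32;                 /* slli a5,a5,32      */
    a0 = a5 + a0;              /* add  a0,a5,a0      */
    return a0;
}

/* Both models should produce the constant returned by f(). */
void check_sequences(void)
{
    assert(synth_shift_add() == 0x0101010101010101ULL);
    assert(synth_replicate() == 0x0101010101010101ULL);
}
```

Calling check_sequences() from a test driver confirms that both sequences materialize the same value; the second one does so with one fewer instruction, at the cost of a temporary register.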

I can spin a minimal patch; would that be acceptable for gcc 13.1 if it is testsuite clean?

It would seem to be a gcc-14 thing to me.

It seems like we probably should adjust the basic constant synthesis code to handle this class of cases so that the initial RTL is good rather than waiting on combine to fix it up. It looks like we need the destination register as well as a temporary and a 5-instruction sequence.
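
To make "this class of cases" concrete, one possible reading is 64-bit constants whose upper and lower 32-bit halves are identical. The sketch below is only an illustration of that idea in plain C; the helper names are hypothetical and are not GCC functions. It conservatively rejects halves with bit 31 set, on the assumption that building such a half with lui/addi on RV64 would sign-extend it and break the final add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true when both 32-bit halves of val match and
   the half has bit 31 clear (so a sign-extending load of the low half
   would not disturb the upper bits). */
bool is_replicated_32(uint64_t val)
{
    uint32_t lo = (uint32_t)val;
    uint32_t hi = (uint32_t)(val >> 32);
    return hi == lo && (lo & 0x80000000u) == 0;
}

/* Hypothetical model of the sequence: build the low half in a temporary,
   copy it to the destination, shift the temporary, then add. */
uint64_t rebuild_from_low_half(uint64_t val)
{
    assert(is_replicated_32(val));   /* precondition of this sketch */
    uint64_t tmp = (uint32_t)val;    /* li + addi into the temporary */
    uint64_t dest = tmp;             /* mv   dest,tmp     */
    tmp <<= 32;                      /* slli tmp,tmp,32   */
    return tmp + dest;               /* add  dest,tmp,dest */
}
```

The model uses two values, tmp and dest, matching the observation that the sequence needs a temporary in addition to the destination register.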

I'm aware of uarch plans that would handle this kind of sequence entirely in the front-end and pass off a single uop to the execution units. We'd planned to dig into constant synthesis in support of that effort. So I'm happy to help shepherd this improvement once gcc-14 development opens.

jeff
