https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89369

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 45739
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45739&action=edit
gcc9-pr89369.patch

Untested fix.
The recently added patterns want to do x |= (unsigned) ((reg:DI) >> cnt), and
as the zero_extract in the patterns shows, they always want to extract exactly 32
bits.  So, in my understanding of the r<noxa>sbg instruction, we want to rotate
by 32 + cnt (correct in the pattern) and want to use 32,63 as I3/I4, so we only
do the operation on the low 32 bits.
In the testcase, rxsbg %r1,%r11,40,63,56 is emitted when we need to do:
unsigned long long var = 0x50ef69fef3e09994ULL;
unsigned r1 = ...;
r1 |= (unsigned) (var >> 8);
but instead of the |= 0xfef3e099 we need, that instruction does |= 0xf3e099,
so the top 8 bits are lost.  The patch changes it to rxsbg
%r1,%r11,32,63,56 which works properly.  The patch also changes 3 other rxsbg
instructions on the testcase, like rxsbg %r12,%r9,64,63,32 which were emitted
for |= (unsigned) (var >> 32), note the strange 64 in there, I bet it is xoring
also the upper bits of the 64-bit destination, but we don't really care that
much about those.  That said, the patch changes that to rxsbg
%r12,%r9,32,63,32.
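
To make the arithmetic above easier to check, here is a rough C model of the
rotate-then-xor-selected-bits operation.  It is only a sketch, not taken from
the patch: the names rotl64, bit_range_mask and model_rxsbg are made up for
illustration, bits are numbered with bit 0 as the most significant bit, and
wrap-around ranges (I3 > I4) and the flag bits in I3/I4 are deliberately not
modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n is taken modulo 64). */
uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Mask covering bits i3..i4, where bit 0 is the MSB and bit 63 the LSB.
   Simplification (assumption): only plain ranges with i3 <= i4 <= 63. */
uint64_t bit_range_mask(unsigned i3, unsigned i4)
{
    assert(i3 <= i4 && i4 <= 63);
    uint64_t from_i3 = ~UINT64_C(0) >> i3;        /* bits i3..63 */
    uint64_t upto_i4 = ~UINT64_C(0) << (63 - i4); /* bits 0..i4  */
    return from_i3 & upto_i4;
}

/* Hypothetical model: rotate src left by rot, then xor the selected
   bit range into dst, leaving the other bits of dst unchanged. */
uint64_t model_rxsbg(uint64_t dst, uint64_t src,
                     unsigned i3, unsigned i4, unsigned rot)
{
    return dst ^ (rotl64(src, rot) & bit_range_mask(i3, i4));
}
```

With dst = 0 and src = 0x50ef69fef3e09994ULL, model_rxsbg (0, src, 40, 63, 56)
yields 0xf3e099, while model_rxsbg (0, src, 32, 63, 56) yields 0xfef3e099,
matching the values discussed above.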

Finally, I believe having a 256-byte static buffer for each of these instructions
(times how many times it is expanded for <noxa>) is not a good idea, because it
increases .bss unnecessarily.  Also, the %ld in there will not really work e.g.
on cross compilers from 32-bit hosts to s390x-linux.  GEN_INT for these small
constants does create rtxes, but they are all cached and likely already
constructed.
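
As a generic illustration of the %ld concern (plain C, not GCC internals, and
not what the patch does): a 64-bit value passed to a "%ld" conversion is only
correct on hosts where long is 64 bits wide.  The helper name format_i64 below
is hypothetical; the sketch uses the standard PRId64 macro from <inttypes.h>,
which expands to the right conversion on both 32-bit and 64-bit hosts.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit integer into buf (at most n bytes, NUL-terminated).
   PRId64 picks the conversion matching int64_t on this host, whereas a
   hard-coded "%ld" would be wrong wherever long is only 32 bits wide.
   Returns the number of characters snprintf wanted to write. */
int format_i64(char *buf, size_t n, int64_t value)
{
    return snprintf(buf, n, "%" PRId64, value);
}
```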
