https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #9 from Andrew Pinski ---
In GCC 5-8 we produced:
vpcmpeqd %ymm2, %ymm2, %ymm2
vpsllq $63, %ymm2, %ymm2
vandnpd %ymm1, %ymm2, %ymm1
vandpd %ymm2, %ymm0, %ymm0
vorpd %ymm1, %ymm0, %ymm
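
For readers less used to AT&T syntax, the following is a minimal scalar sketch of what one 64-bit lane of the sequence above appears to compute: build a sign-bit mask from all-ones shifted left by 63 (no load from memory), then merge the sign of one operand with the magnitude of the other. The helper names (`bits_of`, `double_of`, `sign_mask`, `merge_sign`) are hypothetical and not taken from the bug report, and the mapping of %ymm0 to the sign source and %ymm1 to the magnitude source is a reading of the snippet, not something the report states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a double as its 64-bit pattern. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Hypothetical helper: reinterpret a 64-bit pattern as a double. */
static double double_of(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* One lane of "vpcmpeqd x, x, x" followed by "vpsllq $63, x, x". */
static uint64_t sign_mask(void)
{
    uint64_t all_ones = ~UINT64_C(0); /* comparing a register with itself */
    return all_ones << 63;            /* only the top (sign) bit survives */
}

/* One lane of vandnpd / vandpd / vorpd, under the assumption above. */
static double merge_sign(double sign_src, double mag_src)
{
    uint64_t m   = sign_mask();
    uint64_t mag = ~m & bits_of(mag_src);  /* vandnpd: clear the sign bit */
    uint64_t sgn =  m & bits_of(sign_src); /* vandpd: keep only the sign  */
    return double_of(mag | sgn);           /* vorpd: combine the two      */
}
```

Calling `merge_sign(-1.0, 3.5)` in this sketch yields `-3.5`, the same result `copysign(3.5, -1.0)` would give. The point of the two-instruction mask is only that the constant is synthesized in registers rather than loaded from `.rodata`.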
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #8 from Maxim Egorushkin ---
Another example
https://stackoverflow.com/questions/61975526/gcc-optimization-better-at-o0-than-o3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
Martin Liška changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #7 from Marc Glisse ---
(In reply to Maxim Egorushkin from comment #3)
> It seems to me that register allocation has been a weak spot in gcc for
> years.
Most such testcases show issues with arguments/return in very small functions,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #6 from Jakub Jelinek ---
And as for the constant, it seems ICC also emits just a constant load from
memory instead of trying two instructions, and clang, while it uses a broadcast
to save .rodata, doesn't use two instructions either:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #5 from Jakub Jelinek ---
Wasn't the whole point of Segher's combiner changes not to propagate hard
registers into instructions to leave the RA more in control?
Propagating something in some other pass would undo that change.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #4 from H.J. Lu ---
Since fwprop.c has
static rtx
propagate_rtx (rtx x, machine_mode mode, rtx old_rtx, rtx new_rtx,
               bool speed)
{
  rtx tem;
  bool collapsed;
  int flags;
  if (REG_P (new_rtx) && REGNO (new_rtx) < F
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #3 from Maxim Egorushkin ---
It seems to me that register allocation has been a weak spot in gcc for years.
gcc often allocates registers in such a way that extra register moves are
necessary, compared to competition, like in this p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
Jakub Jelinek changed:
What|Removed |Added
CC||hjl.tools at gmail dot com,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91796
--- Comment #1 from Maxim Egorushkin ---
In addition, the code tries to generate avx_signbit using 2 instructions:
comparison vpcmpeqq and shift vpsllq, to avoid loading anything from memory.
However, the compiler replaces the code with loading a