[Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

rguenther at suse dot de via Gcc-bugs Thu, 27 Jan 2022 03:33:53 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178


--- Comment #20 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Jan 2022, hubicka at kam dot mff.cuni.cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178
> 
> --- Comment #16 from hubicka at kam dot mff.cuni.cz ---
> > 
> > Yep, we also have code like
> > 
> > -       movabsq $0x3ff03db8fde2ef4e, %r8
> > ...
> > -       vmovq   %r8, %xmm11
> 
> It is loading a random constant into xmm11.  Since reg<->xmm moves are
> relatively cheap, it looks OK to me that we generate this.  Is it faster
> to load the constant from memory?

I would say so.  It saves code size and also uop space, unless the two
can magically fuse into an immediate-to-%xmm move (I doubt that).
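
To make the trade-off concrete, here is a minimal C sketch of what the
movabsq/vmovq pair above materializes: the 64-bit immediate is simply the
bit pattern of a double.  The helper names are hypothetical and nothing
here is taken from the lbm sources; it assumes double is IEEE-754 binary64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit integer as a double.
   This mirrors moving an immediate into a GPR and then into an
   SSE register.  Assumes double is IEEE-754 binary64. */
double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Inverse of the above, useful for checking the round trip. */
uint64_t double_to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* The immediate from the quoted disassembly, viewed as a double. */
double quoted_constant(void)
{
    return bits_to_double(UINT64_C(0x3ff03db8fde2ef4e));
}
```

The alternative discussed here is to keep the same eight bytes in the
constant pool and load them straight into the SSE register with a single
instruction, instead of encoding them as an immediate and going through
a GPR.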

> >         movq    .LC11(%rip), %rax
> >         vmovq   %rax, %xmm14
> This is odd indeed, and even more odd that we use both movabs and a memory load...
> i386 FE plays some games with allowing some constants in SSE
> instructions (to allow simplification and combining) and splits them out
> to memory later.  It may be a consequence of this.

I've pasted the LRA dump pieces I think are relevant but I don't
understand them.  The constant load isn't visible originally but
is introduced by LRA so that may be the key to the mystery here.
