[Bug target/61915] [AArch64] High amounts of GP to FP register moves using LRA on AArch64

wdijkstr at arm dot com Fri, 24 Oct 2014 18:30:02 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915


--- Comment #16 from Wilco <wdijkstr at arm dot com> ---
(In reply to Andrew Pinski from comment #13)
> (In reply to Wilco from comment #9)
> > I committed a workaround
> > (http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00362.html) by increasing the
> > int<->fp move cost. Can you try this and check the issue has indeed gone?
> > You need -mcpu=cortex-a57.
> 
> Note when I submitted ThunderX support I used a base of 2 instead of a base
> of 1 due to 2 being the default and all values are relative to that.  This
> is mentioned in https://gcc.gnu.org/onlinedocs/gccint/Costs.html .  In fact
> a value of 2 means reload will not look at the constraints of a move
> instruction.
> 
> So I think the cortex* cpus should also re-base these values based on 2
> being gpr-to-gpr value.

You mean only use multiples of 2? That's interesting as I've not seen that done
elsewhere. Are these costs in any way related to real issue and latency cycles?
Most targets have complex tables with all the exact latencies for every little
uarch detail, but from what I've seen in the allocator these costs have almost
no meaning.
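
For concreteness, here is a minimal sketch of what re-basing a move-cost table on a GPR-to-GPR cost of 2 could look like. The struct, its field names, and the rebase_costs helper are hypothetical and written only for illustration; they are not the actual definitions in the AArch64 backend. The only fact taken from the discussion is that 2 is the default register move cost and other values are interpreted relative to it.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-CPU register move cost table.
   The field names are illustrative, not GCC's own. */
struct regmove_cost {
    int gp2gp;  /* general-purpose to general-purpose */
    int gp2fp;  /* general-purpose to floating-point  */
    int fp2gp;  /* floating-point to general-purpose  */
    int fp2fp;  /* floating-point to floating-point   */
};

/* The default register move cost that other costs are relative to. */
#define BASE_COST 2

/* Scale one cost so that it is expressed relative to BASE_COST,
   rounding up so a non-zero cost never becomes cheaper than intended. */
static int scale_cost(int cost, int old_base)
{
    return (cost * BASE_COST + old_base - 1) / old_base;
}

/* Re-base a table so that gp2gp equals BASE_COST while keeping the
   ratios between the entries. Returns 0 on success, or -1 if the
   input table has a non-positive gp2gp cost. */
static int rebase_costs(const struct regmove_cost *in,
                        struct regmove_cost *out)
{
    if (in->gp2gp <= 0)
        return -1;

    out->gp2gp = scale_cost(in->gp2gp, in->gp2gp);
    out->gp2fp = scale_cost(in->gp2fp, in->gp2gp);
    out->fp2gp = scale_cost(in->fp2gp, in->gp2gp);
    out->fp2fp = scale_cost(in->fp2fp, in->gp2gp);

    assert(out->gp2gp == BASE_COST);
    return 0;
}
```

With a made-up table based on 1, such as { 1, 5, 5, 2 }, this would give { 2, 10, 10, 4 }: the relative ordering of the moves is unchanged, but a plain GPR move now has the default cost of 2.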

So did you find that setting the FP move cost so low actually works alright on
ThunderX? I'd like to figure out a setting for the generic target that works
out well across all AArch64 implementations.
