https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94145

            Bug ID: 94145
           Summary: Longcalls mis-optimize loading the function address
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I'm working on a feature where we convert some/all built-in function calls to
use the longcall sequence.  I discovered that the compiler is mis-optimizing
the load of the function address.  This showed up in the SPEC 2017 wrf_r
benchmark, where I replaced some 60,000 direct calls with longcalls.

In particular, the PowerPC backend is not marking the load of the function
address as being volatile.  This allows the compiler to move the load out of a
loop.

However, with the current ELF semantics you don't want to do this, because the
function address changes.  On the first call to the function the address is
that of the PLT stub, but on subsequent calls it is the address of the function
itself, after the shared library is loaded.
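
To illustrate why hoisting the load matters, here is a minimal, portable C
sketch.  It is only a model and not what the PowerPC backend emits: the
variable slot stands in for the location that holds the call target, and
resolver_stub, real_function, reset_model, sum_with_hoisted_load and
sum_with_reload are hypothetical names made up for this example.  The stub
rewrites the slot on its first call, much as lazy binding would.

```c
#include <stddef.h>

typedef int (*fn_t)(int);

int resolver_stub(int x);

/* Hypothetical stand-in for the slot holding the function address. */
static fn_t slot = resolver_stub;

/* Counts how many calls went through the stub instead of the function. */
int stub_calls = 0;

int real_function(int x)
{
    return x + 1;
}

/* Models the PLT stub: resolve the target, then forward the call. */
int resolver_stub(int x)
{
    stub_calls++;
    slot = real_function;   /* later loads of the slot see the real function */
    return real_function(x);
}

/* Put the model back into the "not yet resolved" state. */
void reset_model(void)
{
    slot = resolver_stub;
    stub_calls = 0;
}

/* The address is loaded once, outside the loop, so every call uses the
   stale stub address.  */
int sum_with_hoisted_load(int n)
{
    fn_t f = slot;
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += f(i);
    return sum;
}

/* The address is loaded through a volatile pointer on every iteration, so
   only the first call goes through the stub.  */
int sum_with_reload(int n)
{
    fn_t volatile *p = &slot;
    int sum = 0;
    for (int i = 0; i < n; i++) {
        fn_t f = *p;
        sum += f(i);
    }
    return sum;
}
```

Both loops compute the same result, but in this model the hoisted version
goes through the stub on every iteration, while the reloading version does so
only once.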

In addition, because UNSPECs are used, the compiler is likely to spill the
function address to the stack and reload it.  Given that the UNSPEC is just a
load, it would be better not to have the compiler add the extra store and load.

While fixing the linker bug that this feature uncovered, Alan Modra came up
with a simple patch to fix this.
