https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94145

            Bug ID: 94145
           Summary: Longcalls mis-optimize loading the function address
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I'm working on a feature that converts some or all built-in function calls to use the longcall sequence. While doing so, I discovered that the compiler mis-optimizes the load of the function address. This showed up in the SPEC 2017 wrf_r benchmark, where I replaced some 60,000 direct calls with longcalls.

In particular, the PowerPC backend does not mark the load of the function address as volatile, which allows the compiler to move the load out of a loop. With the current ELF semantics you don't want this, because the function address changes: on the first call to the function, the address is that of the PLT stub, but on subsequent calls, after the shared library is loaded, it is the address of the function itself.

In addition, because UNSPECs are used, the compiler is likely to store the function address on the stack and reload it. Given that the UNSPEC is just a load, it would be better not to turn this into an extra store and reload.

In fixing the linker bug that this feature uncovered, Alan Modra has a simple patch to fix it.
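
The loop case described above can be illustrated with a reduced sketch. This is not the test case from the report: the names `scale` and `sum_scaled` are hypothetical, the `LONGCALL` macro is a helper introduced here so that the file also compiles on other targets, and `scale` is defined locally only so the sketch links (in the scenario described, the callee would live in a shared library and be reached through the PLT).

```c
/* Hypothetical reduced example; names are illustrative, not from the report. */
#include <stddef.h>

/* The longcall attribute is PowerPC-specific, so apply it only there. */
#if defined(__powerpc__) || defined(__powerpc64__)
#define LONGCALL __attribute__((longcall))
#else
#define LONGCALL /* no-op on other targets */
#endif

/* Assumption: stands in for a function that would normally be external. */
LONGCALL double scale(double x);

/* Every iteration calls scale() through the longcall sequence on PowerPC.
   The concern in the report is that the load of the function address may
   be moved out of a loop like this one. */
double sum_scaled(const double *v, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += scale(v[i]);
    return total;
}

/* Local definition so the sketch is self-contained. */
double scale(double x)
{
    return 2.0 * x;
}
```

Compiling something like this for a PowerPC target with optimization enabled and `-S`, then checking whether the address load stays inside the loop, is one way to look for the behavior described.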