> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT. This is two instructions, one to get the address
> of the global from the GOT and the other to get the value. If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0 <main>:
>    mov 0x165a(%rip),%eax   # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern; defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses the global via the GOT
> using two memory loads:
>
> 6f0 <main>:
>    mov 0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable. For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable. The result is that no global
> access needs to go through the GOT, which improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks at configure time whether the linker supports PIE with
> copy reloc, which is enabled in the gold and bfd linkers in binutils 2.25,
> and enables this optimization if the linker support is available.
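
As a rough illustration of the cost described in the quoted text, the sketch below models the two access patterns in plain C++. The names got_slot, load_direct and load_via_got are hypothetical, and got_slot is only a stand-in for a GOT entry; it is an assumption made for illustration, not how GCC or the linker actually build the GOT.

```cpp
// Model of the two access patterns; this is not real GOT machinery.
int a_glob = 42;

// Hypothetical stand-in for a GOT entry: a memory slot that holds the
// address of a_glob. volatile makes the compiler re-read the slot on
// every access, so the extra load cannot be folded away.
static int *volatile got_slot = &a_glob;

// Direct access: a single load from a_glob itself.
int load_direct() { return a_glob; }

// GOT-style access: first load the address from the slot, then load
// the value through that address.
int load_via_got() { return *got_slot; }
```

Built with -O2, load_direct would typically compile to a single PC-relative load, while load_via_got needs two dependent loads, which is the shape of the second disassembly listing in the quoted text.
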
>
> gcc/
>
> 	* configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> 	Linux/x86-64 linker supports PIE with copy reloc.
> 	* config.in: Regenerated.
> 	* configure: Likewise.
>
> 	* config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> 	pc-relative address for undefined, non-weak, non-function
> 	symbol reference in 64-bit PIE if linker supports PIE with
> 	copy reloc.
>
> 	* doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> 	* gcc.target/i386/pie-copyrelocs-1.c: New test.
> 	* gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> 	* gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> 	* gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> 	* lib/target-supports.exp (check_effective_target_pie_copyreloc):
> 	New procedure.

It caused pr64189.

Dominique