On Thu, Jan 7, 2021 at 6:07 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Wed, Jan 6, 2021 at 10:32 PM Fangrui Song <i...@maskray.me> wrote:
> >
> > On Sat, Dec 26, 2020 at 7:39 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Sat, Dec 26, 2020 at 7:32 AM Florian Weimer <f...@deneb.enyo.de> wrote:
> > > >
> > > > * Fangrui Song:
> > > >
> > > > > Hi, I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 which
> > > > > proposes -fdirect-access-external-data to address some x86-64
> > > > > GCC/binutils pain[1] and also benefit non-x86 architectures (also see
> > > > > [1] it can prevent copy relocations).
> > > > >
> > > > > [1] Mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112#c2
> > > > >
> > > > > Since I am going to add this option to Clang and I hope (once GCC
> > > > > decides to implement this option the two compilers can use the same
> > > > > option name), I bring it to your attention.
> > > >
> > > > One worry I have is that people start building shared objects with
> > > > direct data access, expecting the main program to be built with
> > > > indirect access. We already see this today with Qt. It's not really
> > > > supported well by the toolchain and causes frequent issues.
> > >
> > > It can be solved by ABI extension implemented in linker, ld.so and
> > > compiler.
> > >
> > > > Depending on the ELF ABI in question, the new pair of -f options might
> > > > not actually be meaningful. It really depends on whether you have
> > > > reasonably-sized displacements available. I think there are some ABIs
> > > > where the optimization is theoretically possible, but impractical
> > > > because the limit it imposes on data segment (think AArch64 without
> > > > adrp).
> > >
> > > --
> > > H.J.
> >
> > Please check out new comments on
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
> >
> > -fdirect-access-external-data is still the best name. The option is
> > useful to avoid copy relocations / "canonical PLT entry"
> > (st_shndx=0,st_value!=0) in -fno-pic code.
> > I will proceed with my Clang patch.
>
> If I understand it correctly, you want to treat all accesses to protected
> definitions as local access and all read/write accesses to undefined
> symbols should go through GOT. Branches to undefined symbols can use PLT.
> -fdirect-access-external-data doesn't reflect it.

My apologies. Direct/indirect access to protected definitions is a
separate topic, unrelated to -f[no-]direct-access-external-data.

(If anyone is interested, there was a heated discussion about accesses
to protected definitions:
https://sourceware.org/legacy-ml/binutils/2016-03/msg00312.html
Basically, a lot of folks considered copy relocations to be best-effort
support provided by the toolchain. For protected symbols, copy
relocations do not necessarily work. Clang always treats protected
similarly to hidden/internal, with no special logic for x86-64
protected.)

Branches to undefined symbols are yet another separate topic.

(On x86-64, there is no PIC vs non-PIC PLT distinction, and an
R_X86_64_PLT32 relocation is produced for both `call/jmp foo` and
`call/jmp foo@PLT` with newer (2018) GNU as/LLVM integrated assembler.
On i386, there are 2 types of PLTs, PIC and non-PIC. Currently the
informal convention is to use R_386_PC32 for non-PIC PLT and
R_386_PLT32 for PIC PLT, but R_386_PLT32 is arguably preferable for
-fno-pic code as well: it can avoid a "canonical PLT entry"
(st_shndx=0, st_value!=0) if the symbol turns out to be defined
externally. My idea is that we can always use R_386_PLT32 in -fno-pic
mode.)

Taking the address of an external function is related to
-f[no-]direct-access-external-data. A function pointer to an external
function is very similar to external data. A canonical PLT entry can be
caused by either a branch (R_386_PC32/R_386_32) or an address-taken
operation (R_386_PC32/R_386_32) if the symbol turns out to be external.
-fno-direct-access-external-data can only address the function pointer
usage, not the branch.
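
To make the three cases concrete, here is a minimal C sketch. The names
ext_var, ext_func, read_data, call_direct, take_address and
check_accessors are hypothetical and used only for illustration. The
definitions are placed in the same file so that the sketch builds by
itself; in the scenario discussed above they would instead come from a
shared object, and only then do the copy relocation / canonical PLT
entry questions arise.

```c
#include <assert.h>

/* Hypothetical external symbols, named for illustration only. */
extern int ext_var;
extern int ext_func(void);

/* Data access: the case -f[no-]direct-access-external-data is about.
   In -fno-pic code a direct access may need a copy relocation if
   ext_var turns out to be defined in a shared object. */
int read_data(void) { return ext_var; }

/* Branch: a separate topic. The call can go through a PLT and does
   not need the function's address to be materialized. */
int call_direct(void) { return ext_func(); }

/* Address taken: the function pointer behaves like external data.
   A direct reference may need a canonical PLT entry if ext_func turns
   out to be defined in a shared object. */
int (*take_address(void))(void) { return ext_func; }

/* Stand-in definitions so that this file links on its own.
   ASSUMPTION: in the real scenario these live in a shared object. */
int ext_var = 42;
int ext_func(void) { return 7; }

/* Sanity check: all three paths reach the same objects, and the
   address obtained through take_address() is the function's address. */
void check_accessors(void) {
    assert(read_data() == 42);
    assert(call_direct() == 7);
    assert(take_address() == ext_func);
    assert(take_address()() == 7);
}
```

To observe the difference, one could move the two definitions into a
shared object, compile the rest with -fno-pic, and compare the
relocations shown by `readelf -r` with and without
-fno-direct-access-external-data, assuming a compiler that accepts the
option.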