On Wed, Apr 22, 2015 at 9:34 AM, H.J. Lu <hongjiu...@intel.com> wrote:
> Normally, with PIE, GCC accesses globals that are extern to the module
> using GOT.  This is two instructions, one to get the address of the global
> from GOT and the other to get the value.  Example:
>
> ---
> extern int a_glob;
> int
> main ()
> {
>   return a_glob;
> }
> ---
>
> With PIE, the generated code accesses the global via the GOT using two
> memory loads:
>
>         movq    a_glob@GOTPCREL(%rip), %rax
>         movl    (%rax), %eax
>
> for 64-bit or
>
>         movl    a_glob@GOT(%ecx), %eax
>         movl    (%eax), %eax
>
> for 32-bit.
>
> Some experiments on Google and SPEC CPU benchmarks show that the extra
> instruction affects performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are
> truly extern (come from shared objects), the linker will create copy
> relocations and have them defined in the executable.  The result is that
> no global access needs to go through GOT, which improves performance.
> We can generate
>
>         movl    a_glob(%rip), %eax
>
> for 64-bit and
>
>         movl    a_glob@GOTOFF(%eax), %eax
>
> for 32-bit.  This optimization only applies to undefined non-weak
> non-TLS global data.  Undefined weak global or TLS data access still
> must go through GOT.
>
> This patch reverts legitimate_pic_address_disp_p change made in revision
> 218397, which only applies to x86-64.  Instead, this patch updates
> targetm.binds_local_p to indicate if undefined non-weak non-TLS global
> data is defined locally in PIE.  It also introduces a new target hook,
> binds_tls_local_p, to distinguish TLS variables from non-TLS variables.
> By default, binds_tls_local_p is the same as binds_local_p.
>
> This patch checks at configure time whether the 32-bit and 64-bit linkers
> support PIE with copy relocations.  The 64-bit linker supports this as of
> binutils 2.25 and the 32-bit linker as of binutils 2.26.  This optimization
> is enabled only if the linker support is available.
>
> Tested on Linux/x86-64 with -m32 and -m64, using linkers with and without
> support for copy relocation in PIE.  OK for trunk?
>
> Thanks.
>
> H.J.
> ---
> gcc/
>
>         PR target/65846
>         * configure.ac (HAVE_LD_PIE_COPYRELOC): Renamed to ...
>         (HAVE_LD_64BIT_PIE_COPYRELOC): This.
>         (HAVE_LD_32BIT_PIE_COPYRELOC): New.   Defined to 1 if Linux/ia32
>         linker supports PIE with copy reloc.
>         * output.h (default_binds_tls_local_p): New.
>         (default_binds_local_p_3): Add 2 bool arguments.
>         * target.def (binds_tls_local_p): New target hook.
>         * varasm.c (decl_default_tls_model): Replace targetm.binds_local_p
>         with targetm.binds_tls_local_p.
>         (default_binds_local_p_3): Add a bool argument to indicate TLS
>         variable and a bool argument to indicate if an undefined non-TLS
>         non-weak data is local.  Double check TLS variable.  If an
>         undefined non-TLS non-weak data is local, treat it as defined
>         locally.
>         (default_binds_local_p): Pass false and false to
>         default_binds_local_p_3.
>         (default_binds_local_p_2): Likewise.
>         (default_binds_local_p_1): Likewise.
>         (default_binds_tls_local_p): New.
>         * config.in: Regenerated.
>         * configure: Likewise.
>         * doc/tm.texi: Likewise.
>         * config/i386/i386.c (legitimate_pic_address_disp_p): Don't
>         check HAVE_LD_PIE_COPYRELOC here.
>         (ix86_binds_local): New.
>         (ix86_binds_tls_local_p): Likewise.
>         (ix86_binds_local_p): Use it.
>         (TARGET_BINDS_TLS_LOCAL_P): New.
>         * doc/tm.texi.in (TARGET_BINDS_TLS_LOCAL_P): New hook.
>
> gcc/testsuite/
>
>         PR target/65846
>         * gcc.target/i386/pie-copyrelocs-1.c: Updated for ia32.
>         * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>         * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>         * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>         * gcc.target/i386/pr32219-9.c: Likewise.
>         * gcc.target/i386/pr32219-10.c: New file.
>
>         * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>         Check HAVE_LD_64BIT_PIE_COPYRELOC and HAVE_LD_32BIT_PIE_COPYRELOC
>         instead of HAVE_LD_PIE_COPYRELOC.

Richard, Jeff,

Can you review this patch:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01331.html
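
As a reading aid for the quoted description, here is a minimal sketch of the
rule it states for undefined globals in PIE.  This is a simplified model, not
the code in the patch: the function name and its parameters are hypothetical,
and the real logic lives in default_binds_local_p_3 and the ix86 hooks.

```c
#include <stdbool.h>

/* Hypothetical model: may an UNDEFINED global be treated as defined
   locally in PIE, so that it is accessed without going through GOT?
   Per the description: only non-weak, non-TLS data qualifies, and only
   when the linker supports PIE with copy relocations.  */
bool
undefined_global_is_local_in_pie (bool is_data, bool is_weak, bool is_tls,
                                  bool linker_has_pie_copyreloc)
{
  /* Without linker support, every undefined global goes through GOT.  */
  if (!linker_has_pie_copyreloc)
    return false;

  /* Weak and TLS symbols, and anything that is not data, keep using GOT.  */
  return is_data && !is_weak && !is_tls;
}
```

Under this model, the a_glob case from the description returns true with a
capable linker, while the weak and TLS variants return false.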

Thanks.



-- 
H.J.
