Hi Uros,

   Could you please review this patch?

Thanks
Sri

On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam <tmsri...@google.com> wrote:
> Patch Updated.
>
> Sri
>
> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam <tmsri...@google.com> wrote:
>> Ping.
>>
>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam <tmsri...@google.com> wrote:
>>> Ping.
>>>
>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam <tmsri...@google.com> wrote:
>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>
>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the
>>>> module using the GOT.  This takes two instructions: one to load the address
>>>> of the global from the GOT and another to load the value.  If the global
>>>> turns out to be defined in the executable at link time, the access still
>>>> goes through the GOT, because by then it is too late to generate a direct
>>>> access.
>>>>
>>>> Examples:
>>>>
>>>> foo.cc
>>>> ------
>>>> int a_glob;
>>>> int main () {
>>>>   return a_glob; // defined in this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>>> PC-relative insn:
>>>>
>>>> 5e0   <main>:
>>>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>>
>>>> foo.cc
>>>> ------
>>>>
>>>> extern int a_glob;
>>>> int main () {
>>>>   return a_glob; // declared extern; defined outside this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>>>> using two memory loads:
>>>>
>>>> 6f0  <main>:
>>>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>>>    mov    (%rax),%eax
>>>>
>>>> This is true even if in the latter case the global was defined in the
>>>> executable through a different file.
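
The difference between the two sequences above can be modelled in plain C.
This is only an illustrative sketch: a_glob_storage and got_slot are
hypothetical stand-ins (a real GOT entry is filled in by the dynamic loader),
and an optimizing compiler may well fold the two loads in this toy version.

```c
/* Illustrative model only: a_glob_storage and got_slot are hypothetical
   names, not anything GCC or the linker actually emits. */
static int a_glob_storage = 42;
static int *got_slot = &a_glob_storage;   /* plays the role of the GOT entry */

/* Direct access: a single load, like  mov a_glob(%rip),%eax */
int read_direct(void)
{
    return a_glob_storage;
}

/* GOT access: load the address from the slot, then load the value. */
int read_via_got(void)
{
    int *addr = got_slot;   /* first load: address taken from the GOT slot */
    return *addr;           /* second load: the value itself */
}
```

Both functions return the same value; the second simply pays for one extra
dependent load, which is the cost described above.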
>>>>
>>>> Some experiments on Google benchmarks show that the extra memory loads
>>>> affect performance by 1% to 5%.
>>>>
>>>>
>>>> Solution - Copy Relocations:
>>>>
>>>> When the linker supports copy relocations, GCC can always assume that the
>>>> global will be defined in the executable.  For globals that are truly
>>>> extern (that is, they come from shared objects), the linker will create copy
>>>> relocations and have them defined in the executable.  The result is that no
>>>> global access needs to go through the GOT, which improves performance.
>>>>
>>>> This patch to the gold linker:
>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>> submitted recently allows gold to generate copy relocations for -pie mode
>>>> when necessary.
>>>>
>>>> I have added the option -mld-pie-copyrelocs, which does this when combined
>>>> with -fpie.  Note that the BFD linker does not support pie copyrelocs yet,
>>>> so this option cannot be used with it.
>>>>
>>>> Please review.
>>>>
>>>>
>>>> ChangeLog:
>>>>
>>>> * config/i386/i386.opt (mld-pie-copyrelocs): New option.
>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>  address is still legitimate in the presence of copy relocations
>>>>  and -fpie.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>
>>>>
>>>>
>>>> Patch attached.
>>>> Thanks
>>>> Sri
