On Tue, Oct 4, 2011 at 11:58 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Tue, Oct 4, 2011 at 11:51 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> On Tue, Oct 4, 2011 at 8:37 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>
>>>>>> OTOH, x86_64 and i686 targets can also benefit from this change. If
>>>>>> combine can't create a more complex address (covered by lea), then it
>>>>>> will simply propagate the memory operand back into the add insn. It looks
>>>>>> to me that we can't lose here, so:
>>>>>>
>>>>>>  /* Improve address combine.  */
>>>>>>  if (code == PLUS && MEM_P (src2))
>>>>>>    src2 = force_reg (mode, src2);
>>>>>>
>>>>>> Any opinions?
>>>>>>
>>>>>
>>>>> It doesn't work with 64bit libstdc++:
>>>>
>>>> Yeah, yeah. ix86_output_mi_thunk has some ...  issues.
>>>>
>>>> Please try attached patch that introduces ix86_emit_binop and uses it
>>>> in a bunch of places.
>>
>>> I tried it on GCC.  There are no regressions.  The bugs are fixed for x32.
>>> Here is a size comparison of the GCC runtime libraries on ia32, x32 and
>>> x86-64:
>>
>>>    text    data     bss     dec     hex filename
>>>  884093   18600   27064  929757   e2fdd old libstdc++.so
>>>  884189   18600   27064  929853   e303d new libs/libstdc++.so
>>>
>>> The new code is
>>>
>>> mov    0xc(%edi),%eax
>>> mov    %eax,0x8(%esi)
>>> mov    -0xc(%eax),%eax
>>> mov    0x10(%edi),%edx
>>> lea    0x8(%esi,%eax,1),%eax
>>>
>>> The old one is
>>>
>>> mov    0xc(%edi),%edx
>>> lea    0x8(%esi),%eax
>>> mov    %edx,0x8(%esi)
>>> add    -0xc(%edx),%eax
>>> mov    0x10(%edi),%edx
>>
>> The new code merged lea+add into one lea, so it looks quite OK to me.
>>
>> Do you have some performance numbers?
>>
>
> I will report performance numbers in a few days.

The differences in SPEC CPU 2006 on ia32, x86-64 and
x32 are within noise range.
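
For reference, here is a rough C model of the address arithmetic in the
two sequences quoted above. It only sketches why the merged lea computes
the same value as the old lea+add pair. The function names and the
"loaded" parameter (standing in for the value read from -0xc(...)) are
made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence (illustrative model, 32-bit wraparound arithmetic):
     lea    0x8(%esi),%eax
     add    -0xc(%edx),%eax
   "loaded" stands in for the memory operand -0xc(%edx).  */
static uint32_t addr_old(uint32_t esi, uint32_t loaded)
{
    uint32_t eax = esi + 0x8;   /* lea 0x8(%esi),%eax */
    eax += loaded;              /* add with memory operand */
    return eax;
}

/* New sequence (illustrative model):
     mov    -0xc(%eax),%eax
     lea    0x8(%esi,%eax,1),%eax
   The load goes into a register first, so base + index*1 + disp
   fits into a single lea.  */
static uint32_t addr_new(uint32_t esi, uint32_t loaded)
{
    uint32_t eax = loaded;      /* mov -0xc(%eax),%eax */
    eax = esi + eax * 1 + 0x8;  /* lea 0x8(%esi,%eax,1),%eax */
    return eax;
}
```

Both functions return esi + 8 + loaded modulo 2^32, so the result in
%eax is the same either way. The only difference is whether the load is
folded into the add or done separately so that lea can absorb the
addition.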



-- 
H.J.
