Hi Richard,

I assume that this fix does not affect on code size since such pattern
happens very rare although I can add a check on it if you insist.
Register pressure is not a issue here since I assume that additional
fill won't affect on performance as cmove with memory operand. I
decided to not change machine description of cmov (prohibit memory
operand0 since it is very risky but only did a simple change to not
produce such instruction during forward substitution (aka combine)
phase.

Best regards.
Yuri.


2012/12/12 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Dec 12, 2012 at 12:47 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Hi Uros,
>>
>> This fix is for all x86 platforms, we tested it on core2/corei7,
>> atom/atom2 and AMD and got performance improvement +6% -- +11%. So I
>> don' think we need to introduce additioanl tune feature.
>>
>> Sorry for my typo with gcc version - I ment mainline only since 4.7
>> does not use LRA.
>
> Btw, if it's a performance improvement all over the board why not adjust
> the patterns itself to not allow memory operands?  Thus, why restrict it
> to combine not creating this instruction?  (My -Os comment still stands)
>
> Richard.
>
>> Thanks.
>> Yuri.
>>
>>
>>
>> 2012/12/12 Uros Bizjak <ubiz...@gmail.com>:
>>> On Wed, Dec 12, 2012 at 12:27 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>> wrote:
>>>
>>>> This fix is aimed to remove performance degradation introduced by new
>>>> LRA phase that in fact is combining problem. Gcc combiner does
>>>> propagation of memory load to if-then-else gimple that was splitted
>>>> back by old reload phase. LRA does not perform such splitting. To
>>>> avoid performance slowdown on important benchmark (this is true for
>>>> all x86 targets) we decided to enhance 'ix86_legitimate_combined_insn'
>>>> with a check on such propagation and consider such conditional
>>>> instruction with memory operand as illegal one from performance point
>>>> of view.
>>>
>>> Is this true for all x86 targets? I have no objections to the
>>> implementation, but these fine-tunings should be declared in
>>> ix86_tune_features[] array, and used as conditions involving
>>> TARGET_xxx in the code. Please see many examples in the i386 source
>>> dir.
>>>
>>>> Is it OK for 4.8 and mainline?
>>>
>>> Hm, currently 4.8 _is_ mainline. Did you mean 4.7?
>>>
>>> Thanks,
>>> Uros.

Reply via email to