[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

vmakarov at gcc dot gnu.org Thu, 06 Jun 2013 08:20:26 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342


Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #7 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Yuri Rumyantsev from comment #2)
> The patched compiler produces better binaries, but we still have a 6%
> performance degradation on corei7. The main cause is that the LRA compiler
> generates a spill of the 'pure' byte 'g', whereas the old compiler generates
> a spill for 'm', which is the negation of 'g':
> 
> gcc without LRA (assembly at the head of the loop)
> 
> .L7:
>       movzbl  1(%edi), %edx
>       leal    3(%edi), %ebp
>       movzbl  (%edi), %ebx
>       movl    %ebp, %edi
>       notl    %edx   // perform negation in a register
>       movb    %dl, 3(%esp)
> 
> gcc with LRA
> 
> .L7:
>       movzbl  (%edi), %ebx
>       leal    3(%edi), %ebp
>       movzbl  1(%edi), %ecx
>       movl    %ebp, %edi
>       movzbl  -1(%ebp), %edx
>       notl    %ebx
>       notl    %ecx
>       movb    %dl, (%esp)
>       cmpb    %cl, %bl
>       notb    (%esp) // perform negation in memory
> 
> i.e. we have a redundant load and store from/to the stack.
> 
> I assume that this should be fixed also.

Fixing the problem with notl requires implementing new functionality in LRA:
making reloads that stay only if the reload pseudo got a hard register and was
inherited (in this case keeping them is profitable).  Otherwise the current
code should be generated (the reloads and reload pseudos should be removed and
the old code restored).  I've started work on this, but it will not be fixed
quickly, as implementing the new functionality is not a trivial task.
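
For illustration only, here is a rough C model of the two spill sequences
quoted above.  It is not the test case from the bug and not LRA code; the
helper names (spill_negated, spill_then_negate, verify_all_bytes) are
hypothetical, and 'g' and 'm' follow the naming in comment #2.  The volatile
slot stands in for the stack location.  Both variants leave the same byte in
the slot; they differ only in how many times memory is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the non-LRA sequence:
   notl %edx; movb %dl, 3(%esp)
   The negation happens in a register, then one store spills 'm'. */
static uint8_t spill_negated(uint8_t g, volatile uint8_t *slot)
{
    uint8_t m = (uint8_t)~g;   /* negate in a register */
    *slot = m;                 /* single store to the stack slot */
    return *slot;
}

/* Hypothetical model of the LRA sequence:
   movb %dl, (%esp); notb (%esp)
   The raw byte 'g' is spilled, then negated in memory,
   which costs an extra load and store on the slot. */
static uint8_t spill_then_negate(uint8_t g, volatile uint8_t *slot)
{
    *slot = g;                 /* store the unmodified byte */
    *slot = (uint8_t)~*slot;   /* read-modify-write in memory */
    return *slot;
}

/* Check that both strategies agree for every byte value.
   Returns the number of values checked. */
static int verify_all_bytes(void)
{
    volatile uint8_t a = 0, b = 0;
    int checked = 0;
    for (int g = 0; g < 256; g++) {
        assert(spill_negated((uint8_t)g, &a) ==
               spill_then_negate((uint8_t)g, &b));
        checked++;
    }
    return checked;
}
```

Since the results are identical, the choice between the two is purely a
matter of cost, which is why the reload should be kept only when it is
profitable.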

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA
