http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Vladimir Makarov <vmakarov at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at redhat dot com

--- Comment #6 from Vladimir Makarov <vmakarov at redhat dot com> 2011-12-09 
19:09:52 UTC ---
There is a small difference in the code which results in such degradation.

-O1 generates an insn in the major loop

(insn 43 42 44 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 77 [ __tab_index ])
                (xor:SI (reg:SI 108)
                    (reg:SI 120)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 108)
        (expr_list:REG_DEAD (reg:SI 120)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))

-O2 generates an analogous insn

(insn 39 38 40 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 83 [ __tab_index ])
                (xor:SI (reg/v:SI 83 [ __tab_index ])
                    (reg:SI 143)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 143)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

The reason for the difference is the regmove optimization.

The RTL insn in the second variant looks even better, but it makes
pseudo 83 the most frequently used pseudo.  It is therefore pushed
last onto the coloring stack, among a bunch of trivially colorable
pseudos, and so it is assigned first.  The set of trivially colorable
pseudos contains two double-word pseudos, each of which needs two
adjacent hard registers.  Assigning pseudo 83 first (the case is
further complicated because some pseudos cross calls) leaves only one
pair of adjacent hard registers.  Two hard registers are still free
for the second double-word pseudo, but they are not adjacent.  As a
result, one double-word pseudo is spilled and code performance
degrades.

For -O1, the analog of pseudo 83 (p77) is assigned last, after the two
double-word pseudos, and spilling does not occur.
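
The effect of the two assignment orders described above can be
reproduced in a toy model.  The sketch below is only an illustration,
not IRA code: the bitmask type, the first-fit policy, and the function
names are assumptions made for this example, and the free register set
(registers 0-3 and 5) is invented so that the fragmentation shows up.

```c
#include <assert.h>

/* Toy model (an assumption of this sketch, not GCC's representation):
   bit i of the mask is set when hard register i is free.  */
typedef unsigned int hard_reg_set;

/* Find the lowest register R such that R .. R+WIDTH-1 are all free,
   mark them as used and return R.  Return -1 (meaning "spill") when
   no such run of adjacent free registers exists.  */
static int
assign_first_fit (hard_reg_set *free_regs, int width)
{
  assert (width >= 1 && width <= 2);
  hard_reg_set need = (1u << width) - 1;

  for (int r = 0; r + width <= 32; r++)
    if (((*free_regs >> r) & need) == need)
      {
        *free_regs &= ~(need << r);
        return r;
      }
  return -1;
}

/* Assign N pseudos with the given WIDTHS in array order and return
   how many of them could not get hard registers.  */
static int
count_spills (hard_reg_set free_regs, const int *widths, int n)
{
  int spills = 0;

  for (int i = 0; i < n; i++)
    if (assign_first_fit (&free_regs, widths[i]) < 0)
      spills++;
  return spills;
}
```

With the free set 0x2F, calling count_spills with the widths {1, 2, 2}
(single-word pseudo first) leaves registers 3 and 5 for the second
double-word pseudo, which are free but not adjacent, so one pseudo is
spilled.  With the widths {2, 2, 1} nothing is spilled.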

To solve the problem we should increase the probability of keeping
free hard registers adjacent.  This can be done by modifying the
function bucket_allocno_compare_func so that multi-word pseudos are
pushed last onto the coloring stack and, as a consequence, assigned
first.  I did that and the problem was solved.  Unfortunately, it
results in a 2% performance degradation of SPEC2000 perlbmk, although
there is a small code size improvement on SPEC2000 with this
heuristic.
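
The ordering idea can be sketched as a qsort comparator.  This is a
hedged illustration only and not the body of
bucket_allocno_compare_func: struct toy_pseudo, its fields, and both
function names are hypothetical, and real allocnos carry much more
information.  The sketch assumes that pseudos are pushed in sorted
order, so the element sorted last is popped and assigned first.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an allocno.  */
struct toy_pseudo
{
  int regno;   /* pseudo register number */
  int nwords;  /* number of adjacent hard registers needed */
  int freq;    /* usage frequency */
};

/* Comparator giving the push order: pseudos needing fewer words come
   first, so multi-word pseudos are pushed last.  Within the same
   width, less frequently used pseudos are pushed earlier.  The pseudo
   number is a final tie-breaker to keep the order deterministic.  */
static int
toy_push_order_cmp (const void *a, const void *b)
{
  const struct toy_pseudo *p = a, *q = b;

  if (p->nwords != q->nwords)
    return p->nwords < q->nwords ? -1 : 1;
  if (p->freq != q->freq)
    return p->freq < q->freq ? -1 : 1;
  return (p->regno > q->regno) - (p->regno < q->regno);
}

/* Sort V[0..N-1] into push order; V[N-1] is then assigned first.  */
static void
toy_sort_for_push (struct toy_pseudo *v, size_t n)
{
  assert (v != NULL || n == 0);
  qsort (v, n, sizeof *v, toy_push_order_cmp);
}
```

In this model a frequently used single-word pseudo such as pseudo 83
sorts before any double-word pseudo regardless of its frequency, so
the double-word pseudos get their adjacent pairs before it is
assigned.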

On a general note, register allocation is all about heuristics.  So
it is always possible to find a test where one heuristic works worse
than others.  The most important thing is that RA works well overall
(on a big credible set of tests).  From this point of view IRA is much
better than the previous register allocator.

But because crc code is important, I'll continue to work on tuning
which does not degrade SPEC2000 and which does solve the problem.
