Hi, I neglected to write about RA changes for the previous releases, and people have asked me to describe the RA changes for GCC-5. Below is what I'd like to add to gcc-5/changes.html. I plan to commit it tomorrow, so any comments are appreciated.

  Thanks.
Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.91
diff -U 5 -r1.91 changes.html
--- changes.html	23 Mar 2015 10:12:23 -0000	1.91
+++ changes.html	26 Mar 2015 19:24:32 -0000
@@ -95,10 +95,40 @@
       <li>The new <code>gcov-tool</code> utility allows manipulating
           profiles.</li>
       <li>Profiles are now more tolerant to source file changes (this can be
 	  controlled by <code>--param profile-func-internal-id</code>).</li>
     </ul></li>
+    <li>Register allocation improvements:
+    <ul>
+      <li>A new local register allocator (LRA) sub-pass was added.
+          The sub-pass implements control-flow sensitive global
+          register rematerialization (controlled via
+	  <code>-flra-remat</code>).  Instead of spilling and
+          restoring a register value, the value is recalculated when
+          that is profitable.  The sub-pass improved SPEC2000 generated
+          code by 1% on ARM and 0.5% on x86-64.</li>
+      <li>In GCC-4.9 and earlier releases, the PIC hard register was
+          fixed and was not used for other purposes when PIC code was
+          generated.  Reuse of the PIC hard register was implemented in
+          RA for GCC-5.0.  It improves generated PIC code performance
+          as more hard registers can be used.  For example, shared
+          libraries and Android would benefit significantly from
+          this optimization.  Currently it is switched on only for
+          x86/x86-64 targets.  As the RA infrastructure for PIC
+          register reuse is already implemented, other targets might
+          follow in the future.</li>
+      <li>A simple form of inter-procedural RA was implemented.  When
+          it is known that a called function does not use caller-saved
+          registers, save/restore code is not generated around the
+          call for such registers.  This optimization can be controlled
+          by <code>-fipa-ra</code>.</li>
+      <li>On some architectures (e.g. modern Intel processors),
+          spilling general registers into vector registers can be more
+          profitable than spilling into memory.  LRA already had such
+          an optimization.  It was significantly improved for GCC-5.0,
+          permitting 85% more such spills than in GCC-4.9.</li>
+    </ul></li>
     <li>UndefinedBehaviorSanitizer gained a few new sanitization options:
     <ul>
       <li><code>-fsanitize=float-divide-by-zero</code>: detect floating-point
 	   division by zero;</li>
       <li><code>-fsanitize=float-cast-overflow</code>: check that the result
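
To illustrate the <code>-fipa-ra</code> item for reviewers, here is a minimal sketch. It is not part of the patch, and the function names (scale, combine) are made up for illustration. The assumption is that the callee is defined in the same translation unit and is not inlined, so RA can see which hard registers it clobbers. Whether the save/restore code around the call actually disappears depends on the target and on which registers the callee ends up using.

```c
/* Hypothetical example; the function names are illustrative only. */

/* Small leaf function defined in the same translation unit, so the
   compiler can determine which hard registers it really clobbers.
   noinline keeps the call visible instead of inlining it away.  */
static int __attribute__((noinline)) scale(int x)
{
    return x * 3 + 1;
}

/* a and b stay live across the call to scale().  Without
   inter-procedural information, values kept in caller-saved
   registers would have to be saved and restored around the call
   (or placed in callee-saved registers instead).  */
int combine(int a, int b, int x)
{
    int s = scale(x);
    return a + b + s;
}
```

One way to look at the effect is to compile the file to assembly twice, once with <code>-O2 -fipa-ra</code> and once with <code>-O2 -fno-ipa-ra</code>, and compare the code around the call in combine.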
