Nikhil Benesch noticed that changes in the GCC backend were making
deferred functions that call recover less efficient.  A defer
thunk is a generated function that looks like this (this is the entire
function body):

    if !runtime.setdeferretaddr(&L) {
        deferredFunction()
    }
L:

The idea is that the address of the label passed to setdeferretaddr is
the address to which deferredFunction returns.  The code in canrecover
compares the return address of the function that calls recover to this
saved address to see whether recover can return non-nil.  This is
explained in marginally more detail at
https://www.airs.com/blog/archives/376 .

When the return address does not match, the canrecover code does a
more costly check that requires unwinding the stack.  What Nikhil
Benesch noticed is that we were always taking that fallback.
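
As a rough illustration of the check described above, here is a small
model in C++.  It is only a sketch: DeferRecord, Path, Result, and
can_recover are made-up names for this example, not the identifiers
used in libgo, and the costly stack unwind is reduced to a boolean
argument.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-defer state; not libgo's type.
struct DeferRecord {
  std::uintptr_t saved_retaddr;  // label address stored by the thunk
};

enum class Path { Fast, Slow };

struct Result {
  bool can_recover;  // whether recover may return non-nil
  Path path;         // which check produced the answer
};

// caller_retaddr: return address of the function that called recover.
// unwind_says_ok: stand-in for the result of the costly stack unwind.
Result can_recover(const DeferRecord& d, std::uintptr_t caller_retaddr,
                   bool unwind_says_ok) {
  // Fast path: the return address matches the saved label address.
  if (d.saved_retaddr != 0 && caller_retaddr == d.saved_retaddr)
    return {true, Path::Fast};
  // Fallback: the addresses differ, so rely on the unwind-based check.
  return {unwind_says_ok, Path::Slow};
}
```

In this model, a label address that never equals the real return
address sends every call down the Slow path, which is the symptom
described above.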

It turned out that the label address passed to setdeferretaddr was not
the address to which the deferred function would return.  The bb-reorder
pass was duplicating the epilogue: the label was moved to one copy of
the epilogue while the deferred function returned to the other copy.

Of course there is no reason to duplicate the epilogue in such a small
function.  One easy way to disable that epilogue duplication is to
compile the function with -Os.  That is what this patch does.  This
patch compiles all thunks, not just defer thunks, with -Os, but since
they are all small that does no harm.
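
The patch below recognizes a thunk by its name: the name must contain
"..thunk" followed only by decimal digits up to the end of the string.
The same test can be sketched as a standalone function.  The helper
name is_thunk_name is hypothetical and does not appear in the patch,
which performs the check inline.

```cpp
#include <cstddef>
#include <string>

// Returns true if name contains "..thunk" followed only by decimal
// digits up to the end of the string.  Zero digits also match, as in
// the loop used by the patch.
bool is_thunk_name(const std::string& name) {
  std::size_t pos = name.find("..thunk");
  if (pos == std::string::npos)
    return false;
  // Skip the 7 characters of "..thunk" and scan the remainder.
  for (pos += 7; pos < name.length(); ++pos) {
    if (name[pos] < '0' || name[pos] > '9')
      return false;  // a non-digit after "..thunk" means no match
  }
  return true;
}
```

With made-up names, is_thunk_name("main..thunk0") is true, while
is_thunk_name("main..thunk0x") and is_thunk_name("main.thunk0") are
false.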

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

2019-02-13  Ian Lance Taylor  <i...@golang.org>

	* go-gcc.cc: #include "opts.h".
	(Gcc_backend::function): Compile thunks with -Os.

Index: go-gcc.cc
===================================================================
--- go-gcc.cc   (revision 268860)
+++ go-gcc.cc   (working copy)
@@ -25,6 +25,7 @@
 #include <gmp.h>
 
 #include "tree.h"
+#include "opts.h"
 #include "fold-const.h"
 #include "stringpool.h"
 #include "stor-layout.h"
@@ -3103,6 +3104,41 @@ Gcc_backend::function(Btype* fntype, con
       DECL_DECLARED_INLINE_P(decl) = 1;
     }
 
+  // Optimize thunk functions for size.  A thunk created for a defer
+  // statement that may call recover looks like:
+  //     if runtime.setdeferretaddr(L1) {
+  //         goto L1
+  //     }
+  //     realfn()
+  // L1:
+  // The idea is that L1 should be the address to which realfn
+  // returns.  This only works if this little function is not over
+  // optimized.  At some point GCC started duplicating the epilogue in
+  // the basic-block reordering pass, breaking this assumption.
+  // Optimizing the function for size avoids duplicating the epilogue.
+  // This optimization shouldn't matter for any thunk since all thunks
+  // are small.
+  size_t pos = name.find("..thunk");
+  if (pos != std::string::npos)
+    {
+      for (pos += 7; pos < name.length(); ++pos)
+        {
+          if (name[pos] < '0' || name[pos] > '9')
+            break;
+        }
+      if (pos == name.length())
+        {
+          struct cl_optimization cur_opts;
+          cl_optimization_save(&cur_opts, &global_options);
+          global_options.x_optimize_size = 1;
+          global_options.x_optimize_fast = 0;
+          global_options.x_optimize_debug = 0;
+          DECL_FUNCTION_SPECIFIC_OPTIMIZATION(decl) =
+            build_optimization_node(&global_options);
+          cl_optimization_restore(&global_options, &cur_opts);
+        }
+    }
+
   go_preserve_from_gc(decl);
   return new Bfunction(decl);
 }
