Hi there,

I found the C++11 code below:
  int Fib(int n) {
      if (n <= 1) return n;
      return [&] { return Fib(n-2) + Fib(n-1); }();
  }

is ~2x faster than normal one:
  int Fib(int n) {
      if (n <= 1) return n;
      return Fib(n-2) + Fib(n-1);
  }

I tested them with "-std=c++11 -O3/-O2" using trunk and the first
version is ~2x (1.618x in theory?) faster. However, the first version
has larger binary size (101k compared to 3k, which is for the second
version). Clang produces 4k for the first version (with similar speed
improvement) though.

My guess is that the first `if (n <= 1) return n;` is easier to inline
into the caller side, since the returned expression is a call to a
separated function. It's translated to something like (ignoring
linkage difference):
  int foo(int n);
  int Fib(int n) {
      if (n <= 1) {
          return n;
      }
      return foo(n);
  }
  int foo(int n) {
      return Fib(n-2) + Fib(n-1);
  };

After inline optimizations, it's translated to:
  int foo(int n);
  int Fib(int n) {
      if (n <= 1) {
          return n;
      }
      return foo(n);
  }
  int foo(int n) {
      return (n-2<=1) ? n-2 : foo(n-2) + (n-1<=1) ? n-1 : foo(n-1);
  };

As a result, the maximum depth of the stack reduces by 1, since all
boundary checkings (if (n <= 1) return n;) are done by the caller
side, which may eliminate unnecessary function call overhead.

To me the optimization should be: For a given recursive function A,
split it into function B and C, so that A is equivalent to { B();
return C(); }, where B should be easy to inline (e.g. no recursive
calls) and C may not.

Is it possible/reasonable to do such an optimization? I hope it can help. :)

Thanks!


-- 
Regards,
Tim Shen

Reply via email to