https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88797
--- Comment #3 from Cassio Neri <cassio.neri at gmail dot com> ---
The attached file is a running example showing that performance suffers. The
code runs faster when test_f calls g instead of f, where g is

    bool g(unsigned x, unsigned y) {
        if (x >= y) return false;
        return f(x, y);
    }

even in the case where x < y and g does call f.

Depending on #defines, the example runs either f, g or both. These are the
timings:

    $ g++ -O3 -o gcc_issue gcc_issue.cpp -D RUN_SIMPLE && time ./gcc_issue
    Running simple function...

    real 0m3.646s
    user 0m3.645s
    sys  0m0.000s

    $ g++ -O3 -o gcc_issue gcc_issue.cpp -D RUN_COMPLEX && time ./gcc_issue
    Running complex function...

    real 0m1.165s
    user 0m1.161s
    sys  0m0.003s

    $ g++ -O3 -o gcc_issue gcc_issue.cpp -D RUN_BOTH && time ./gcc_issue
    Running simple function...
    Running complex function...

    real 0m3.059s
    user 0m3.051s
    sys  0m0.007s

Notice that running both is faster than running f only! This is because the
compiler then gives up on inlining and calls the (good) code generated for f
in isolation.
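
For readers without the attachment, here is a rough sketch of the shape of the comparison described above. It is not the contents of gcc_issue.cpp: the body of f is an arbitrary stand-in, test_g and the hit-counting return values are hypothetical names and choices made for illustration, and only g follows the definition given in the comment. Whether this particular sketch reproduces the slowdown will depend on the compiler version and target.

```cpp
#include <cassert>

// Stand-in for f (assumption): any small branch-free predicate would do here.
inline bool f(unsigned x, unsigned y) {
    return x < 1111u + (y <= 2222u);
}

// g as described in the comment: an early exit, then a call to f.
inline bool g(unsigned x, unsigned y) {
    if (x >= y) return false;
    return f(x, y);
}

// Hypothetical driver: calls f in a loop and counts how often it returns true.
unsigned long test_f(unsigned x, unsigned y, unsigned iterations) {
    unsigned long hits = 0;
    for (unsigned i = 0; i < iterations; ++i)
        hits += f(x++, y--);
    return hits;
}

// Hypothetical driver: same loop, but going through g instead of f.
unsigned long test_g(unsigned x, unsigned y, unsigned iterations) {
    unsigned long hits = 0;
    for (unsigned i = 0; i < iterations; ++i)
        hits += g(x++, y--);
    return hits;
}

// Sanity check: whenever x < y, g must agree with f.
void check_equivalence() {
    for (unsigned x = 0; x < 3000u; x += 7)
        for (unsigned y = x + 1; y < 3000u; y += 11)
            assert(g(x, y) == f(x, y));
}
```

Timing test_f against test_g on inputs with x < y throughout (so that g always reaches f) would be the analogue of the RUN_SIMPLE and RUN_COMPLEX runs above.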