On Saturday, 21 January 2017 at 12:33:57 UTC, albert-j wrote:
Now I dmd -profile it and look at the performance of funcA with d-profile-viewer. Inside funcA, only 20% of time is spend in funcB, but the rest 80% is self-time of funcA. How is it possible, when funcB has three times the calculations of funcA? It appears that the call to funcB itself is very expensive.
I'm not sure if it's what happening in this case but, in code as simple as this, function calls can sometimes be the bottleneck. You should see how compiling with/without -O affects performance, and adding `pragma(inline)` to funcB.