http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #31 from oleg at smolsky dot net 2012-03-02 08:21:41 UTC ---
I don't think there is a need to actually check the result in this benchmarkable fragment, so that will reduce the code a little. The only thing I was struggling with was fooling/forcing the compiler into not discarding the intermediate result, so that it actually performs every calculation and iteration :)

Let me try to digest this further. I'll also get you a result from our production compiler (v4.1, which emits the fastest code).
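
For reference, here is a minimal sketch of one common way to keep the compiler from discarding the work, without checking the result. It is not the code from the attached test case: `g_sink`, `sum_bytes` and `run_benchmark` are hypothetical names, the kernel is only a stand-in loop, and the empty `asm volatile` statement is a GCC/Clang extension, so this assumes one of those compilers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sink. A store to a volatile object is an observable side
// effect, so the value written to it has to be computed.
static volatile std::int32_t g_sink;

// Hypothetical stand-in for the benchmarked kernel.
std::int32_t sum_bytes(const std::int8_t* data, std::size_t n) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i)
        acc += data[i];
    return acc;
}

// Runs the kernel `iterations` times and publishes every result.
std::int32_t run_benchmark(const std::int8_t* data, std::size_t n,
                           int iterations) {
    std::int32_t last = 0;
    for (int it = 0; it < iterations; ++it) {
        // Empty asm with a "memory" clobber (GCC/Clang extension): the
        // compiler has to assume the buffer may have changed, which is
        // meant to stop it from hoisting the kernel out of this loop.
        asm volatile("" : : "r"(data) : "memory");
        last = sum_bytes(data, n);
        g_sink = last;  // keeps this iteration's result live
    }
    return last;
}
```

The volatile store alone only guarantees that some result is written on each iteration; the clobber is what is intended to force the recomputation. Whether a given GCC version still simplifies the loop should be confirmed by inspecting the generated assembly.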