http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55334
--- Comment #33 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I can confirm that one call of resid now gets inlined on the branch even on x86_64. (I'm confused as to why: the dump seems to suggest that all call sites would violate the max-inline-insns-auto param limit, but one gets inlined anyway.) We are 5.6% slower than if we also specify --param inline-min-speedup=17 (in addition to -Ofast).

This is not a regression from 4.8.0. When I checked out the revision with that tag, I got exactly the same inlining and pretty much the same run time.

As far as the cause of the slowdown is concerned, my simple greps suggest that vectorization happens anyway. However, as I wrote in comment 24, if we lose restrict we also lose the opportunity to hoist a loop-invariant load, so a lost restrict is still likely to be the issue.
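
To illustrate the hoisting point, here is a minimal C sketch. It is a hypothetical kernel, not code from the benchmark discussed here, and the function names are made up for illustration. In the first function the restrict qualifiers promise that *factor cannot be modified through out, so a compiler may load it once before the loop. In the second function that promise is missing, so unless the compiler can prove non-aliasing some other way (or versions the loop with a run-time check), it has to assume each store to out[i] may change *factor and reload it on every iteration.

```c
#include <stddef.h>

/* Hypothetical example: with restrict, the load of *factor is loop
   invariant and may be hoisted out of the loop by the compiler. */
void scale_restrict(double *restrict out, const double *restrict in,
                    const double *restrict factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * *factor;
}

/* Same loop without restrict: a store to out[i] might alias *factor,
   so the value generally has to be reloaded after each store. */
void scale_may_alias(double *out, const double *in,
                     const double *factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * *factor;
}
```

Calling scale_may_alias with factor pointing into out is legal and changes the result from one iteration to the next, which is exactly why the load cannot be hoisted there. Comparing the assembly of the two functions (for example with -O2 -S) is one way to check whether the load stays inside the loop.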