https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65483
--- Comment #1 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Benchmarking a build with -O3 -flto -Ofast -funroll-loops.

For mainline I get (running on input.graphic):
real    0m35.673s
user    0m35.556s
sys     0m0.133s

Setting early-inlining-insns=80, to get bsR/bsW inlined before we get to LTO:
real    0m31.975s
user    0m31.867s
sys     0m0.124s

With -fno-ipa-cp:
real    0m34.232s
user    0m34.132s
sys     0m0.117s

For GCC 4.9 I get:
real    0m32.719s
user    0m32.615s
sys     0m0.124s

Oddly enough, GCC 4.9 does not inline bsR/bsW either.
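To put the four timings on a common scale, here is a small sketch that computes each configuration's wall-clock ("real") time relative to the mainline run. The configuration labels and the helper name relative_change are illustrative only, not anything from GCC. Each figure comes from a single run, so run-to-run noise is not accounted for.

```python
# Wall-clock ("real") times in seconds, copied from the runs reported above.
# The dictionary keys are illustrative labels, not GCC option names.
REAL_SECONDS = {
    "mainline": 35.673,
    "mainline, early-inlining-insns=80": 31.975,
    "mainline, -fno-ipa-cp": 34.232,
    "GCC 4.9": 32.719,
}


def relative_change(baseline: float, measured: float) -> float:
    """Return the fractional change of `measured` relative to `baseline`.

    A negative value means `measured` is faster than `baseline`.
    """
    return (measured - baseline) / baseline


baseline = REAL_SECONDS["mainline"]
for label, seconds in REAL_SECONDS.items():
    change = relative_change(baseline, seconds)
    print(f"{label:36s} {seconds:7.3f}s  {change:+.1%}")
```

On these numbers, early-inlining-insns=80 is roughly 10% faster than plain mainline, -fno-ipa-cp roughly 4% faster, and GCC 4.9 roughly 8% faster.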