http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58358
--- Comment #22 from Chris Jefferson <chris at bubblescope dot net> ---
Here are some comparisons. Just to compare, I also checked doing away with
the skipping optimisations altogether.

Binary sizes (-O3, stripped)

current head: 11928
my code:      12248
Mitsuru:      11384
No skip:      10904

Timing:

current head:  3.70
my code:       3.70
Mitsuru:       4.04
No skip:      15.37

So we clearly want to do skipping. The tradeoff is between the same speed with
a bigger executable (my code), or ~10% slower while saving 1K or so of binary
and some source (Mitsuru's code). I don't know what gcc/libstdc++'s general
direction in that area is.

I actually would have expected Mitsuru's code to be faster (as it tries harder
to skip forwards), but it is hard to predict how these things interact with
optimisers/caches/branch predictors at a low level.
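
For readers without the patches to hand, here is a rough sketch of the kind of skipping being measured, assuming the search is for a run of `count` consecutive elements equal to `value` in a random-access range. The name `find_run_skipping` and the index-based interface are made up for illustration; this is neither the current head, my code, nor Mitsuru's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not taken from libstdc++.
// Returns the index of the first run of `count` consecutive elements equal
// to `value`, or v.size() if there is no such run.
template <typename T>
std::size_t find_run_skipping(const std::vector<T>& v, std::size_t count,
                              const T& value)
{
    const std::size_t n = v.size();
    if (count == 0)
        return 0;  // an empty run matches at the start

    std::size_t start = 0;  // candidate window is [start, start + count)
    while (n - start >= count) {
        // Check the window from its last element backwards.
        std::size_t i = start + count;
        while (i > start && v[i - 1] == value)
            --i;
        if (i == start)
            return start;  // every element of the window matched

        // v[i - 1] is a mismatch, so no run can contain index i - 1.
        // Skip straight past it instead of advancing one element at a time.
        start = i;
    }
    return n;
}
```

When the last element of a window mismatches, the whole window is skipped after a single comparison, which is where the gap to the "No skip" variant comes from. How hard to try to skip further is the design choice on which the variants compared above differ.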