https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88271
--- Comment #3 from Daniel Fruzynski <bugzi...@poradnik-webmastera.com> ---
What about adding a new pass at the end? It would look for various possible optimizations that were missed earlier because they span basic blocks.

In my case this example code is part of a tight loop. From previous experience with it, I expect that this optimization could improve speed by something like 0.5%-1%. If you want to look at the real code, it is at the link below. The CPU spends about 60% of its time in this one function. This app runs on the BOINC platform, so such a micro-optimization would be worthwhile there.

https://github.com/sirzooro/RakeSearch/blob/optimizations2/RakeDiagSearch/RakeDiagSearch/MovePairSearch.cpp#L583