https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533
--- Comment #29 from Mikhail Maltsev <maltsevm at gmail dot com> --- Results for attached testcase: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (Haswell) g++ -O3 -march=native -mtune=native 10000 iterations Clang 3.7 Total absolute time for int32_t for loop unrolling: 0.99 sec Total absolute time for int32_t do loop unrolling: 1.00 sec Total absolute time for double for loop unrolling: 1.37 sec Total absolute time for double do loop unrolling: 1.37 sec GCC 4.7.4 Total absolute time for int32_t for loop unrolling: 5.88 sec Total absolute time for int32_t do loop unrolling: 7.57 sec Total absolute time for double for loop unrolling: 2.29 sec Total absolute time for double do loop unrolling: 2.45 sec GCC 4.8.4 Total absolute time for int32_t for loop unrolling: 3.12 sec Total absolute time for int32_t do loop unrolling: 3.29 sec Total absolute time for double for loop unrolling: 1.13 sec Total absolute time for double do loop unrolling: 1.14 sec GCC 4.9.2 Total absolute time for int32_t for loop unrolling: 3.02 sec Total absolute time for int32_t do loop unrolling: 3.29 sec Total absolute time for double for loop unrolling: 1.10 sec Total absolute time for double do loop unrolling: 1.13 sec GCC 6 Total absolute time for int32_t for loop unrolling: 5.95 sec Total absolute time for int32_t do loop unrolling: 6.95 sec Total absolute time for double for loop unrolling: 2.39 sec Total absolute time for double do loop unrolling: 2.39 sec g++ -DINLINE_MANUALLY -O3 -march=native -mtune=native 50000 iterations Clang 3.7 Total absolute time for int32_t for loop unrolling: 2.43 sec Total absolute time for int32_t do loop unrolling: 2.32 sec Total absolute time for double for loop unrolling: 6.38 sec Total absolute time for double do loop unrolling: 6.38 sec GCC 4.9.2 Total absolute time for int32_t for loop unrolling: 10.17 sec Total absolute time for int32_t do loop unrolling: 10.16 sec Total absolute time for double for loop unrolling: 3.89 sec Total absolute time for double do loop unrolling: 3.90 sec GCC 6 Total absolute time for int32_t for loop unrolling: 10.10 sec Total absolute time for int32_t do loop unrolling: 10.12 sec Total absolute time for double for loop unrolling: 3.90 sec Total absolute time for double do loop unrolling: 3.89 sec g++ -DINLINE_MANUALLY -Ofast -march=native -mtune=native GCC 6 Total absolute time for int32_t for loop unrolling: 10.11 sec Total absolute time for int32_t do loop unrolling: 10.11 sec Total absolute time for double for loop unrolling: 1.14 sec Total absolute time for double do loop unrolling: 1.15 sec So, IMHO there is no regression here (at least w.r.t. vectorization). Floating point loop gets constant-folded, if reassociation is allowed. Also, GCC6 is able to infer that "for" and "while" tests are semantically equivalent and unifies them.