https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111905
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
For the original testcase and foo we do not perform extra unrolling during
vectorization - we just vectorize the already unrolled loop. bar isn't
unrolled beforehand, so we unroll it as part of vectorization. With
-fopt-info you see

t.C:6:26: optimized: loop with 16 iterations completely unrolled (header execution count 63136016)
t.C:7:14: optimized: basic block part vectorized using 32 byte vectors
t.C:56:14: optimized: loop vectorized using 32 byte vectors
t.C:56:14: optimized: loop versioned for vectorization because of possible aliasing
t.C:56:14: optimized: loop vectorized using 16 byte vectors
t.C:56:14: optimized: loop with 2 iterations completely unrolled (header execution count 57270721)
t.C:51:6: optimized: loop turned into non-loop; it never loops

I'm also not seeing any "terrible" code?
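
For readers without the attachment at hand, the two situations described above can be roughly sketched as follows. This is a hypothetical stand-in, not the testcase from this report: the function bodies, signatures, and element type are assumptions, and only the names foo and bar are borrowed from the discussion. foo has a constant trip count of 16, so it is a candidate for complete unrolling before the vectorizer runs; bar has a runtime trip count and possibly aliasing pointers, so any vectorization there would be loop vectorization, potentially with versioning.

```cpp
// Hypothetical stand-in for t.C; NOT the testcase attached to the bug.
#include <cassert>
#include <cstddef>

// Constant trip count of 16: the loop can be completely unrolled early,
// leaving straight-line code that basic-block vectorization may pick up.
void foo(int *dst, const int *src)
{
    for (int i = 0; i < 16; ++i)
        dst[i] = src[i] + 1;
}

// Runtime trip count: the loop cannot be completely unrolled up front, so
// the loop vectorizer has to handle it. dst and src may alias, which can
// lead the compiler to version the loop behind a runtime alias check.
void bar(int *dst, const int *src, std::size_t n)
{
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1;
}
```

Compiling this with something like "g++ -O3 -mavx2 -fopt-info -c t.C" should print "optimized:" notes of the same kind as quoted above, though the exact messages, line numbers, and vector sizes depend on the GCC version and target flags, so they will likely differ from the ones in this comment.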