http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47298
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 13:02:05 UTC ---
On trunk we now vectorize the loop and then unroll it from cunroll.

4.6 -O2 -funroll-loops -ftree-vectorize -ffast-math: 10.7s
4.6 -O3 -funroll-loops -ftree-vectorize -ffast-math: 8.3s
4.7 -O2 -funroll-loops -ftree-vectorize -ffast-math: 7.4s
4.7 -O3 -funroll-loops -ftree-vectorize -ffast-math: 8.5s
4.8 -O2 -funroll-loops -ftree-vectorize -ffast-math: 6.1s
4.8 -O3 -funroll-loops -ftree-vectorize -ffast-math: 6.5s

with -march=native added (iCore5)

4.8 -O2 ... -march=native: 3.9s
4.8 -O3 ... -march=native: 4s

Apart from very minor scheduling differences I see no difference in code
generation on trunk -O2 vs. -O3.

I'd say "fixed" without more details.