http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863

--- Comment #4 from Ali Baharev <ali.baharev at gmail dot com> ---
My mistake, sorry. 

So, you are saying that the default alignment for loops is 8 bytes?

The funny thing is, this code runs 15% faster if any of the following is
passed:

 -Os
 -O2 -fno-align-loops -fno-align-functions
 -O2 -fno-omit-frame-pointer

At least on my machine and in this case, 16-byte alignment (or any multiple of
16 bytes) is better. -march=native has no effect on the performance.
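
For anyone who wants to repeat the comparison, here is a minimal sketch of a
timing harness. The flag sets are the ones listed above. The "baseline" (plain
-O2) and "align16" entries are assumptions added for comparison and were not
among the options tested; -falign-loops=16 and -falign-functions=16 are one
way to request 16-byte alignment explicitly. The helper names, the default
compiler and the output file names are hypothetical.

```python
import subprocess
import time

# Flag sets from the comment, plus two assumed entries for comparison:
# "baseline" (plain -O2) and "align16" (explicit 16-byte alignment).
VARIANTS = {
    "baseline": ["-O2"],
    "Os": ["-Os"],
    "no-align": ["-O2", "-fno-align-loops", "-fno-align-functions"],
    "frame-pointer": ["-O2", "-fno-omit-frame-pointer"],
    "align16": ["-O2", "-falign-loops=16", "-falign-functions=16"],
}


def build_command(source, flags, output, compiler="gcc"):
    """Return the compiler invocation as an argument list."""
    return [compiler, *flags, source, "-o", output]


def time_variant(source, name, runs=5, compiler="gcc"):
    """Compile one variant and return its best wall-clock time in seconds."""
    exe = f"./bench_{name}"  # hypothetical output name
    subprocess.run(build_command(source, VARIANTS[name], exe, compiler),
                   check=True)
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([exe], check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling time_variant("test.c", name) for each name in VARIANTS (the source
file name is a placeholder) gives one number per flag set. Taking the best of
several runs reduces noise, but the results remain specific to the machine
they were measured on.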
