https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #18 from Alexander Peslyak <solar-gcc at openwall dot com> ---
(In reply to Richard Biener from comment #11)
> Note that we have to use movups because DES_bs_all is not aligned as seen
> from DES_bs_b.c (it's defined in DES_bs.c and only there annotated with
> CC_CACHE_ALIGN, not at the point of declaration in DES_bs.h). So the
> unaligned moves are the sources fault. Annotating that with CC_CACHE_ALIGN
> produces the desired movaps instructions

Confirmed also with GCC 4.9.2 on JtR 1.8.0's version of the code.

> (with no effect on performance for me).

... with the expected performance improvement for me.

I'll commit this fix. Thanks again!
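
For anyone reading along, here is a minimal sketch of what annotating the declaration (not just the definition) looks like. It is not the actual JtR code: the CC_CACHE_ALIGN definition below is an assumed stand-in using GCC's aligned attribute with a 64-byte value, and the `_demo` type, variable, and helper names are hypothetical. The point is only that the alignment attribute has to be visible on the extern declaration for other translation units to be allowed to assume it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for JtR's CC_CACHE_ALIGN; the real macro is defined
 * in JtR's own headers and its exact value may differ. */
#define CC_CACHE_ALIGN __attribute__((aligned(64)))

/* Hypothetical, much smaller stand-in for the real DES_bs_all structure. */
typedef struct {
    uint32_t B[64][4];
} DES_bs_combined_demo;

/* What the header would carry: the declaration itself is annotated, so any
 * file that only sees this line may still assume 64-byte alignment. */
extern DES_bs_combined_demo DES_bs_all_demo CC_CACHE_ALIGN;

/* What the defining .c file would carry: the same annotation on the
 * definition, which is what actually places the object. */
DES_bs_combined_demo DES_bs_all_demo CC_CACHE_ALIGN;

/* Compile-time check of the alignment the compiler assumes for the object. */
_Static_assert(__alignof__(DES_bs_all_demo) == 64,
               "declaration should advertise 64-byte alignment");

/* Returns nonzero if p sits on a 64-byte boundary. */
static int is_cache_aligned_demo(const void *p)
{
    return ((uintptr_t)p % 64) == 0;
}

/* Run-time check that the object really was placed on a 64-byte boundary. */
void check_demo_alignment(void)
{
    assert(is_cache_aligned_demo(&DES_bs_all_demo));
}
```

If the attribute is dropped from the extern line, a file that sees only that declaration has to assume the type's natural alignment, which matches the movups behaviour described in comment #11.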