https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63175
--- Comment #26 from Richard Biener <rguenth at gcc dot gnu.org> ---
David - please clarify the cost of misaligned stores/loads. I tried to understand the PPC ISA document but can't find the place where it talks about misalignment cost (I can only see that it still assumes element alignment).

Btw, with double-word alignment I get

t.c:10:10: note: Cost model analysis:
  Vector inside of basic block cost: 8
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar cost of basic block: 8
t.c:10:10: note: not vectorized: vectorization is not profitable.

So currently on a tie we don't vectorize basic blocks (same with GCC 4.8). That's kind of arbitrary, but given instruction encoding size on x86, for example, it makes sense.

Note that we seem to prefer optimized re-alignment loads over misaligned loads (even if double-word aligned) - the vectorizer is not set up to decide that based on costs. The misaligned load would cost 2, while the optimized re-aligned load costs 6: two aligned loads (2), one vector stmt for mask compute (1) and one permute (3).

I think we are regression free compared to 4.8 (if you enable -fvect-cost-model there).
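
The cost arithmetic above can be written out as a small C sketch. This is only an illustrative model, not the vectorizer's actual code: the enum names and both functions are hypothetical, the unit costs are the ones quoted in this comment, and the strict "less than" comparison is an assumption inferred from the tie (8 vs. 8) being rejected.

```c
#include <stdbool.h>

/* Hypothetical unit costs, taken from the numbers quoted in the comment. */
enum {
    COST_ALIGNED_LOAD    = 1, /* two aligned loads are quoted as costing 2 */
    COST_MISALIGNED_LOAD = 2,
    COST_MASK_COMPUTE    = 1, /* one vector stmt for the mask */
    COST_PERMUTE         = 3
};

/* Optimized re-alignment scheme: two aligned loads, one mask compute,
   one permute. */
static int realign_load_cost(void)
{
    return 2 * COST_ALIGNED_LOAD + COST_MASK_COMPUTE + COST_PERMUTE;
}

/* Assumed profitability rule for a basic block: the total vector cost
   must be strictly lower than the scalar cost, so a tie is rejected. */
static bool bb_vectorization_profitable(int vec_inside, int vec_prologue,
                                        int vec_epilogue, int scalar)
{
    return vec_inside + vec_prologue + vec_epilogue < scalar;
}
```

With the figures from the dump, `bb_vectorization_profitable(8, 0, 0, 8)` is false, matching the "not vectorized" note, and `realign_load_cost()` comes to 6 against 2 for the misaligned load.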