https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63175
--- Comment #29 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #26)
> it makes sense.  Note that we seem to prefer optimized re-alignment loads
> over misaligned loads (even if double-word aligned) - the vectorizer is not
> set up to decide that based on costs (the misaligned load would cost 2 while
> the optimized re-aligned load costs 6 - two aligned loads (2), one vector
> stmt for mask compute (1) and one permute (3)).

My pending patch deals with this by having POWER8 pretend that it doesn't have
a re-alignment load capability, enabling misaligned loads/stores, and setting
the costs to reasonable values.
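
For anyone following along, the cost arithmetic quoted from comment #26 can be sketched roughly as below. This is only an illustration of the comparison, not GCC code: the constant and function names are hypothetical, and the only real inputs are the per-statement costs quoted above (the per-load cost of 1 is inferred from "two aligned loads (2)").

```python
# Illustrative sketch only. These names are hypothetical and do not
# correspond to GCC's vectorizer cost-model hooks. The numbers are the
# per-statement costs quoted from comment #26.
ALIGNED_LOAD_COST = 1     # assumed: "two aligned loads (2)" means 1 each
MASK_COMPUTE_COST = 1     # one vector stmt for mask compute
PERMUTE_COST = 3          # one permute
MISALIGNED_LOAD_COST = 2  # a single misaligned vector load


def realign_load_cost():
    """Cost of the optimized re-alignment sequence for one vector."""
    return 2 * ALIGNED_LOAD_COST + MASK_COMPUTE_COST + PERMUTE_COST


def cheaper_strategy():
    """Pick the strategy a purely cost-based decision would choose."""
    if MISALIGNED_LOAD_COST < realign_load_cost():
        return "misaligned"
    return "realign"


if __name__ == "__main__":
    print("realign cost:", realign_load_cost())
    print("misaligned cost:", MISALIGNED_LOAD_COST)
    print("cost-based choice:", cheaper_strategy())
```

With these figures a cost-based choice would favor the misaligned load, which is the behavior the vectorizer does not currently arrive at on its own.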