https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322
--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> For the case at hand loading two vectors from the destination and then
> punpck{h,l}bw and storing them again might be the most efficient thing
> to do here.

I think such read-modify-write on the destination introduces a data race for
bytes that are not accessed in the original program, so that would be okay only
under -fallow-store-data-races?
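
To illustrate the concern, here is a minimal scalar sketch in C. It is not the test case from this PR and not what GCC emits: the function names, the even-byte store pattern, and the buffer sizes are all assumptions made for illustration. The read-modify-write is split into two phases so that a write from "another thread" can be simulated deterministically between them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Source-level semantics (assumed pattern): only even-indexed bytes of
   dst are written; odd-indexed bytes are never accessed. */
static void store_even_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[2 * i] = src[i];
}

/* Hypothetical read-modify-write lowering, phase 1: load the whole
   destination, including the odd bytes the source never touches. */
static void rmw_load(uint8_t *tmp, const uint8_t *dst, size_t n)
{
    memcpy(tmp, dst, 2 * n);
}

/* Phase 2: merge the new even bytes into the loaded copy and store the
   whole block back. The odd bytes are written with whatever values
   phase 1 observed. */
static void rmw_merge_store(uint8_t *dst, uint8_t *tmp,
                            const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        tmp[2 * i] = src[i];
    memcpy(dst, tmp, 2 * n);
}

/* Simulates one unlucky interleaving in a single thread and returns the
   final value of dst[1]. */
static uint8_t simulate_lost_update(void)
{
    uint8_t dst[8] = {0};
    uint8_t tmp[8];
    const uint8_t src[4] = {1, 2, 3, 4};

    rmw_load(tmp, dst, 4);
    dst[1] = 0xAA;                  /* stands in for a write by another thread */
    rmw_merge_store(dst, tmp, src, 4);

    assert(dst[0] == src[0]);       /* the intended even-byte store happened */
    return dst[1];                  /* stale value from phase 1, not 0xAA */
}
```

With store_even_bytes, a concurrent writer to dst[1] never conflicts with this code. With the two-phase version, the same writer can have its update overwritten by the store in phase 2, which is the introduced race on bytes the original program does not access.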