https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96744
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 49112
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49112&action=edit
Retune mask <-> general moves cost

It looks to me that mask <-> general cost is too low, so the compiler now
prefers these moves too much. Attached patch equalizes mask <-> general cost
with xmm <-> general cost, and it seems to fix the problem.

Hongjiu, can you please retune the costs, using the attached patch as the
start?