http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607
--- Comment #6 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-18 18:58:44 UTC ---
Created attachment 26912
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26912
generic shuffle of a single v8sf

An additional function (I should find better names...) to handle generic
shuffles of a single v8sf in 4 instructions. Only tested on {6,2,3,3,5,2,3,7}.

By the way, expand_vec_perm_vperm2f128_vblend2 does vpermilpd+vperm2f128 in
this order, but it would be better to do it in the reverse order (adapting the
mask), because it is common to need several __builtin_shuffle(x,*) and the
vperm2f128 can then be shared.

I also noticed while experimenting that -mavx2 generates vpermd instead of
vpermps (the vpermq->vpermpd change didn't affect that).
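
For readers without the attachment at hand, here is a scalar C model of one
way a generic single-v8sf shuffle can be built from 4 steps. It is only a
sketch under assumptions: the decomposition (one lane swap, two in-lane
permutes, one blend) and all function names are hypothetical and are not taken
from the attached patch. Each helper models the rough effect of an AVX
instruction (vperm2f128, vpermilps with a variable control, vblendps) on plain
arrays, and the result is compared against the out[i] = x[mask[i] mod 8]
semantics of the single-operand __builtin_shuffle.

```c
#define N 8     /* elements in a v8sf */
#define LANE 4  /* floats per 128-bit lane */

/* Reference semantics: out[i] = x[mask[i] mod 8]. */
void shuffle_ref(const float x[N], const int mask[N], float out[N])
{
    for (int i = 0; i < N; i++)
        out[i] = x[mask[i] & (N - 1)];
}

/* Step 1 (models a vperm2f128 lane swap): exchange the two 128-bit lanes.
   Depends only on x, not on the mask. */
static void swap_lanes(const float x[N], float out[N])
{
    for (int i = 0; i < N; i++)
        out[i] = x[(i + LANE) % N];
}

/* Steps 2 and 3 (model vpermilps with a variable control): each element
   picks a float from its own lane using the low 2 bits of its mask entry. */
static void permute_in_lane(const float x[N], const int mask[N], float out[N])
{
    for (int i = 0; i < N; i++) {
        int base = (i / LANE) * LANE;
        out[i] = x[base + (mask[i] & (LANE - 1))];
    }
}

/* Hypothetical 4-step shuffle: swap, permute, permute, then blend. */
void shuffle_4step(const float x[N], const int mask[N], float out[N])
{
    float swapped[N], same[N], cross[N];

    swap_lanes(x, swapped);                /* step 1 */
    permute_in_lane(x, mask, same);        /* step 2: same-lane sources */
    permute_in_lane(swapped, mask, cross); /* step 3: other-lane sources */

    /* Step 4 (models vblendps): keep the same-lane result when the source
       index lives in the destination's lane, else the cross-lane result. */
    for (int i = 0; i < N; i++) {
        int src_lane = (mask[i] & (N - 1)) / LANE;
        out[i] = (src_lane == i / LANE) ? same[i] : cross[i];
    }
}
```

Calling shuffle_ref and shuffle_4step with the mask {6,2,3,3,5,2,3,7} should
give identical arrays. In this model the lane swap in step 1 depends only on x,
so several shuffles of the same x could reuse it, which is the kind of sharing
the comment has in mind for vperm2f128.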