http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #6 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-18 
18:58:44 UTC ---
Created attachment 26912
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26912
generic shuffle of a single v8sf

This adds another function (I should find better names...) that handles
generic shuffles of a single v8sf in 4 instructions. So far it has only been
tested on the mask {6,2,3,3,5,2,3,7}.
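
For illustration only, here is a scalar C model of one way a single-input
v8sf shuffle could be split into 4 steps: a lane swap, two in-lane permutes
and a blend. This is an assumption about the general shape of such a sequence,
not a description of what the attachment emits. The function names are
hypothetical, and the instruction names in the comments only indicate which
AVX operation each loop is meant to resemble.

```c
#include <assert.h>

#define N 8  /* a v8sf holds 8 floats, in two 128-bit lanes of 4 */

/* Reference semantics of a one-input shuffle: out[i] = in[mask[i] mod N].
   Mask entries are assumed to be non-negative. */
void shuffle_ref(const float in[N], const int mask[N], float out[N])
{
    for (int i = 0; i < N; i++)
        out[i] = in[mask[i] % N];
}

/* Hypothetical 4-step decomposition, modelled with scalar loops. */
void shuffle_4step(const float in[N], const int mask[N], float out[N])
{
    float swapped[N], same[N], other[N];

    /* Step 1: swap the two 128-bit lanes (vperm2f128-like).
       This step does not depend on the mask. */
    for (int i = 0; i < N; i++)
        swapped[i] = in[(i + 4) % N];

    /* Steps 2 and 3: permute within each lane (vpermilps-like), once on
       the original vector and once on the lane-swapped copy. */
    for (int i = 0; i < N; i++) {
        int src = (i & 4) | (mask[i] & 3);  /* stay in the lane of i */
        same[i] = in[src];
        other[i] = swapped[src];
    }

    /* Step 4: per element, keep the in-lane result if the requested source
       lies in the same lane as the destination, else take the one that
       came from the other lane (vblendps-like). */
    for (int i = 0; i < N; i++) {
        int m = mask[i] % N;
        out[i] = ((m & 4) == (i & 4)) ? same[i] : other[i];
    }
}

/* Returns 1 if the decomposition agrees with the reference for this mask. */
int decomposition_matches(const int mask[N])
{
    float in[N], a[N], b[N];

    for (int i = 0; i < N; i++)
        in[i] = (float)(10 + i);
    shuffle_ref(in, mask, a);
    shuffle_4step(in, mask, b);
    for (int i = 0; i < N; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Check the one mask mentioned in this comment. */
void self_check(void)
{
    static const int mask[N] = {6, 2, 3, 3, 5, 2, 3, 7};
    assert(decomposition_matches(mask));
}
```

In this model the lane-swapped copy from step 1 is the same for every mask, so
several shuffles of the same input could reuse it.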

By the way, expand_vec_perm_vperm2f128_vblend2 emits vpermilpd followed by
vperm2f128, but it would be better to emit them in the reverse order (adapting
the mask accordingly), because code commonly needs several
__builtin_shuffle(x,*) on the same x, and the vperm2f128 can then be shared
between them.

I also noticed while experimenting that -mavx2 generates vpermd instead of
vpermps (the vpermq->vpermpd change didn't affect that).
