http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse <marc.glisse at normalesup dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26912|0                           |1
        is obsolete|                            |

--- Comment #17 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-20 21:50:40 UTC ---
Created attachment 26938
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26938
intra-lane shuffle in 3 insn

This (mostly untested) patch reformulates the generic v8sf single-vector
shuffle in 4 insns as a generic intra-lane 2-vector shuffle in at most 3
insns. Reformulating __builtin_shuffle(x,m) as
__builtin_shuffle(x,vperm2f128(x,1),mm) would then guarantee a maximum size of
4 insns (one vperm2f128 plus at most 3 for the intra-lane shuffle).
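
As an illustration of why the reformulation is always possible, here is a
minimal scalar model in plain C (not GCC code, and not the patch itself). It
assumes that vperm2f128(x,1) swaps the two 128-bit lanes of x, and that in the
two-vector form mask indices 0..7 select from the first operand and 8..15 from
the second. All function names below are hypothetical.

```c
#include <assert.h>

#define N 8      /* elements in a v8sf */
#define LANE 4   /* elements per 128-bit lane */

/* Model of the one-vector shuffle: r[i] = x[m[i]]. */
static void shuffle1(const float x[N], const int m[N], float r[N])
{
    for (int i = 0; i < N; i++)
        r[i] = x[m[i] & (N - 1)];
}

/* Assumed effect of vperm2f128(x,1): swap the two 128-bit lanes. */
static void swap_lanes(const float x[N], float y[N])
{
    for (int i = 0; i < N; i++)
        y[i] = x[i ^ LANE];
}

/* Derive the two-vector mask mm from m.  An element already in the
   right lane is taken from x; otherwise it is taken from the
   lane-swapped copy, where it sits at index (src ^ LANE).  */
static void make_intra_lane_mask(const int m[N], int mm[N])
{
    for (int i = 0; i < N; i++) {
        int src = m[i] & (N - 1);
        mm[i] = (src / LANE == i / LANE) ? src : N + (src ^ LANE);
    }
}

/* Model of the two-vector shuffle; the assert checks that every
   element is read from the same lane it is written to.  */
static void shuffle2_intra_lane(const float x[N], const float y[N],
                                const int mm[N], float r[N])
{
    for (int i = 0; i < N; i++) {
        int sel = mm[i] & (2 * N - 1);
        int idx = sel & (N - 1);
        assert(idx / LANE == i / LANE);
        r[i] = (sel < N) ? x[idx] : y[idx];
    }
}

/* Return 1 if both formulations give the same result for mask m. */
static int reformulation_matches(const float x[N], const int m[N])
{
    float y[N], direct[N], via[N];
    int mm[N];

    shuffle1(x, m, direct);
    swap_lanes(x, y);
    make_intra_lane_mask(m, mm);
    shuffle2_intra_lane(x, y, mm, via);
    for (int i = 0; i < N; i++)
        if (direct[i] != via[i])
            return 0;
    return 1;
}
```

Calling reformulation_matches(x, m) with any 8-entry mask m exercises the
intra-lane assertion; the model says nothing about how many insns the
intra-lane shuffle itself needs.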

Note that doing a 2-vector shuffle by shuffling each vector separately (not
restricted to a single vpermilp*) and blending the results gives a maximum of
9 insns (at most 4 per vector plus 1 blend), whereas the current code often
generates twice that number.


By the way, I have trouble understanding this comment:
      /* For d->op0 == d->op1 the only useful vperm2f128 permutation
         is 0x10.  */
Is it really 0x10, or is there a stray 0 at the end and it is really just 1?
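
For reference, the two candidate immediates can be compared with a small
scalar model of the vperm2f128 selector. The encoding assumed here follows
Intel's instruction reference: bits 1:0 pick the source lane for the low half
of the result, bits 5:4 pick it for the high half, and bits 3 and 7 zero the
corresponding half. The names vperm2f128_model and same_v8 are hypothetical;
this is a sketch, not GCC code.

```c
#include <assert.h>
#include <string.h>

/* Scalar model of vperm2f128 on 8-float vectors (two 4-float lanes).
   Selector per half: 0 = a.lo, 1 = a.hi, 2 = b.lo, 3 = b.hi;
   bit 3 of the selector zeroes that half instead.  */
static void vperm2f128_model(const float a[8], const float b[8],
                             unsigned imm, float r[8])
{
    assert(imm <= 0xff);
    for (int half = 0; half < 2; half++) {
        unsigned ctl = (imm >> (4 * half)) & 0xf;
        const float *src = (ctl & 2) ? b : a;
        int src_lane = ctl & 1;

        if (ctl & 8)
            memset(r + 4 * half, 0, 4 * sizeof(float));
        else
            memcpy(r + 4 * half, src + 4 * src_lane, 4 * sizeof(float));
    }
}

/* Return 1 if two 8-float vectors are bitwise identical. */
static int same_v8(const float p[8], const float q[8])
{
    return memcmp(p, q, 8 * sizeof(float)) == 0;
}
```

Under this model, with both operands equal to x, the immediate 0x01 yields x
with its lanes swapped, while 0x10 yields x unchanged.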
