On 10/03/2011 11:40 AM, Artem Shinkarov wrote:
> Currently if vec_perm_ok returns false, we do not try to use a new
> vshuffle routine. Would it make sense to implement that? The only
> potential problem I can see is a possible performance degradation.
> This leads us to the second issue.

Implement that where? In the vectorizer? No, I don't think so. The _ok routine, while also indicating what the backend expander supports, could also be thought of as a cost cutoff predicate. Unless the vectorization folk request some more exact cost metric, I don't see any reason to change this.

> When we perform vshuffle, we need to know whether it make sense to use
> pshufb (in case of x86) or to perform data movement via standard
> non-simd registers. Do we have this information in the current
> cost-model?

Not really. Again, if you're talking about the vectorizer, it gets even more complicated than this because...

> Also, in certain cases, when the mask is constant, I would
> assume the memory movement is also faster. For example if the mask is
> {4,5,6,7,0,1,2,3...}, then two integer moves should do a better job.

... even before SSSE3 PSHUFB, we have all sorts of insns that can perform a constant shuffle without having to resort to either general-purpose registers or memory. E.g. PSHUFD. For specific data types, we can handle an arbitrary constant shuffle with 1 or 2 insns, even when arbitrary variable shuffles aren't supported.

It's certainly something that we could add to tree-vect-generic.c. I have no plans to do anything of the sort, however.

r~
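
To make the constant-mask point concrete, here is a minimal sketch in plain C. It is not GCC code: `shuffle_u16` and `widen_mask` are hypothetical names introduced only for illustration, and the vector is assumed to hold eight 16-bit elements, which the quoted mask does not actually specify. The helper checks whether a constant mask over narrow elements can be restated as a mask over elements twice as wide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reference semantics of a one-operand shuffle (illustrative only):
   out[i] = in[mask[i] mod n].  */
static void shuffle_u16(const uint16_t *in, const unsigned *mask,
                        uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[mask[i] % n];
}

/* Hypothetical helper: try to rewrite a constant mask over n narrow
   elements as a mask over n/2 elements that are twice as wide.
   This works only when every adjacent pair of selectors has the
   form (2k, 2k+1), i.e. each pair picks one whole wide element.
   On success, wide[] receives the n/2 wide selectors.  */
static bool widen_mask(const unsigned *mask, size_t n, unsigned *wide)
{
    if (n % 2 != 0)
        return false;
    for (size_t i = 0; i < n; i += 2) {
        unsigned lo = mask[i] % n;
        unsigned hi = mask[i + 1] % n;
        if (lo % 2 != 0 || hi != lo + 1)
            return false;
        wide[i / 2] = lo / 2;
    }
    return true;
}
```

Under the eight-element assumption, {4,5,6,7,0,1,2,3} widens to {2,3,0,1} on 32-bit elements, which a single PSHUFD can perform, and widens once more to {1,0} on 64-bit elements, i.e. a swap of the two halves. A mask such as {1,0,2,3} does not widen and would need a different insn or sequence.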