On Mon, Oct 3, 2011 at 8:02 PM, Richard Henderson <r...@redhat.com> wrote:
> On 10/03/2011 11:40 AM, Artem Shinkarov wrote:
>> Currently if vec_perm_ok returns false, we do not try to use a new
>> vshuffle routine. Would it make sense to implement that? The only
>> potential problem I can see is a possible performance degradation.
>> This leads us to the second issue.
>
> Implement that where? In the vectorizer? No, I don't think so.
> The _ok routine, while also indicating what the backend expander
> supports, could also be thought of as a cost cutoff predicate.
> Unless the vectorization folk request some more exact cost metric
> I don't see any reason to change this.

I was thinking more about the expander of the backend itself. When we
call sorry () in ix86_expand_vec_perm_builtin, we could fall back to the
vshuffle routine instead, unless that would lead to a performance
degradation.

>> When we perform vshuffle, we need to know whether it make sense to use
>> pshufb (in case of x86) or to perform data movement via standard
>> non-simd registers. Do we have this information in the current
>> cost-model?
>
> Not really. Again, if you're talking about the vectorizer, it
> gets even more complicated than this because...
>
>> Also, in certain cases, when the mask is constant, I would
>> assume the memory movement is also faster. For example if the mask is
>> {4,5,6,7,0,1,2,3...}, then two integer moves should do a better job.
>
> ... even before SSSE3 PSHUFB, we have all sorts of insns that can
> perform a constant shuffle without having to resort to either
> general-purpose registers or memory. E.g. PSHUFD. For specific
> data types, we can handle arbitrary constant shuffle with 1 or 2
> insns, even when arbitrary variable shuffles aren't.

But these cases are more or less covered. I am thinking about the cases
where vec_perm_ok returns false, but the actual permutation could be done
faster with memory/register transfers than with PSHUFB & Co.

> It's certainly something that we could add to tree-vect-generic.c.
> I have no plans to do anything of the sort, however.

I didn't quite understand what you think could be added to
tree-vect-generic.c. I thought we were talking about more or less
backend issues.

In any case, I am investigating these problems, and I would appreciate any
help or advice.

Thanks,
Artem.
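
To make the {4,5,6,7,0,1,2,3...} example concrete, here is a minimal sketch in plain C. It is not GCC code, and it rests on assumptions that are not stated in the thread: the vector is taken to be 128 bits wide with eight 16-bit elements, so the mask is exactly {4,5,6,7,0,1,2,3}. The names shuffle_generic and shuffle_swap_halves are hypothetical. The first function is the element-by-element fallback through memory; the second does the same permutation with two 64-bit integer moves.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: 8 x 16-bit elements (one 128-bit vector). */
enum { NELT = 8 };

/* Generic fallback: move one element at a time through memory.
   Mask values are reduced modulo NELT.  A temporary is used so that
   dst may alias src. */
static void shuffle_generic(uint16_t dst[NELT], const uint16_t src[NELT],
                            const uint16_t mask[NELT])
{
    uint16_t tmp[NELT];
    for (int i = 0; i < NELT; i++)
        tmp[i] = src[mask[i] % NELT];
    memcpy(dst, tmp, sizeof tmp);
}

/* Special case for the constant mask {4,5,6,7,0,1,2,3}: the two 64-bit
   halves simply trade places, so two integer loads and two integer
   stores are enough.  Both halves are loaded before anything is stored,
   so dst may alias src here as well. */
static void shuffle_swap_halves(uint16_t dst[NELT], const uint16_t src[NELT])
{
    uint64_t lo, hi;
    memcpy(&lo, &src[0], sizeof lo);   /* elements 0..3 */
    memcpy(&hi, &src[4], sizeof hi);   /* elements 4..7 */
    memcpy(&dst[0], &hi, sizeof hi);
    memcpy(&dst[4], &lo, sizeof lo);
}
```

Both functions give the same result for this mask. Whether the second form actually beats a PSHUFD-style insn depends on the target and on where the data lives, which is exactly the cost-model question raised above.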