What is this hook supposed to do?  There is no description of its arguments.

What is the theory of operation of permute within the vectorizer? Do you actually need variable permute, or would constants be ok?

I'm contemplating adding a tree- and gimple-level VEC_PERMUTE_EXPR of the form:

  VEC_PERMUTE_EXPR (vlow, vhigh, vperm)

which would be exactly equal to

  (vec_select
    (vec_concat vlow vhigh)
    vperm)

at the rtl level. I.e. vperm is an integral vector with the same number of elements as vlow, and each element of vperm is an index into the concatenation of vlow and vhigh.
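
To make the intended semantics concrete, here is a scalar model of the operation. It is only a sketch: the name vec_permute_ref is made up for illustration, the element type is fixed to int, and wrapping out-of-range indices modulo 2N is an assumption that the proposal above does not settle.

```c
#include <stddef.h>

/* Scalar model of VEC_PERMUTE_EXPR (vlow, vhigh, vperm) for N-element
   vectors of int.  Result element i is element vperm[i] of the
   2N-element concatenation {vlow, vhigh}.  The function name is
   hypothetical and exists only for this illustration.  */
void
vec_permute_ref (int *out, const int *vlow, const int *vhigh,
                 const unsigned *vperm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Assumption: indices outside [0, 2N) wrap modulo 2N.  */
      size_t sel = vperm[i] % (2 * n);
      out[i] = sel < n ? vlow[sel] : vhigh[sel - n];
    }
}
```

With N = 4, a vperm of {0, 4, 1, 5} picks the first two elements of each input alternately.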

Truly variable permutation is something that's only supported by ppc and spu. Intel AVX has a limited variable permutation -- 64-bit or 32-bit elements can be rearranged but only within a 128-bit subvector. So if you're working with 128-bit vectors, it's fully variable, but if you're working with 256-bit vectors, it's like doing 2 128-bit permute operations in parallel. Intel before AVX has no variable permute.
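
The 128-bit restriction can be modelled in scalar code as follows. This is a rough sketch for 32-bit elements, in the style of a single-source in-lane permute; the name permute_in_lane_ref is hypothetical, and the model assumes that only the low two bits of each control element are used.

```c
#include <stddef.h>

/* Scalar model of a variable permute of 32-bit elements that cannot
   cross 128-bit boundaries.  Each group of 4 elements forms one lane,
   and a control element can only select from within its own lane.
   Assumption: only the low 2 bits of each control element matter.  */
void
permute_in_lane_ref (float *out, const float *src,
                     const unsigned *ctrl, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t lane_base = i - (i % 4);   /* first element of this lane */
      out[i] = src[lane_base + (ctrl[i] & 3)];
    }
}
```

For n = 4 this behaves as a fully variable single-input permute; for n = 8 it is two independent 4-element permutes, so no control value can move data from the upper half to the lower half.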

HOWEVER! Most of the useful permutations that I can think of for the optimizers to generate are actually constant. And these can be implemented everywhere (with varying degrees of efficiency).

Anyway, I'm thinking that it might be better to add such a general operation instead of continuing to add things like

        VEC_EXTRACT_EVEN_EXPR,
        VEC_EXTRACT_ODD_EXPR,
        VEC_INTERLEAVE_HIGH_EXPR,
        VEC_INTERLEAVE_LOW_EXPR,

and other obvious patterns like broadcast, duplicate even to odd, duplicate odd to even, etc.
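
As an illustration, each of these named operations reduces to a constant vperm for a two-input permute. The helpers below are a sketch with hypothetical names; they assume N is even, that indices 0..N-1 refer to the first input and N..2N-1 to the second, and that "low" means the first N/2 elements of each input (the real tree codes may define high/low differently depending on endianness).

```c
#include <stddef.h>

/* Fill PERM (N entries) with the constant selector for each pattern.
   Indices 0..N-1 select from the first input, N..2N-1 from the second.
   All names here are hypothetical.  */

void
perm_extract_even (unsigned *perm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    perm[i] = 2 * i;
}

void
perm_extract_odd (unsigned *perm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    perm[i] = 2 * i + 1;
}

/* Assumption: "low" is the first N/2 elements of each input.  */
void
perm_interleave_low (unsigned *perm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    perm[i] = i / 2 + (i % 2 ? n : 0);
}

void
perm_interleave_high (unsigned *perm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    perm[i] = n / 2 + i / 2 + (i % 2 ? n : 0);
}

/* Broadcast element K of the concatenated inputs to every position.  */
void
perm_broadcast (unsigned *perm, size_t n, unsigned k)
{
  for (size_t i = 0; i < n; i++)
    perm[i] = k;
}
```

For N = 4 these give {0, 2, 4, 6}, {1, 3, 5, 7}, {0, 4, 1, 5} and {2, 6, 3, 7} respectively.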

I can imagine having some sort of target hook that computed a cost metric for a given constant permutation pattern. For instance, I'd imagine that the interleave patterns are half as expensive as a full permute for altivec, due to not having to load a mask. This hook would be fairly complicated for x86, given all of the permuting insns that were incrementally added in various ISA revisions, but such is life.
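
Very roughly, such a hook might look like the sketch below. Everything in it is hypothetical: the name, the signature, and the cost values (1 for an interleave, 2 for anything else) are placeholders that only mirror the "half as expensive" guess above, not measurements of any target. It also assumes that indices 0..N-1 select from the first input and N..2N-1 from the second.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if PERM (N entries, N even) interleaves the N/2-element
   halves of the two inputs that start at element BASE.  Hypothetical
   helper for this sketch only.  */
static bool
matches_interleave (const unsigned *perm, size_t n, size_t base)
{
  for (size_t i = 0; i < n; i++)
    if (perm[i] != base + i / 2 + (i % 2 ? n : 0))
      return false;
  return true;
}

/* Hypothetical cost hook for a constant permutation.  The values are
   placeholders: interleaves cost 1 because no mask needs to be loaded,
   and any other pattern costs 2 as a full permute.  */
unsigned
permute_cost_sketch (const unsigned *perm, size_t n)
{
  if (matches_interleave (perm, n, 0)
      || matches_interleave (perm, n, n / 2))
    return 1;
  return 2;
}
```

A real x86 version would need many more recognizers, one per family of shuffle insns available in the enabled ISA.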

In any case, would a VEC_PERMUTE_EXPR, as described above, work for the uses of builtin_vec_perm within the vectorizer at present?


r~
