> > I can imagine having some sort of target hook that computed a cost
> > metric for a given constant permutation pattern. For instance, I'd
> > imagine that the interleave patterns are half as expensive as a full
> > permute for altivec, due to not having to load a mask. This hook would
> > be fairly complicated for x86, given all of the permuting insns that
> > were incrementally added in various ISA revisions, but such is life.
>
> There should be some way to account for the difference between the cost
> in straight-line code, where a mask load is a hard cost, a large loop,
> where the load can be hoisted at the cost of some target-dependent
> register pressure (e.g. being able to use inverted masks might save half
> of the cost), and a tight loop, where the constant load can be easily
> amortized over the entire loop.

The vectorizer cost model already does that. AFAIU, it calls the cost
model hook to get the cost of a permute, and then incorporates that cost
into the overall loop or basic-block vectorization cost.

Ira
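
For illustration only, here is a rough sketch of the distinction discussed
above: a mask load that is paid every time in straight-line code versus
one that is hoisted and paid once per loop. The struct, the function
names, and the cost units are all hypothetical; this is not GCC's actual
hook interface, and register pressure is not modeled.

```c
#include <assert.h>

/* Hypothetical cost record for one constant permutation.  The names and
   units are illustrative, not GCC's actual target hook interface.  */
struct permute_cost
{
  unsigned insn_cost;      /* cost of the permute instruction itself */
  unsigned mask_load_cost; /* cost of loading the mask; 0 if none is needed */
};

/* Straight-line code: the mask load is a hard cost, paid together with
   the permute.  */
static unsigned
straight_line_cost (struct permute_cost c)
{
  return c.insn_cost + c.mask_load_cost;
}

/* Loop with the mask load hoisted out of the body: the load is paid
   once, the permute is paid on every iteration.  */
static unsigned
loop_cost (struct permute_cost c, unsigned niters)
{
  assert (niters > 0);
  return c.mask_load_cost + niters * c.insn_cost;
}
```

With assumed costs of {2, 2} for a full permute and {2, 0} for an
interleave, the interleave is half as expensive in straight-line code
(2 vs. 4), while over 100 iterations the difference nearly disappears
(200 vs. 202).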