http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607
Marc Glisse <marc.glisse at normalesup dot org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26938|0                           |1
        is obsolete|                            |
--- Comment #18 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-25
13:52:09 UTC ---
Created attachment 26979
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26979
default case
An updated version of this simple, generic-case shuffle (do note that I didn't
run the generated code, just checked that it compiled and the instructions
generated looked roughly ok). With the patch, we have (concerning v4df and
v8sf):
- no single-vector shuffle takes more than 4 insn,
- no 2-vector shuffle takes more than 9 insn (or 3, plus 2 movs to load
constants, with AVX2).
I think the current code already guarantees that anything that can be done in a
single instruction is.
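
For readers less familiar with the terminology, here is a minimal scalar sketch
of what "single-vector" and "2-vector" shuffles compute on a 4-element vector
such as v4df. It is not the patch and not the code GCC generates; the function
names are hypothetical, and the index convention (indices taken modulo 2*N, with
the second vector numbered from N) is assumed to follow the one documented for
GCC's __builtin_shuffle.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference model of a two-vector shuffle on n-element vectors.
   Selector k in [0, n) picks a[k]; k in [n, 2n) picks b[k - n].
   Selectors are reduced modulo 2n. `out` must not alias `a` or `b`.
   (Hypothetical helper, for illustration only.) */
void shuffle2_model(const double *a, const double *b,
                    const unsigned *sel, double *out, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        size_t k = sel[i] % (2 * n);
        out[i] = (k < n) ? a[k] : b[k - n];
    }
}

/* A single-vector shuffle is the special case where both inputs are the
   same vector, so every selector is effectively taken modulo n. */
void shuffle1_model(const double *a, const unsigned *sel,
                    double *out, size_t n)
{
    shuffle2_model(a, a, sel, out, n);
}
```

With a = {0,1,2,3} and b = {4,5,6,7}, the selector {0,4,1,5} interleaves the low
halves of the two inputs; the instruction counts above are about how many insn
the backend needs to realise an arbitrary selector of this kind.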
Some possible goals (making everything optimal may be a bit hard) would be:
- everything that can be done in 2 insn is,
- no single-vector v4df takes more than 3 insn,
- one or two extra optimizations, if they are generic enough.
I do wonder occasionally about allowing wild indexes (jokers, places where you
can put anything) in shuffles, whether they are exposed to users or just used
as an internal tool.
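
To make the joker idea concrete, here is a rough sketch of how a selector with
"don't care" lanes could be matched against a table of permutations that are
known to be cheap. Everything here is an assumption for illustration: the
SEL_ANY marker, the function names, and the flat table layout are hypothetical
and do not correspond to anything in the patch or in GCC.

```c
#include <assert.h>
#include <stddef.h>

#define SEL_ANY (-1)  /* hypothetical marker for a "don't care" lane */

/* Returns 1 if the concrete selector `cand` satisfies the request `want`,
   where lanes of `want` equal to SEL_ANY accept any value; 0 otherwise. */
int sel_satisfies(const int *want, const int *cand, size_t n)
{
    assert(want && cand);
    for (size_t i = 0; i < n; i++)
        if (want[i] != SEL_ANY && want[i] != cand[i])
            return 0;
    return 1;
}

/* Returns the row index of the first candidate in `table` (rows of n lanes,
   stored contiguously) that satisfies `want`, or -1 if none does. */
int sel_find(const int *want, const int *table, size_t rows, size_t n)
{
    for (size_t r = 0; r < rows; r++)
        if (sel_satisfies(want, table + r * n, n))
            return (int)r;
    return -1;
}
```

For example, a request {0, SEL_ANY, 2, SEL_ANY} is satisfied both by the
identity {0,1,2,3} and by {0,0,2,2}, so the expander would be free to pick
whichever of the two is cheaper on the target.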