The first problem is that the generic lowering to a scalar implementation didn't match the documentation, in that the shuffle indices are to be masked to the range of the inputs. Or, perhaps more exactly, the generic lowering didn't match the SSE implementation, which does do the masking. The OpenCL spec from which this feature is taken does explicitly say that the masking should happen; our current documentation in extend.texi isn't quite as clear on this point.
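To make the intended masking concrete, here is a minimal scalar sketch of the semantics described above. It is not the code from tree-vect-generic.c; the function names shuffle1_ref and shuffle2_ref are hypothetical, and it assumes int elements and a power-of-two element count, so that masking reduces to a bitwise AND.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference for a one-input shuffle (illustration only).
   Each selector is masked to the range of the input, so an
   out-of-range index wraps around instead of reading past the end.  */
void shuffle1_ref(const int *in, const unsigned *sel, int *out, size_t n)
{
    /* Masking with n - 1 is only a modulo when n is a power of two.  */
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t i = 0; i < n; i++)
        out[i] = in[sel[i] & (n - 1)];
}

/* Hypothetical reference for a two-input shuffle (illustration only).
   Selectors index the concatenation of a and b, so they are masked
   to the range [0, 2n).  */
void shuffle2_ref(const int *a, const int *b, const unsigned *sel,
                  int *out, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t i = 0; i < n; i++) {
        size_t k = sel[i] & (2 * n - 1);
        out[i] = k < n ? a[k] : b[k - n];
    }
}
```

With four elements, a selector of 5 in the one-input form picks element 1, and a selector of 9 in the two-input form picks element 1 of the first input.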
The second problem is that the SSE implementation was broken. It simply didn't work much of the time. This was masked by...

The third problem is that the test cases weren't testing what they were intended to test. In particular, with optimization the compiler was able to constant-propagate away many of the shuffling and tests. Which might be an interesting missed-optimization test, had they actually been constructed differently. But what actually needed testing first is that the operation actually works at all.

Tested on x86_64 with check-gcc//unix/{,-mssse3,-msse4}

Hopefully one of the AMD guys can test on a bulldozer with -mxop?

Committed.


r~


Richard Henderson (3):
  Fix lower_vec_shuffle.
  i386: Rewrite ix86_expand_vshuffle.
  Fix vect-shuffle-* test cases.

 gcc/ChangeLog                              |   14 +
 gcc/config/i386/i386-protos.h              |    2 +-
 gcc/config/i386/i386.c                     |  208 +++++++--------
 gcc/config/i386/sse.md                     |    4 +-
 gcc/testsuite/ChangeLog                    |   11 +
 .../gcc.c-torture/execute/vect-shuffle-1.c |   98 +++++---
 .../gcc.c-torture/execute/vect-shuffle-2.c |   96 +++++---
 .../gcc.c-torture/execute/vect-shuffle-3.c |   90 ++++---
 .../gcc.c-torture/execute/vect-shuffle-4.c |   99 ++++----
 .../gcc.c-torture/execute/vect-shuffle-5.c |  113 ++++----
 .../gcc.c-torture/execute/vect-shuffle-6.c |   64 +++++
 .../gcc.c-torture/execute/vect-shuffle-7.c |   70 +++++
 .../gcc.c-torture/execute/vect-shuffle-8.c |   55 ++++
 gcc/tree-vect-generic.c                    |  276 +++++++-------------
 14 files changed, 698 insertions(+), 502 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-6.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-7.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-8.c

--
1.7.6.4