The first problem is that the generic lowering to a scalar implementation
didn't match the documentation, in that the shuffle indices are to be
masked to the range of the inputs.  Or, perhaps more exactly, the generic
lowering didn't match the SSE implementation, which does do the masking.
The OpenCL spec from which this feature is taken does explicitly say
that the masking should happen; our current documentation in extend.texi
isn't quite as clear on this point.
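
To make the masking rule concrete, here is a minimal scalar sketch of the
intended semantics.  It assumes a 4-element vector of int, and the names
shuffle1_ref, shuffle2_ref and check_masking are made up for illustration;
this is not the code in tree-vect-generic.c.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* illustrative vector length; must be a power of two */

/* One-input shuffle: each index is reduced modulo N before use. */
static void shuffle1_ref(const int *v, const unsigned *sel, int *out)
{
    for (size_t i = 0; i < N; i++)
        out[i] = v[sel[i] & (N - 1)];
}

/* Two-input shuffle: indices are reduced modulo 2*N.  After masking,
   0..N-1 select from v0 and N..2N-1 select from v1. */
static void shuffle2_ref(const int *v0, const int *v1,
                         const unsigned *sel, int *out)
{
    for (size_t i = 0; i < N; i++) {
        unsigned k = sel[i] & (2 * N - 1);
        out[i] = k < N ? v0[k] : v1[k - N];
    }
}

/* Out-of-range selector values must wrap rather than read garbage. */
static void check_masking(void)
{
    const int a[N] = {10, 11, 12, 13};
    const int b[N] = {20, 21, 22, 23};
    const unsigned sel[N] = {0, 5, 10, 15};
    int r[N];

    shuffle1_ref(a, sel, r);     /* masked indices: 0, 1, 2, 3 */
    assert(r[0] == 10 && r[1] == 11 && r[2] == 12 && r[3] == 13);

    shuffle2_ref(a, b, sel, r);  /* masked indices: 0, 5, 2, 7 */
    assert(r[0] == 10 && r[1] == 21 && r[2] == 12 && r[3] == 23);
}
```

Any lowering, generic or target-specific, should agree with such a model
for every selector value, including those outside the nominal range.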

The second problem is that the SSE implementation was broken.  It simply
didn't work much of the time.  This was masked by...

The third problem is that the test cases weren't testing what they were
intended to test.  In particular, with optimization the compiler was
able to constant-propagate away many of the shuffles and their checks.
That might have made an interesting missed-optimization test, had the
tests been constructed differently.  But what needed testing first is
that the operation works at all.
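
One way to keep the optimizer from folding such a test is to hide the
operands behind volatile objects, so that the values are only known at run
time.  The sketch below illustrates the idea with a scalar stand-in; the
names (shuffle_under_test, run_shuffle_test, and so on) are hypothetical,
and this is not necessarily how the rewritten vect-shuffle-* tests do it.

```c
enum { LEN = 4 };

/* volatile forces each element to be loaded at run time, so the
   compiler cannot constant-fold the shuffle or the comparison. */
static volatile int input[LEN] = {10, 11, 12, 13};
static volatile unsigned selector[LEN] = {3, 6, 1, 4};
static const int expected[LEN] = {13, 12, 11, 10};

/* Scalar stand-in for the operation under test (hypothetical name). */
static void shuffle_under_test(const int *v, const unsigned *sel, int *out)
{
    for (int i = 0; i < LEN; i++)
        out[i] = v[sel[i] & (LEN - 1)];
}

/* Returns the number of mismatching elements; 0 means success. */
static int run_shuffle_test(void)
{
    int v[LEN], r[LEN], bad = 0;
    unsigned sel[LEN];

    /* Copy through the volatile objects to obtain opaque operands. */
    for (int i = 0; i < LEN; i++) {
        v[i] = input[i];
        sel[i] = selector[i];
    }
    shuffle_under_test(v, sel, r);
    for (int i = 0; i < LEN; i++)
        bad += r[i] != expected[i];
    return bad;
}
```

A torture-style test would call abort () when run_shuffle_test returns
nonzero.  Note that the selector deliberately includes out-of-range values
(6 and 4), so the masking is exercised as well.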

Tested on x86_64 with

  check-gcc//unix/{,-mssse3,-msse4}

Hopefully one of the AMD guys can test on a Bulldozer with -mxop?

Committed.


r~


Richard Henderson (3):
  Fix lower_vec_shuffle.
  i386: Rewrite ix86_expand_vshuffle.
  Fix vect-shuffle-* test cases.

 gcc/ChangeLog                                      |   14 +
 gcc/config/i386/i386-protos.h                      |    2 +-
 gcc/config/i386/i386.c                             |  208 +++++++--------
 gcc/config/i386/sse.md                             |    4 +-
 gcc/testsuite/ChangeLog                            |   11 +
 .../gcc.c-torture/execute/vect-shuffle-1.c         |   98 +++++---
 .../gcc.c-torture/execute/vect-shuffle-2.c         |   96 +++++---
 .../gcc.c-torture/execute/vect-shuffle-3.c         |   90 ++++---
 .../gcc.c-torture/execute/vect-shuffle-4.c         |   99 ++++----
 .../gcc.c-torture/execute/vect-shuffle-5.c         |  113 ++++----
 .../gcc.c-torture/execute/vect-shuffle-6.c         |   64 +++++
 .../gcc.c-torture/execute/vect-shuffle-7.c         |   70 +++++
 .../gcc.c-torture/execute/vect-shuffle-8.c         |   55 ++++
 gcc/tree-vect-generic.c                            |  276 +++++++-------------
 14 files changed, 698 insertions(+), 502 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-6.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-7.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/vect-shuffle-8.c

-- 
1.7.6.4
