Hi,

Most of our constant vector permutes use the vperm instructions, but for
V2DImode and V2DFmode we use xxpermdi.  This patch adjusts the generated
xxpermdi so that it is correct for little endian, which fixes failures
of the test cases gcc.dg/torture/vshuf-v2d[fi].c.  Note that we can't fix
this directly in the pattern for xxpermdi, because that pattern is also
used by the corresponding intrinsic.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Ok for trunk?

Thanks,
Bill


2013-11-22  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct
	for little endian.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 205243)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -30021,6 +30021,21 @@ rs6000_expand_vec_perm_const_1 (rtx target, rtx op
           gcc_assert (GET_MODE_NUNITS (vmode) == 2);
           dmode = mode_for_vector (GET_MODE_INNER (vmode), 4);
 
+          /* For little endian, swap operands and invert/swap selectors
+             to get the correct xxpermdi.  The operand swap sets up the
+             inputs as a little endian array.  The selectors are swapped
+             because they are defined to use big endian ordering.  The
+             selectors are inverted to get the correct doublewords for
+             little endian ordering.  */
+          if (!BYTES_BIG_ENDIAN)
+            {
+              int n;
+              perm0 = 3 - perm0;
+              perm1 = 3 - perm1;
+              n = perm0, perm0 = perm1, perm1 = n;
+              x = op0, op0 = op1, op1 = x;
+            }
+
           x = gen_rtx_VEC_CONCAT (dmode, op0, op1);
           v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1));
           x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v));
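
For anyone who wants to convince themselves that the operand swap plus
the selector invert/swap is the right combination, here is a small
standalone model in C.  It is only a sketch and not GCC code: the names
vreg, select_be, le_elem, permute_for_le and count_mismatches are made
up for illustration, and it assumes that a register holds two
doublewords numbered in big endian order by the hardware while little
endian code numbers the same doublewords in reverse.

```c
/* A two-doubleword vector register, stored in the hardware's big endian
   order: dw[0] is the most significant doubleword.  */
typedef struct { long dw[2]; } vreg;

/* Model of a VEC_SELECT over a VEC_CONCAT with big endian numbering:
   selectors 0-1 pick a doubleword of A, selectors 2-3 pick one of B.  */
vreg
select_be (vreg a, vreg b, int sel0, int sel1)
{
  long cat[4] = { a.dw[0], a.dw[1], b.dw[0], b.dw[1] };
  vreg r = { { cat[sel0], cat[sel1] } };
  return r;
}

/* Element I of V as little endian code numbers it: element 0 is the
   least significant doubleword, i.e. hardware doubleword 1.  */
long
le_elem (vreg v, int i)
{
  return v.dw[1 - i];
}

/* Apply the same adjustment as the patch (invert the selectors, swap
   them, swap the operands), then run the big endian model.  */
vreg
permute_for_le (vreg op0, vreg op1, int perm0, int perm1)
{
  int n;
  vreg x;

  perm0 = 3 - perm0;
  perm1 = 3 - perm1;
  n = perm0, perm0 = perm1, perm1 = n;
  x = op0, op0 = op1, op1 = x;
  return select_be (op0, op1, perm0, perm1);
}

/* Try every selector pair and count the cases where the adjusted
   permute does not match the little endian view of the inputs.  */
int
count_mismatches (void)
{
  vreg op0 = { { 11, 10 } };	/* LE elements: op0[0] = 10, op0[1] = 11.  */
  vreg op1 = { { 13, 12 } };	/* LE elements: op1[0] = 12, op1[1] = 13.  */
  long want[4];
  int perm0, perm1, bad = 0;

  /* The four-element array that little endian code expects to index.  */
  want[0] = le_elem (op0, 0);
  want[1] = le_elem (op0, 1);
  want[2] = le_elem (op1, 0);
  want[3] = le_elem (op1, 1);

  for (perm0 = 0; perm0 < 4; perm0++)
    for (perm1 = 0; perm1 < 4; perm1++)
      {
	vreg r = permute_for_le (op0, op1, perm0, perm1);
	if (le_elem (r, 0) != want[perm0] || le_elem (r, 1) != want[perm1])
	  bad++;
      }
  return bad;
}
```

Calling count_mismatches () from a small main should report zero
mismatches under this model.  The model accepts any selector in 0..3
for either position, which is looser than the real instruction; it is
only meant to illustrate the index arithmetic in the comment above.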