On Thu, May 19, 2016 at 10:33:41AM -0500, Segher Boessenkool wrote:
> On Thu, May 19, 2016 at 10:53:41AM -0400, Michael Meissner wrote:
> > GCC 6.1 added support for the XXPERM instruction for the PowerPC ISA 3.0.
> > The XXPERM instruction is essentially a 4 operand instruction, with only 3
> > operands in the instruction (the target register overlaps with the first
> > input register).  The Power9 hardware has fusion support where if the
> > instruction that precedes the XXPERM is a XXLOR move instruction to set the
> > first input argument, it is fused with the XXPERM.  I added code to support
> > this fusion.
> >
> > Unfortunately, in running the testsuite on the power9 simulator, we
> > discovered that the test gcc.c-torture/execute/pr56866.c would fail because
> > the fusion alternatives confused the register allocator and/or the passes
> > after the register allocator.  This patch removes the explicit fusion
> > support from XXPERM.
>
> Okay.  Please keep the PR open until that problem is fixed.  It also
> shouldn't be "target" category, if the problem is RA.
>
> > In addition, ISA 3.0 added XXPERMR and VPERMR instructions for little
> > endian support where the permute vector reverses the bytes.  This patch
> > adds support for XXPERMR/VPERMR.
>
> Please send that as a separate patch, it has nothing to do with the PR.
>
> > +      x = gen_rtx_UNSPEC (mode,
> > +                          gen_rtvec (3, target, reg,
>
> Trailing space.
>
> > +  if (TARGET_P9_VECTOR)
> > +    {
> > +      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
>
> And another.
>
> > +     The VNAND is preferred for future fusion opportunities.  */
> > +  notx = gen_rtx_NOT (V16QImode, sel);
> > +  iorx = (TARGET_P8_VECTOR
> > +          ? gen_rtx_IOR (V16QImode, notx, notx)
> > +          : gen_rtx_AND (V16QImode, notx, notx));
> > +  emit_insn (gen_rtx_SET (norreg, iorx));
> > +
>
> Some more.
>
> > +/* { dg-final { scan-assembler "vpermr\|xxpermr" } } */
>
> Tab in the middle of the line.

Here are the patches for xxpermr/vpermr support that are broken out from fixing
the xxperm fusion bug.  I have built a compiler with these patches (and the
xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
to add to GCC 7 and eventually to GCC 6.2?

[gcc]
2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>
            Kelvin Nilsen  <kel...@gcc.gnu.org>

        * config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
        vpermr/xxpermr on ISA 3.0.
        (altivec_expand_vec_perm_le): Likewise.
        * config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
        (altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
        ISA 3.0.

[gcc/testsuite]
2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>
            Kelvin Nilsen  <kel...@gcc.gnu.org>

        * gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
        support.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
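
For readers who want to see why the existing pre-ISA 3.0 path swaps the two
inputs and inverts the selector, here is a small stand-alone sketch.  It is a
simplified byte-level model only, not GCC code and not a model of
VPERMR/XXPERMR themselves.  The names model_perm, model_perm_le_fallback,
reference_perm_le and count_le_mismatches are hypothetical, and the model
assumes the usual convention that result byte i is byte (sel[i] & 31) of the
32-byte concatenation of the two inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed model of a big-endian-indexed byte permute: result byte i is
   byte (sel[i] & 31) of the 32-byte concatenation a || b.  */
static void
model_perm (const uint8_t a[16], const uint8_t b[16],
            const uint8_t sel[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    {
      unsigned idx = sel[i] & 31;
      out[i] = idx < 16 ? a[idx] : b[idx - 16];
    }
}

/* Invert one selector byte.  NAND (x, x) and NOR (x, x) both reduce to ~x,
   which is why either instruction can be used for the inversion.  */
static uint8_t
invert_selector_byte (uint8_t x)
{
  uint8_t nand = (uint8_t) ~(x & x);
  uint8_t nor = (uint8_t) ~(x | x);
  assert (nand == nor);
  return nand;
}

/* Fallback path as described in the patch comments: permute with the
   operands reversed and the selector inverted.  */
static void
model_perm_le_fallback (const uint8_t a[16], const uint8_t b[16],
                        const uint8_t sel[16], uint8_t out[16])
{
  uint8_t inv[16];
  for (int i = 0; i < 16; i++)
    inv[i] = invert_selector_byte (sel[i]);
  model_perm (b, a, inv, out);
}

/* Reference result using little-endian element numbering, where element j
   of a register is stored in byte 15 - j.  */
static void
reference_perm_le (const uint8_t a[16], const uint8_t b[16],
                   const uint8_t sel[16], uint8_t out[16])
{
  for (int j = 0; j < 16; j++)
    {
      unsigned idx = sel[15 - j] & 31;   /* selector element j */
      out[15 - j] = idx < 16 ? a[15 - idx] : b[15 - (idx - 16)];
    }
}

/* Compare both paths over a deterministic set of selectors and return the
   number of selectors for which they disagree.  */
static int
count_le_mismatches (void)
{
  uint8_t a[16], b[16], sel[16], got[16], want[16];
  uint32_t state = 12345u;   /* arbitrary seed for a small LCG */
  int mismatches = 0;

  for (int i = 0; i < 16; i++)
    {
      a[i] = (uint8_t) (0xA0 + i);
      b[i] = (uint8_t) (0xB0 + i);
    }

  for (int trial = 0; trial < 1000; trial++)
    {
      for (int i = 0; i < 16; i++)
        {
          state = state * 1103515245u + 12345u;
          sel[i] = (uint8_t) (state >> 16);
        }
      model_perm_le_fallback (a, b, sel, got);
      reference_perm_le (a, b, sel, want);
      if (memcmp (got, want, sizeof got) != 0)
        mismatches++;
    }
  return mismatches;
}
```

Calling count_le_mismatches () from a test driver should return 0 under this
model: since (~s) & 31 equals 31 - (s & 31), swapping the inputs and
complementing the selector gives the little-endian element selection.  The
ISA 3.0 path in the patch exists so that the separate inversion instruction
is no longer needed.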

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 236608)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6863,21 +6863,29 @@ rs6000_expand_vector_set (rtx target, rt
                           gen_rtvec (3, target, reg,
                                      force_reg (V16QImode, x)),
                           UNSPEC_VPERM);
-  else 
+  else
     {
-      /* Invert selector.  We prefer to generate VNAND on P8 so
-         that future fusion opportunities can kick in, but must
-         generate VNOR elsewhere.  */
-      rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-      rtx iorx = (TARGET_P8_VECTOR
-                  ? gen_rtx_IOR (V16QImode, notx, notx)
-                  : gen_rtx_AND (V16QImode, notx, notx));
-      rtx tmp = gen_reg_rtx (V16QImode);
-      emit_insn (gen_rtx_SET (tmp, iorx));
-
-      /* Permute with operands reversed and adjusted selector.  */
-      x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
-                          UNSPEC_VPERM);
+      if (TARGET_P9_VECTOR)
+        x = gen_rtx_UNSPEC (mode,
+                            gen_rtvec (3, target, reg,
+                                       force_reg (V16QImode, x)),
+                            UNSPEC_VPERMR);
+      else
+        {
+          /* Invert selector.  We prefer to generate VNAND on P8 so
+             that future fusion opportunities can kick in, but must
+             generate VNOR elsewhere.  */
+          rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+          rtx iorx = (TARGET_P8_VECTOR
+                      ? gen_rtx_IOR (V16QImode, notx, notx)
+                      : gen_rtx_AND (V16QImode, notx, notx));
+          rtx tmp = gen_reg_rtx (V16QImode);
+          emit_insn (gen_rtx_SET (tmp, iorx));
+
+          /* Permute with operands reversed and adjusted selector.  */
+          x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
+                              UNSPEC_VPERM);
+        }
     }
 
   emit_insn (gen_rtx_SET (target, x));
@@ -34365,17 +34373,25 @@ altivec_expand_vec_perm_le (rtx operands
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);
 
-  /* Invert the selector with a VNAND if available, else a VNOR.
-     The VNAND is preferred for future fusion opportunities.  */
-  notx = gen_rtx_NOT (V16QImode, sel);
-  iorx = (TARGET_P8_VECTOR
-          ? gen_rtx_IOR (V16QImode, notx, notx)
-          : gen_rtx_AND (V16QImode, notx, notx));
-  emit_insn (gen_rtx_SET (norreg, iorx));
+  if (TARGET_P9_VECTOR)
+    {
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
+                               UNSPEC_VPERMR);
+    }
+  else
+    {
+      /* Invert the selector with a VNAND if available, else a VNOR.
+         The VNAND is preferred for future fusion opportunities.  */
+      notx = gen_rtx_NOT (V16QImode, sel);
+      iorx = (TARGET_P8_VECTOR
+              ? gen_rtx_IOR (V16QImode, notx, notx)
+              : gen_rtx_AND (V16QImode, notx, notx));
+      emit_insn (gen_rtx_SET (norreg, iorx));
 
-  /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
-                           UNSPEC_VPERM);
+      /* Permute with operands reversed and adjusted selector.  */
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+                               UNSPEC_VPERM);
+    }
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md        (revision 236608)
+++ gcc/config/rs6000/altivec.md        (working copy)
@@ -58,6 +58,7 @@ (define_c_enum "unspec"
    UNSPEC_VSUM2SWS
    UNSPEC_VSUMSWS
    UNSPEC_VPERM
+   UNSPEC_VPERMR
    UNSPEC_VPERM_UNS
    UNSPEC_VRFIN
    UNSPEC_VCFUX
@@ -2032,6 +2033,19 @@ (define_expand "vec_perm_constv16qi"
     FAIL;
 })
 
+(define_insn "*altivec_vpermr_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                    (match_operand:VM 2 "register_operand" "v,wo")
+                    (match_operand:V16QI 3 "register_operand" "v,wo")]
+                   UNSPEC_VPERMR))]
+  "TARGET_P9_VECTOR"
+  "@
+   vpermr %0,%1,%2,%3
+   xxpermr %x0,%x2,%x3"
+  [(set_attr "type" "vecperm")
+   (set_attr "length" "4")])
+
 (define_insn "altivec_vrfip"           ; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]