On Wed, Jun 22, 2016 at 09:22:22AM -0500, Segher Boessenkool wrote:
> Don't give up so easily? ;-)
>
> The predicate should be tightened, the expander should use a new predicate
> that allows all those other things. The hardest part is figuring a good
> name for it ;-)

This patch should fix the problem. The predicate no longer allows constants
in the arguments. Combine will create one of the vec_duplicate patterns with a
constant integer, which will generate VSPLTIS<x> or XXSPLTIB/etc. I also
tightened the memory requirements to only allow indexed memory forms
during/after register allocation, since the instruction only uses indexed
addressing.

I bootstrapped the compiler and ran make check with no regressions on a little
endian power8 system. Can I check it into trunk and, after an appropriate
waiting period, check it into GCC 6.x if there are no issues?

[gcc]
2016-06-22  Michael Meissner  <meiss...@linux.vnet.ibm.com>
	    Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* config/rs6000/predicates.md (splat_input_operand): Rework.
	Don't allow constants, since the caller insns don't support
	constants.  During and after register allocation, only allow
	indexed or indirect addresses, and not general addresses.  Only
	allow modes supported by the hardware.
	* config/rs6000/rs6000.c (xxspltib_constant_p): Update usage
	comment.  Move check for using VSPLTIS<x> to a common location,
	instead of doing it in two different places.

[gcc/testsuite]
2016-06-22  Michael Meissner  <meiss...@linux.vnet.ibm.com>
	    Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-splat-5.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md (revision 237715)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1056,27 +1056,34 @@ (define_predicate "input_operand"
 
 ;; Return 1 if this operand is a valid input for a vsx_splat insn.
 (define_predicate "splat_input_operand"
-  (match_code "symbol_ref,const,reg,subreg,mem,
-               const_double,const_wide_int,const_vector,const_int")
+  (match_code "reg,subreg,mem")
 {
+  machine_mode vmode;
+
+  if (mode == DFmode)
+    vmode = V2DFmode;
+  else if (mode == DImode)
+    vmode = V2DImode;
+  else if (mode == SImode && TARGET_P9_VECTOR)
+    vmode = V4SImode;
+  else if (mode == SFmode && TARGET_P9_VECTOR)
+    vmode = V4SFmode;
+  else
+    return false;
+
   if (MEM_P (op))
     {
+      rtx addr = XEXP (op, 0);
+
       if (! volatile_ok && MEM_VOLATILE_P (op))
         return 0;
-      if (mode == DFmode)
-        mode = V2DFmode;
-      else if (mode == DImode)
-        mode = V2DImode;
-      else if (mode == SImode && TARGET_P9_VECTOR)
-        mode = V4SImode;
-      else if (mode == SFmode && TARGET_P9_VECTOR)
-        mode = V4SFmode;
+
+      if (reload_in_progress || lra_in_progress || reload_completed)
+        return indexed_or_indirect_address (addr, vmode);
       else
-        gcc_unreachable ();
-      return memory_address_addr_space_p (mode, XEXP (op, 0),
-                                          MEM_ADDR_SPACE (op));
+        return memory_address_addr_space_p (vmode, addr, MEM_ADDR_SPACE (op));
     }
-  return input_operand (op, mode);
+  return gpc_reg_operand (op, mode);
 })
 
 ;; Return true if OP is a non-immediate operand and not an invalid
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 237715)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -6282,10 +6282,7 @@ gen_easy_altivec_constant (rtx op)
    Return the number of instructions needed (1 or 2) into the address pointed
    via NUM_INSNS_PTR.
 
-   If NOSPLIT_P, only return true for constants that only generate the XXSPLTIB
-   instruction and can go in any VSX register.  If !NOSPLIT_P, only return true
-   for constants that generate XXSPLTIB and need a sign extend operation, which
-   restricts us to the Altivec registers.
+   Return the constant that is being split via CONSTANT_PTR.
 
    Allow either (vec_const [...]) or (vec_duplicate <const>).  If OP is a valid
    XXSPLTIB constant, return the constant being set via the CONST_PTR
@@ -6355,13 +6352,6 @@ xxspltib_constant_p (rtx op,
           if (value != INTVAL (element))
             return false;
         }
-
-      /* See if we could generate vspltisw/vspltish directly instead of
-         xxspltib + sign extend.  Special case 0/-1 to allow getting
-         any VSX register instead of an Altivec register.  */
-      if (!IN_RANGE (value, -1, 0) && EASY_VECTOR_15 (value)
-          && (mode == V4SImode || mode == V8HImode))
-        return false;
     }
 
   /* Handle integer constants being loaded into the upper part of the VSX
@@ -6389,6 +6379,13 @@ xxspltib_constant_p (rtx op,
   else
     return false;
 
+  /* See if we could generate vspltisw/vspltish directly instead of xxspltib +
+     sign extend.  Special case 0/-1 to allow getting any VSX register instead
+     of an Altivec register.  */
+  if ((mode == V4SImode || mode == V8HImode) && !IN_RANGE (value, -1, 0)
+      && EASY_VECTOR_15 (value))
+    return false;
+
   /* Return # of instructions and the constant byte for XXSPLTIB.  */
   if (mode == V16QImode)
     *num_insns_ptr = 1;
Index: gcc/testsuite/gcc.target/powerpc/p9-splat-5.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-splat-5.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-splat-5.c (working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-final { scan-assembler "vspltish" } } */
+/* { dg-final { scan-assembler-not "xxspltib" } } */
+
+/* Make sure we don't use an inefficient sequence for small integer splat.  */
+
+#include <altivec.h>
+
+vector short
+foo ()
+{
+  return vec_splat_s16 (5);
+}
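
To show the relocated VSPLTIS<x> check in isolation, here is a minimal standalone sketch in plain C. It is not GCC code and is not part of the patch: the enum and the function names (model_mode, model_easy_vector_15, model_prefer_vspltis, model_check_examples) are hypothetical stand-ins, and the -16..15 range is my assumption about what EASY_VECTOR_15 accepts, based on the 5-bit signed immediate of vspltisw/vspltish. It only mirrors the condition under which xxspltib_constant_p returns false so that the vspltis form is used instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vector modes; not GCC's machine_mode.  */
enum model_mode { MODEL_V16QI, MODEL_V8HI, MODEL_V4SI, MODEL_V2DI };

/* Assumption: EASY_VECTOR_15 accepts the 5-bit signed immediate range
   of the vspltis<x> instructions, i.e. -16..15.  */
static bool
model_easy_vector_15 (long value)
{
  return value >= -16 && value <= 15;
}

/* Return true when a splat of VALUE in MODE should be left to
   vspltisw/vspltish rather than xxspltib + sign extend.  0 and -1 are
   excluded so that they can still be placed in any VSX register.  */
static bool
model_prefer_vspltis (enum model_mode mode, long value)
{
  bool zero_or_minus_one = (value >= -1 && value <= 0);

  return ((mode == MODEL_V4SI || mode == MODEL_V8HI)
          && !zero_or_minus_one
          && model_easy_vector_15 (value));
}

/* A few sample inputs, including the case from p9-splat-5.c.  */
static void
model_check_examples (void)
{
  assert (model_prefer_vspltis (MODEL_V8HI, 5));    /* vec_splat_s16 (5) */
  assert (!model_prefer_vspltis (MODEL_V8HI, -1));  /* special-cased */
  assert (!model_prefer_vspltis (MODEL_V16QI, 5));  /* byte splat */
  assert (!model_prefer_vspltis (MODEL_V4SI, 100)); /* outside -16..15 */
}
```

In this model, the V8HImode splat of 5 from the new test takes the vspltish path, which is what the scan-assembler directives in p9-splat-5.c look for.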