On Wed, Mar 26, 2014 at 3:50 PM, Michael Meissner <meiss...@linux.vnet.ibm.com> wrote:
> This patch adds support for adding a builtin to generate the vbpermq
> instruction on ISA 2.07.  This instruction takes a vector in the Altivec
> register set, and returns a 64-bit value in the upper part of the register,
> and 0 in the lower part of the register.
>
> The output is explicitly a vector, since the documentation for the instruction
> says that to do a permutation of all 8 bits, you need to do 2 vbpermq's, one
> with the high bit in each byte within the vector set, and the other with the
> high bit cleared.
>
>     vbpermq v6,v1,v2    # select from high-order half of Q
>     vxor    v0,v1,v4    # adjust index values
>     vbpermq v5,v0,v3    # select from low-order half of Q
>     vor     v6,v6,v5    # merge the two selections
>
> In writing the tests, I noticed that the vec_extract code did not have
> optimizations for getting 64-bit data out, if the vector element happens to be
> 0 on big endian systems, and 1 on little endian systems.  So I added
> optimizations for register/register move, including using the mfvsrd
> instruction to transfer the final result to a GPR.  While I was there, I added
> vec_extract optimizations to do a 64-bit store and I combined the big endian
> and little endian vec_extract load optimization.
>
> I built a big endian Spec 2006 suite with this compiler, and compared it to
> the trunk compiler without the changes.  Only 3 benchmarks (gamess, dealII,
> and povray) generated vec_extracts that became moves instead of permutes.  I
> ran the tests on a power7 system, and the differences in run time were in the
> noise level.  None of the spec benchmarks generated vec_extract that was a
> load or a store.
>
> I did bootstraps on a big endian power7 system, a big endian power8 system,
> and a little endian power8 system with no regressions in the test suite.  Are
> these patches ok to install on the trunk?  I would like to apply these patches
> there as well, when all of the ISA 2.07 changes are present in the 4.8 branch.
> Can I apply these patches?
>
> [gcc]
> 2014-03-26  Michael Meissner  <meiss...@linux.vnet.ibm.com>
>
>         * config/rs6000/constraints.md (wD constraint): New constraint to
>         match the constant integer to get the top DImode/DFmode out of a
>         vector in a VSX register.
>
>         * config/rs6000/predicates.md (vsx_scalar_64bit): New predicate to
>         match the constant integer to get the top DImode/DFmode out of a
>         vector in a VSX register.
>
>         * config/rs6000/rs6000-builtins.def (VBPERMQ): Add vbpermq builtin
>         for ISA 2.07.
>
>         * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
>         vbpermq builtins.
>
>         * config/rs6000/rs6000.c (rs6000_debug_reg_global): If
>         -mdebug=reg, print value of VECTOR_ELEMENT_SCALAR_64BIT.
>
>         * config/rs6000/vsx.md (vsx_extract_<mode>, V2DI/V2DF modes):
>         Optimize vec_extract of 64-bit values, where the value being
>         extracted is in the top word, where we can use scalar
>         instructions.  Add direct move and store support.  Combine the big
>         endian/little endian vector select load support into a single
>         insn.
>         (vsx_extract_<mode>_internal1): Likewise.
>         (vsx_extract_<mode>_internal2): Likewise.
>         (vsx_extract_<mode>_load): Likewise.
>         (vsx_extract_<mode>_store): Likewise.
>         (vsx_extract_<mode>_zero): Delete, big and little endian insns are
>         combined into vsx_extract_<mode>_load.
>         (vsx_extract_<mode>_one_le): Likewise.
>
>         * config/rs6000/rs6000.h (VECTOR_ELEMENT_SCALAR_64BIT): Macro to
>         define the top 64-bit vector element.
>
>         * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wD
>         constraint.
>
> [gcc/testsuite]
> 2014-03-26  Michael Meissner  <meiss...@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/p8vector-vbpermq.c: New test to test the
>         vbpermq builtin.
>
>         * gcc.target/powerpc/vsx-extract-1.c: New test to test VSX
>         vec_select optimizations.
>         * gcc.target/powerpc/vsx-extract-2.c: Likewise.
>         * gcc.target/powerpc/vsx-extract-3.c: Likewise.

Okay.  Good to add the optimizations.

I notice that you emit nop with a comment after a "#" character.  I
notice that you also added that to the POWER8 vector fusion peepholes.
Is it safe to assume that all assemblers for PowerPC will consider all
characters after a "#" to be a comment?

I would like to make sure there are no other problems with the patch
before backporting to 4.8.  It wasn't included in the group of patches
for 4.8 that have been widely tested.

Thanks, David
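
For readers who do not have an ISA 2.07 machine at hand, the behaviour the
quoted patch description gives for vbpermq can be modeled in portable C.
This is only a sketch based on my reading of the instruction: each of the 16
selector bytes picks one bit of the 128-bit source (bit 0 being the most
significant), selectors of 128 or more yield 0, and the 16 gathered bits end
up in the low end of the upper doubleword with everything else zero.  The
names vec128, get_bit and vbpermq_model are hypothetical and are not part of
the patch or of GCC.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register.  bytes[0] is the most
   significant byte, so bit 0 of the register is the top bit of bytes[0].  */
typedef struct { uint8_t bytes[16]; } vec128;

/* Return bit INDEX (0 = most significant bit) of a 128-bit value.  */
static unsigned
get_bit (const vec128 *v, unsigned index)
{
  return (v->bytes[index / 8] >> (7 - index % 8)) & 1u;
}

/* Assumed vbpermq semantics: selector byte i chooses the source bit that
   becomes result bit i of a 16-bit field; selectors >= 128 produce 0.  */
static vec128
vbpermq_model (const vec128 *src, const vec128 *sel)
{
  vec128 out = { { 0 } };
  uint16_t perm = 0;

  for (int i = 0; i < 16; i++)
    {
      unsigned index = sel->bytes[i];
      unsigned bit = index < 128 ? get_bit (src, index) : 0;
      perm = (uint16_t) ((perm << 1) | bit);
    }

  /* The upper doubleword is bytes[0..7]; the 16 gathered bits occupy its
     low-order end.  The lower doubleword (bytes[8..15]) stays zero.  */
  out.bytes[6] = (uint8_t) (perm >> 8);
  out.bytes[7] = (uint8_t) (perm & 0xff);
  return out;
}
```

Because a single selector byte can only address 128 source bits, gathering
from a 256-bit quantity needs the two-instruction sequence shown in the
quoted message, with the second pass using adjusted index values.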