On Wed, Oct 30, 2013 at 6:55 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote: > Hi, > > When working around the peculiar little-endian semantics of the vperm > instruction, our usual fix is to complement the permute control vector > and swap the order of the two vector input operands, so that we get a > double-wide vector in the proper order. We don't want to swap the > operands when we are expanding a mult_highpart operation, however, as > the two input operands are not to be interpreted as a double-wide > vector. Instead they represent odd and even elements, and swapping the > operands gets the odd and even elements reversed in the final result. > > The permute for this case is generated by target-neutral code in > optabs.c: expand_mult_highpart (). We obviously can't change that code > directly. However, we can redirect the logic from the "case 2" method > to target-specific code by implementing expansions for the > umul<mode>3_highpart and smul<mode>3_highpart operations. I've done > this, with the expansions acting exactly as expand_mult_highpart does > today, with the exception that it swaps the input operands to the call > to expand_vec_perm when we are generating little-endian code. We will > later swap them back to their original position in the code in rs6000.c: > altivec_expand_vec_perm_const_le (). > > The change has no intended effect when generating big-endian code. > > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new > regressions. This fixes the gcc.dg/vect/pr51581-4.c test failure for > little endian. Ok for trunk? > > Thanks, > Bill > > > 2013-10-30 Bill Schmidt <wschm...@linux.vnet.ibm.com> > > * config/rs6000/rs6000-protos.h (altivec_expand_mul_highpart): New > prototype. > * config/rs6000/rs6000.c (altivec_expand_mul_highpart): New. > * config/rs6000/altivec.md (umul<mode>3_highpart): New. > (smul_<mode>3_highpart): New.
I really do not like duplicating this code. I think that you need to explore with the community the possibility of including a hook in the general code to handle the strangeness of PPC LE vector semantics. This is asking for problems if the generic code is updated / modified / fixed. - David