On Wed, Nov 20, 2013 at 2:00 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> Hi,
>
> This patch corrects the various vsx_set_* and vsx_extract_* patterns to
> work correctly with little endian.  For the most part this requires the
> usual "subtract from N-1" modification, where N is the number of
> elements.
>
> Extracting element zero for big endian V2DI or V2DF mode is optimized
> using the scalar register equivalence.  Since we can similarly optimize
> extraction of element one for little endian V2DI or V2DF mode, I added a
> variant that does this.  I am not sure how useful this is, and we can
> remove it if you like.
>
> The existing testcase gcc.target/powerpc/pr48258-1.c fails when counting
> the number of occurrences of xxsldwi.  It expects to see 6, but we
> generate 9 of them for LE.  This is because there are three extracts of
> element zero of a V4SF in the testcase.  The scalar equivalence allows
> us to avoid the xxsldwi in BE but not in LE.  Therefore I've disabled
> this test for little endian.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2013-11-20  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
>
>         * config/rs6000/vsx.md (vsx_set_<mode>): Adjust for little endian.
>         (vsx_extract_<mode>): Likewise.
>         (*vsx_extract_<mode>_one_le): New LE variant on
>         *vsx_extract_<mode>_zero.
>         (vsx_extract_v4sf): Adjust for little endian.
>
>
> gcc/testsuite:
>
> 2013-11-20  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/pr48258-1.c: Skip for little endian.

Okay.  And thanks for the optimization to extract element one for LE.

Thanks, David
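
For readers following the thread, the "subtract from N-1" adjustment mentioned in the quoted patch description can be sketched in C roughly as below. This is only an illustration and is not code from the patch: the function names hw_element_number and lands_in_scalar_position are hypothetical, and the second function is a simplified model that assumes the scalar register equivalence applies when the requested element maps to the register's element 0 in big-endian numbering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not taken from the patch: map a source-level
   vector element index to the element number in big-endian register
   numbering.  For little endian this is the "subtract from N-1"
   adjustment, where n_elts is N.  */
int hw_element_number(int elt, int n_elts, bool little_endian)
{
    assert(n_elts > 0 && elt >= 0 && elt < n_elts);
    return little_endian ? n_elts - 1 - elt : elt;
}

/* Simplified model (an assumption, not the patch's logic): the extract
   can use the scalar register equivalence when the element maps to
   register element 0.  */
bool lands_in_scalar_position(int elt, int n_elts, bool little_endian)
{
    return hw_element_number(elt, n_elts, little_endian) == 0;
}
```

Under this model, element zero of a two-element vector (V2DI or V2DF) is in the scalar position for big endian, and element one is in the scalar position for little endian. Element zero of a four-element vector (V4SF) is not in the scalar position for little endian, which is consistent with the extra xxsldwi instructions reported for pr48258-1.c.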