Hi Honza,

That should be fine unless vectorization is done using extract/insert instructions.

Thanks,
Evgeny

On Wed, Oct 25, 2017 at 12:25 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
> Hi,
> my core tuning patch has caused regression gcc.target/i386/pr61403.c which I have
> missed in my testing. The testcase looks for blend instruction which is no longer
> output. The reason is that the loop is now vectorized with SLP while before my
> changes the costmodel claimed SLP vectorization is not good and vectorizer
> disabled it.
>
> I have tested that on skylake, the new code is about 11% faster. The PR itself
> is only about vectorizing the loop. I am not quite sure what was the intention
> of the testcase, but perhaps we can just check that there is vectorized sqrt
> that is output in any case?
>
> Honza
>
> Index: ../../gcc/testsuite/gcc.target/i386/pr61403.c
> ===================================================================
> --- ../../gcc/testsuite/gcc.target/i386/pr61403.c (revision 253935)
> +++ ../../gcc/testsuite/gcc.target/i386/pr61403.c (working copy)
> @@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, i
> }
> }
>
> -/* { dg-final { scan-assembler "blend" } } */
> +/* { dg-final { scan-assembler "rsqrtps" } } */