Hi,
my core tuning patch has caused a regression in gcc.target/i386/pr61403.c,
which I missed in my testing.  The testcase looks for a blend instruction,
which is no longer output.  The reason is that the loop is now vectorized
with SLP, whereas before my changes the cost model claimed SLP vectorization
was not profitable and the vectorizer disabled it.

I have tested that on Skylake the new code is about 11% faster.  The PR itself
is only about vectorizing the loop.  I am not quite sure what the intention
of the testcase was, but perhaps we can just check for the vectorized sqrt
(rsqrtps), which is output in either case?

Honza

Index: ../../gcc/testsuite/gcc.target/i386/pr61403.c
===================================================================
--- ../../gcc/testsuite/gcc.target/i386/pr61403.c       (revision 253935)
+++ ../../gcc/testsuite/gcc.target/i386/pr61403.c       (working copy)
@@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, i
     }
 }

-/* { dg-final { scan-assembler "blend" } } */
+/* { dg-final { scan-assembler "rsqrtps" } } */
