Hi Honza,

That should be fine unless vectorization is done using extract/insert
instructions.

Thanks,
Evgeny

On Wed, Oct 25, 2017 at 12:25 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
> Hi,
> my core tuning patch has caused a regression in gcc.target/i386/pr61403.c,
> which I missed in my testing.  The testcase looks for a blend instruction,
> which is no longer output.  The reason is that the loop is now vectorized
> with SLP, while before my changes the cost model claimed SLP vectorization
> was not profitable and the vectorizer disabled it.
>
> I have tested that on Skylake the new code is about 11% faster.  The PR
> itself is only about vectorizing the loop.  I am not quite sure what the
> intention of the testcase was, but perhaps we can just check that a
> vectorized sqrt is output in any case?
>
> Honza
>
> Index: ../../gcc/testsuite/gcc.target/i386/pr61403.c
> ===================================================================
> --- ../../gcc/testsuite/gcc.target/i386/pr61403.c       (revision 253935)
> +++ ../../gcc/testsuite/gcc.target/i386/pr61403.c       (working copy)
> @@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, i
>      }
>  }
>
> -/* { dg-final { scan-assembler "blend" } } */
> +/* { dg-final { scan-assembler "rsqrtps" } } */
>
