On Thu, Dec 01, 2022 at 01:35:38PM +0000, Andrew Stubbs wrote:
> > Maybe better add -ffat-lto-objects to dg-additional-options and drop
> > the dg-skip-if (if it works with that, for all similar tests)?
> 
> The tests are already run with -ffat-lto-objects and the test still fails
> (pattern found zero times). I don't know why.
> 
> Aside from that, I've made all the other changes you requested.

Ah, I see what's going on.  You match simdclone, which is matched not just in
the calls (I bet that is what you actually want to count), but also twice
for each simd clone definition (and if somebody has, say, a path to the gcc
tree with simdclone in the name, it could match even more times).

Thus, I think:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16.c
> @@ -0,0 +1,89 @@
> +/* { dg-require-effective-target vect_simd_clones } */
> +/* { dg-additional-options "-fopenmp-simd -fdump-tree-optimized" } */
> +/* { dg-additional-options "-mavx" { target avx_runtime } } */
> +
> +/* Test that simd inbranch clones work correctly.  */
> +
> +#ifndef TYPE
> +#define TYPE int
> +#endif
> +
> +/* A simple function that will be cloned.  */
> +#pragma omp declare simd
> +TYPE __attribute__((noinline))
> +foo (TYPE a)
> +{
> +  return a + 1;
> +}
> +
> +/* Check that "inbranch" clones are called correctly.  */
> +
> +void __attribute__((noinline))

You should use noipa attribute instead of noinline on callers
which aren't declare simd (on declare simd it would prevent cloning
which is essential for the declare simd behavior), so that you don't
get surprises e.g. from extra ipa cp etc.
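
Roughly, as a minimal sketch reusing the names from the patch (only the
attribute on the caller changes; the declare simd function keeps noinline):

```c
/* Sketch only: same shape as the quoted test, with noipa on the caller.  */
#ifndef TYPE
#define TYPE int
#endif

/* The declare simd function keeps noinline, so it can still be cloned.  */
#pragma omp declare simd
TYPE __attribute__((noinline))
foo (TYPE a)
{
  return a + 1;
}

/* The caller is not declare simd, so noipa is fine here: it implies
   noinline and noclone and blocks other interprocedural optimizations.  */
void __attribute__((noipa))
masked (TYPE * __restrict a, TYPE * __restrict b, int size)
{
  #pragma omp simd
  for (int i = 0; i < size; i++)
    b[i] = a[i] < 1 ? foo (a[i]) : a[i];
}
```

The same change would apply to masked_fixed and any other caller that is
not itself declare simd.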

> +masked (TYPE * __restrict a, TYPE * __restrict b, int size)
> +{
> +  #pragma omp simd
> +  for (int i = 0; i < size; i++)
> +    b[i] = a[i]<1 ? foo(a[i]) : a[i];
> +}
> +
> +/* Check that "inbranch" works when there might be unrolling.  */
> +
> +void __attribute__((noinline))

So here too.
> +masked_fixed (TYPE * __restrict a, TYPE * __restrict b)

> +/* Ensure the the in-branch simd clones are used on targets that support
> +   them.  These counts include all call and definitions.  */
> +
> +/* { dg-skip-if "" { x86_64-*-* } { "-flto" } { "" } } */

Drop the line above.

> +/* { dg-final { scan-tree-dump-times "simdclone" 18 "optimized" { target x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "simdclone" 7 "optimized" { target amdgcn-*-* } } } */

And use scan-tree-dump-times " = foo.simdclone" 2 "optimized" instead; I'd
think that should be the right number for all of x86_64, amdgcn and aarch64.
And please don't forget about i?86-*-* too.
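
As a sketch of how the resulting directive might look (listing all three
targets in one selector is an assumption on my side; separate dg-final lines
per target are another option):

```c
/* Sketch only: count the calls to the clones, not their definitions.
   The combined target list is an assumption, not taken from the patch.  */
/* { dg-final { scan-tree-dump-times " = foo.simdclone" 2 "optimized" { target i?86-*-* x86_64-*-* amdgcn-*-* } } } */
```

The leading " = " is what restricts the match to call statements, so clone
definitions and any path names containing simdclone are no longer counted.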

> +/* TODO: aarch64 */

For aarch64, one would need to include it in
check_effective_target_vect_simd_clones first...

Otherwise LGTM.

        Jakub
