"Yangfei (Felix)" <[email protected]> writes:
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> index e050db1a2e4..ea39fcac0e0 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> @@ -1,6 +1,7 @@
> /* { dg-do compile } */
> /* { dg-additional-options "-O3" } */
> /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { target aarch64*-*-* } } */
>
> typedef struct {
> unsigned short mprr_2[5][16][16];
This test is useful for Advanced SIMD too, so I think we should continue
to test it with whatever options the person running the testsuite chose.
Instead we could duplicate the test in gcc.target/aarch64/sve with
appropriate options.
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index eb8288e7a85..b30a7d8a3bb 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
> {
> poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
> ? vf * DR_GROUP_SIZE (stmt_info) : vf);
> - possible_npeel_number
> - = vect_get_num_vectors (nscalars, vectype);
> + if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> + possible_npeel_number = 0;
> + else
> + possible_npeel_number
> + = vect_get_num_vectors (nscalars, vectype);
>
> /* NPEEL_TMP is 0 when there is no misalignment, but also
> allow peeling NELEMENTS. */
OK, so this is coming from:
  int s[16][2];
  …
  … = s[j][1];
and an SLP node containing 16 instances of “s[j][1]”. The DR_GROUP_SIZE
is 2 because that's the inner dimension of “s”.
I don't think maybe_lt is right here though. The same problem could in
principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
e.g. for different inner dimensions of “s”.
I think the testcase shows that using DR_GROUP_SIZE in this calculation
is flawed. I'm not sure whether we can really do better given the current
representation though. This is one case where having a separate dr_vec_info
per SLP node would help.
Maybe one option (for now) would be to use:
  if (multiple_p (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
    possible_npeel_number = vect_get_num_vectors (nscalars, vectype);
  else
    /* This isn't a simple stream of contiguous vector accesses.  It's hard
       to predict from the available information how many vector accesses
       we'll actually need per iteration, so be conservative and assume
       one.  */
    possible_npeel_number = 1;
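To illustrate the difference from the less-than test, here is a simplified
standalone sketch that uses plain constants instead of poly_uint64.  The
helper names (sketch_num_vectors, sketch_possible_npeel_number) are
hypothetical, and the sketch assumes that vect_get_num_vectors requires
the scalar count to divide exactly by the number of vector elements:

```c
#include <assert.h>

/* Simplified stand-in for vect_get_num_vectors, assuming it requires
   NSCALARS to be an exact multiple of NUNITS (the vector element count).  */
unsigned int
sketch_num_vectors (unsigned int nscalars, unsigned int nunits)
{
  assert (nunits != 0 && nscalars % nunits == 0);
  return nscalars / nunits;
}

/* Hypothetical helper mirroring the suggestion above for constant sizes:
   only divide when the division is exact, otherwise conservatively
   assume a single vector access per iteration.  */
unsigned int
sketch_possible_npeel_number (unsigned int nscalars, unsigned int nunits)
{
  assert (nunits != 0);
  if (nscalars % nunits == 0)
    return sketch_num_vectors (nscalars, nunits);
  return 1;
}
```

With nscalars = 6 and nunits = 4, a less-than test alone would still reach
the division, which is the NSCALARS > TYPE_VECTOR_SUBPARTS case mentioned
above; the multiple check falls back to 1 for that input as well as for
nscalars = 2.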
BTW, I'm not sure whether the current choice of STMT_SLP_TYPE (stmt_info)
instead of PURE_SLP_STMT (stmt_info) is optimal or not. It means that for
hybrid SLP we base the peeling on the SLP stmt rather than the non-SLP stmt.
I guess hybrid SLP is going away soon though, so let's not worry about
that. :-)
Maybe Richard has a better suggestion.
Thanks,
Richard