On 11/5/24 07:39, Richard Biener wrote:
On Tue, 5 Nov 2024, Victor Do Nascimento wrote:

The current codegen that supports VFs which are multiples of a simdclone's
simdlen relies on BIT_FIELD_REF to create multiple input vectors.  This does
not work for simdclones with a non-constant simdlen, so we should disable using
such clones when the VF is a multiple of the non-constant simdlen until we
change the codegen to support those.

ISTR BIT_FIELD_REF now uses poly-int offset and size so what breaks
here?  I don't see any way other than such BIT_FIELD_REFs to represent
hi/lo part accesses.

Thanks for the feedback.

You're absolutely right that BIT_FIELD_REF now uses poly-int offset and
size, so the reason given for needing the

!n->simdclone->simdlen.is_constant () && num_calls != 1

guard is no longer accurate and I missed it.
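
To make the effect of that guard concrete, here is a rough standalone
sketch.  It is not GCC code: `ToyPoly', `toy_constant_multiple_p' and
`toy_reject_clone' are hypothetical stand-ins for poly_uint64,
constant_multiple_p and the two relevant conditions in
vectorizable_simd_clone_call, and the inbranch and nargs checks are left
out.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a poly-int with a single indeterminate:
// value = c0 + c1 * x, where x >= 0 is only known at run time.
struct ToyPoly {
  std::uint64_t c0;
  std::uint64_t c1;
  bool is_constant() const { return c1 == 0; }
};

// Simplified analogue of constant_multiple_p: returns true if a == k * b
// for a single compile-time constant k that holds for every x, and stores
// k in *mult.  Assumption of this sketch: b.c0 is non-zero.
bool toy_constant_multiple_p(ToyPoly a, ToyPoly b, std::uint64_t *mult) {
  assert(b.c0 != 0);
  if (a.c0 % b.c0 != 0)
    return false;
  std::uint64_t k = a.c0 / b.c0;
  if (a.c1 != k * b.c1)
    return false;
  *mult = k;
  return true;
}

// Models only the multiple check plus the guard added by the patch:
// returns true if the clone would be skipped.
bool toy_reject_clone(ToyPoly vf_times_group, ToyPoly simdlen) {
  std::uint64_t num_calls = 0;
  if (!toy_constant_multiple_p(vf_times_group, simdlen, &num_calls))
    return true;
  return !simdlen.is_constant() && num_calls != 1;
}
```

With a VF of [16, 16] and a simdlen of [4, 4] the multiple is 4, so the
guard skips the clone; with both at [4, 4] a single call suffices and the
clone stays a candidate.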

My initial investigation highlights the following:

At present, enabling multiple calls to a simdclone leads to an ICE in
the expand phase, potentially due to inadequate handling of
BIT_FIELD_REFs whose size is a poly-int.

Consider the function:

#pragma GCC target ("+sve")
extern char __attribute__ ((simd, const)) fn3 (int, char);
void test_fn3 (int *a, int *b, char *c, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] = (int) (fn3 (b[i], c[i]) + c[i]);
}

The c[i] that serves as an argument to `fn3' is a BIT_FIELD_REF of the
c[i] in the sum, expressed as:

   BIT_FIELD_REF <vect__c_i, POLY_INT_CST [32, 32], 0>

In `get_inner_reference', the POLY_INT_CST [32, 32] `size_tree' fails
the `tree_fits_uhwi_p' test, setting mode to BLKmode.
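
As a rough illustration of that step (a standalone sketch, not the real
`get_inner_reference' logic, which handles many more cases): the names
`ToySize', `toy_fits_uhwi', `toy_mode_for_size' and `toy_bits_at' are
hypothetical, and the only behaviour modelled is "a size with a
run-time component does not fit a host integer, so the mode falls back
to BLKmode".

```cpp
#include <cstdint>

// Toy model of a size in bits: c0 + c1 * x, with x >= 0 known only at
// run time.  POLY_INT_CST [32, 32] corresponds to {32, 32}.
struct ToySize {
  std::uint64_t c0;
  std::uint64_t c1;
};

enum ToyMode { TOY_FIXED_SIZE_MODE, TOY_BLKMODE };

// Analogue of a tree_fits_uhwi_p-style test: only a size without a
// run-time component can be read as one unsigned host-wide integer.
bool toy_fits_uhwi(ToySize s) { return s.c1 == 0; }

// Assumed simplification of the behaviour described above: if the size
// test fails, the access is treated as BLKmode.
ToyMode toy_mode_for_size(ToySize s) {
  return toy_fits_uhwi(s) ? TOY_FIXED_SIZE_MODE : TOY_BLKMODE;
}

// The size does have a definite value once x is known at run time.
std::uint64_t toy_bits_at(ToySize s, std::uint64_t x) {
  return s.c0 + s.c1 * x;
}
```

The point of the sketch is that {32, 32} is a perfectly well-defined
size for any given x (64 bits for x == 1, for instance), yet a check
that only accepts compile-time constants classifies it as BLKmode.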

This currently leads to an issue in `load_register_parameters',
where the `else if (TYPE_MODE (type) == BLKmode)' branch is never
entered because TYPE_MODE (TREE_TYPE (tree_value)) is E_VNx4QImode.

Consequently, a standard `emit_move_insn (reg, args[i].value)' call is
issued and a `BLKmode == E_VNx4QImode' assert fails there.

We need to ensure the appropriate path is taken in the
`load_register_parameters' call for the poly-int sized BIT_FIELD_REFs.

We are looking at remedying the issue, thus fully enabling the feature.

Many thanks,
Victor.


Richard.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_simd_clone_call): Reject simdclones
        with non-constant simdlen when VF is not exactly the same.
---
  gcc/tree-vect-stmts.cc | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 2d0da6f0a0e..961421fee25 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4149,7 +4149,10 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
        if (!constant_multiple_p (vf * group_size, n->simdclone->simdlen,
                                  &num_calls)
            || (!n->simdclone->inbranch && (masked_call_offset > 0))
-           || (nargs != simd_nargs))
+           || (nargs != simd_nargs)
+           /* Currently we do not support multiple calls of non-constant
+              simdlen as poly vectors can not be accessed by BIT_FIELD_REF.  */
+           || (!n->simdclone->simdlen.is_constant () && num_calls != 1))
          continue;
        if (num_calls != 1)
          this_badness += floor_log2 (num_calls) * 4096;

