On 11/5/24 07:39, Richard Biener wrote:
On Tue, 5 Nov 2024, Victor Do Nascimento wrote:
The current codegen code to support VFs that are multiples of a simdclone
simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not
work for non-constant simdclones, so we should disable using such clones when
the VF is a multiple of the non-constant simdlen until we change the codegen to
support those.
ISTR BIT_FIELD_REF now uses poly-int offset and size, so what breaks
here? I don't see any way other than such BIT_FIELD_REFs to represent
hi/lo part accesses.
Thanks for the feedback.
You're absolutely right that BIT_FIELD_REF now uses poly-int, so
the reason given for the need of the
!n->simdclone->simdlen.is_constant () && num_calls != 1
guard is no longer accurate; I had missed that.
My initial investigation highlights the following:
At present, enabling multiple calls to a simdclone leads to an ICE
during expand, potentially due to inadequate handling of
BIT_FIELD_REFs with poly-int sizes.
Consider the function:
#pragma GCC target ("+sve")
extern char __attribute__ ((simd, const)) fn3 (int, char);
void test_fn3 (int *a, int *b, char *c, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] = (int) (fn3 (b[i], c[i]) + c[i]);
}
The c[i] that serves as an argument to `fn3' is a BIT_FIELD_REF of the
c[i] in the sum, expressed as:
BIT_FIELD_REF <vect__c_i, POLY_INT_CST [32, 32], 0>
In `get_inner_reference', the POLY_INT_CST [32, 32] `size_tree' fails
the `tree_fits_uhwi_p' test, setting mode to BLKmode.
This currently leads to an issue in `load_register_parameters',
where the `else if (TYPE_MODE (type) == BLKmode)' branch is never
entered, as TYPE_MODE (TREE_TYPE (tree_value)) is E_VNx4QImode.
Consequently, a standard `emit_move_insn (reg, args[i].value)' call is
issued and an assert there fails because the two modes differ
(BLKmode vs. E_VNx4QImode).
We need to ensure the appropriate path is taken in the
`load_register_parameters' call for the poly-int sized BIT_FIELD_REFs.
We are looking at remedying the issue, thus fully enabling the feature.
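
To illustrate why a size such as POLY_INT_CST [32, 32] cannot pass a
"fits in an unsigned HWI" test, here is a minimal standalone sketch. It
is a toy model only: `ToyPoly', `toy_fits_uhwi', `toy_mode_for_size' and
`ToyMode' are hypothetical names, not GCC's poly_int API, and the mode
choice is a heavy simplification of what `get_inner_reference' does.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient poly-int: value = c0 + c1 * x,
// where x is only known at run time (hypothetical, not GCC's poly_int).
struct ToyPoly {
  std::uint64_t c0;
  std::uint64_t c1;
};

// Analogue of a "fits in an unsigned HWI" test: the value is a single
// compile-time integer only when the runtime-dependent coefficient is 0.
constexpr bool toy_fits_uhwi (ToyPoly p) { return p.c1 == 0; }

// Simplified outcome of the mode selection: Blk models BLKmode, Other
// models "some mode derived from a known constant size".
enum class ToyMode { Blk, Other };

// A size that is not a compile-time constant falls back to Blk.
constexpr ToyMode toy_mode_for_size (ToyPoly size_in_bits)
{
  return toy_fits_uhwi (size_in_bits) ? ToyMode::Other : ToyMode::Blk;
}

// Small self-check of the two cases discussed in the thread.
inline void toy_examples ()
{
  // [32, 32] means 32 + 32 * x bits: not a compile-time constant.
  assert (toy_mode_for_size (ToyPoly{32, 32}) == ToyMode::Blk);
  // A fixed 128-bit size has no runtime component.
  assert (toy_mode_for_size (ToyPoly{128, 0}) == ToyMode::Other);
}
```

In this model the BLKmode result comes purely from the size being
runtime-dependent, which is the mismatch that later trips the move.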
Many thanks,
Victor.
Richard.
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Reject simdclones
with non-constant simdlen when VF is not exactly the same.
---
gcc/tree-vect-stmts.cc | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 2d0da6f0a0e..961421fee25 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4149,7 +4149,10 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
         if (!constant_multiple_p (vf * group_size, n->simdclone->simdlen,
                                   &num_calls)
             || (!n->simdclone->inbranch && (masked_call_offset > 0))
-            || (nargs != simd_nargs))
+            || (nargs != simd_nargs)
+            /* Currently we do not support multiple calls of non-constant
+               simdlen as poly vectors can not be accessed by BIT_FIELD_REF. */
+            || (!n->simdclone->simdlen.is_constant () && num_calls != 1))
           continue;
         if (num_calls != 1)
           this_badness += floor_log2 (num_calls) * 4096;
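
For reference, the effect of the added condition can be sketched outside
of GCC as follows. This is a toy model under stated assumptions:
`ToyLen', `toy_constant_multiple_p' and `toy_clone_usable' are
hypothetical names that only imitate the behaviour of the poly-int
helpers used above, and the inbranch and argument-count checks are
deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Toy length of the form c0 + c1 * x, with x unknown until run time
// (hypothetical type, not GCC's poly_uint64).
struct ToyLen {
  std::uint64_t c0;
  std::uint64_t c1;
  bool is_constant () const { return c1 == 0; }
};

// True when a == k * b for one compile-time constant k, for every x.
// On success the multiple is stored in *k.
bool toy_constant_multiple_p (ToyLen a, ToyLen b, std::uint64_t *k)
{
  if (b.c0 == 0 || a.c0 % b.c0 != 0)
    return false;
  std::uint64_t q = a.c0 / b.c0;
  if (a.c1 != q * b.c1)
    return false;
  *k = q;
  return true;
}

// Mirrors only the two conditions discussed in this patch: the VF must
// be a constant multiple of simdlen, and a non-constant simdlen is
// accepted only when exactly one call is needed.
bool toy_clone_usable (ToyLen vf, ToyLen simdlen, std::uint64_t *num_calls)
{
  if (!toy_constant_multiple_p (vf, simdlen, num_calls))
    return false;
  if (!simdlen.is_constant () && *num_calls != 1)
    return false;
  return true;
}

// Small self-check of the cases of interest.
inline void toy_guard_examples ()
{
  std::uint64_t n = 0;
  // VF = 8 + 8x, simdlen = 4 + 4x: two calls, non-constant simdlen.
  assert (!toy_clone_usable (ToyLen{8, 8}, ToyLen{4, 4}, &n));
  // VF equal to the non-constant simdlen: one call, accepted.
  assert (toy_clone_usable (ToyLen{4, 4}, ToyLen{4, 4}, &n));
  // Constant simdlen with two calls is still accepted.
  assert (toy_clone_usable (ToyLen{8, 0}, ToyLen{4, 0}, &n));
}
```

In this model only the first case is affected by the new condition;
constant-simdlen clones keep their existing behaviour.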