When only a loop mask is to be supplied for the inbranch argument of a simd function, we fail to handle integer mode masks correctly: we need to guess the number of elements each mask represents. This assumes that excess arguments are all for masks; I wasn't able to create a simdclone with more than one integer mode mask argument.
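
To illustrate the arithmetic, here is a minimal standalone sketch of the guess. It is not GCC code: CloneInfo, its fields, and mask_lanes are hypothetical stand-ins for the simdclone fields the patch reads, and it assumes, as stated above, that every clone argument beyond those of the scalar call is a mask.

```cpp
#include <cassert>

// Hypothetical stand-in for the simd clone fields used by the guess.
struct CloneInfo {
  unsigned simdlen;        // lanes processed by one call to the clone
  unsigned nargs;          // clone arguments, trailing mask arguments included
  bool int_mode_mask;      // mask passed in a scalar integer mode
  unsigned mask_subparts;  // element count of the mask type when it is a vector
};

// Number of lanes covered by a single mask argument.
// call_nargs is the argument count of the scalar call (an assumption of
// this sketch: all excess clone arguments are masks).
inline unsigned mask_lanes(const CloneInfo &c, unsigned call_nargs) {
  if (!c.int_mode_mask)
    return c.mask_subparts;  // vector masks carry their own lane count
  unsigned num_masks = c.nargs - call_nargs;
  // Mirror the exact division in the patch: the lanes must split evenly.
  assert(num_masks > 0 && c.simdlen % num_masks == 0);
  return c.simdlen / num_masks;
}
```

With simdlen 16 and a single integer mode mask, the guess is 16 lanes. A clone with two such masks would give 8 lanes each, but that case is purely hypothetical since no such clone could be produced.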

The gcc.dg/vect/vect-simd-clone-20.c exercises this with -mavx512vl

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

	PR tree-optimization/115867
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
	guess the number of mask elements for integer mode masks.
---
 gcc/tree-vect-stmts.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5c9f2329ad3..712399e46a3 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4753,7 +4753,12 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 			      SIMD_CLONE_ARG_TYPE_MASK);
 
 		  tree masktype = bestn->simdclone->args[mask_i].vector_type;
-		  callee_nelements = TYPE_VECTOR_SUBPARTS (masktype);
+		  if (SCALAR_INT_MODE_P (bestn->simdclone->mask_mode))
+		    /* Guess the number of lanes represented by masktype. */
+		    callee_nelements = exact_div (bestn->simdclone->simdlen,
+						  bestn->simdclone->nargs - nargs);
+		  else
+		    callee_nelements = TYPE_VECTOR_SUBPARTS (masktype);
 		  o = vector_unroll_factor (nunits, callee_nelements);
 		  for (m = j * o; m < (j + 1) * o; m++)
 		    {
-- 
2.35.3