When only a loop mask is to be supplied for the inbranch argument of a
simd function, we fail to handle integer mode masks correctly: we
need to guess the number of elements each mask represents.  The guess
assumes that all excess arguments are masks; I wasn't able to create
a simdclone with more than one integer mode mask argument.
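
As a rough illustration of the guess (not the GCC code itself), the
standalone sketch below divides the clone's simdlen by the number of
excess arguments.  The struct and function names are hypothetical
stand-ins; only simdlen, nargs and the exact division come from the
patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the two simd clone fields the patch reads;
// this is not the real GCC data structure.
struct clone_sketch {
  unsigned simdlen;  // lanes handled by one call of the clone
  unsigned nargs;    // all clone arguments, mask arguments included
};

// Guess the lanes covered by each integer mode mask argument.
// Assumption (as in the patch): every clone argument beyond the
// call's own arguments is a mask.
unsigned lanes_per_mask(const clone_sketch &clone, unsigned call_nargs) {
  assert(clone.nargs > call_nargs);         // at least one mask argument
  unsigned nmasks = clone.nargs - call_nargs;
  assert(clone.simdlen % nmasks == 0);      // plays the role of exact_div
  return clone.simdlen / nmasks;
}
```

With simdlen 16 and a single excess argument, the sketch yields 16
lanes for that mask; with two excess arguments it would yield 8 each.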

The gcc.dg/vect/vect-simd-clone-20.c testcase exercises this with -mavx512vl.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

        PR tree-optimization/115867
        * tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
        guess the number of mask elements for integer mode masks.
---
 gcc/tree-vect-stmts.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5c9f2329ad3..712399e46a3 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4753,7 +4753,12 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
                      SIMD_CLONE_ARG_TYPE_MASK);
 
          tree masktype = bestn->simdclone->args[mask_i].vector_type;
-         callee_nelements = TYPE_VECTOR_SUBPARTS (masktype);
+         if (SCALAR_INT_MODE_P (bestn->simdclone->mask_mode))
+           /* Guess the number of lanes represented by masktype.  */
+           callee_nelements = exact_div (bestn->simdclone->simdlen,
+                                         bestn->simdclone->nargs - nargs);
+         else
+           callee_nelements = TYPE_VECTOR_SUBPARTS (masktype);
          o = vector_unroll_factor (nunits, callee_nelements);
          for (m = j * o; m < (j + 1) * o; m++)
            {
-- 
2.35.3
