https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-11-09
                 CC|                            |rsandifo at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in fact RVV, with its single-bit element mask and the ability to
produce it from a V64QImode unsigned LT compare (but not from V64SImode?),
is supposed to be able to handle the "AVX512" style masking as far as
the checking in vect_verify_full_masking_avx512 is concerned.
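
For reference, the "AVX512" style scheme builds a single integer-mode mask
where bit i answers "is lane i active?", typically from an unsigned LT
compare of a {0, 1, 2, ...} step vector against the number of remaining
scalar iterations.  A minimal scalar model of that (plain C++, not GCC
internals; the 64-lane width is just picked to match the example below):

  #include <cstdint>

  /* Lane i is active iff offset + i < remaining, i.e. an unsigned LT
     compare of a step vector against the remaining iteration count,
     collected into a single-bit-per-lane integer mask.  */
  static uint64_t
  build_loop_mask (uint64_t offset, uint64_t remaining)
  {
    uint64_t mask = 0;
    for (unsigned i = 0; i < 64; ++i)
      if (offset + i < remaining)
        mask |= (uint64_t) 1 << i;
    return mask;
  }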

What I failed to implement (and check) is the requirement that the mask
types have an integer mode, so we run into

      if (known_eq (TYPE_VECTOR_SUBPARTS (rgm->type),
                    TYPE_VECTOR_SUBPARTS (vectype)))
        return rgm->controls[index];

      /* Split the vector if needed.  Since we are dealing with integer mode
         masks with AVX512 we can operate on the integer representation
         performing the whole vector shifting.  */
      unsigned HOST_WIDE_INT factor;
      bool ok = constant_multiple_p (TYPE_VECTOR_SUBPARTS (rgm->type),
                                     TYPE_VECTOR_SUBPARTS (vectype), &factor);
      gcc_assert (ok);
      gcc_assert (GET_MODE_CLASS (TYPE_MODE (rgm->type)) == MODE_INT);

It would be fine if we did not need to split the 64-element mask into
two halves for the V32SImode vector op we need to mask here.

We try to extract the subset of the mask by converting it to a same-size
integer type, right-shifting it, truncating and converting back to the
mask type.  That might or might not be possible with RVV masks (and might
or might not be the "optimal" way to do things).
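
To illustrate, a standalone C++ model of that shift-and-truncate extraction
(not GCC internals; the 64-bit mask and the 32-lane halves are assumptions
matching the V64QI / two-times-V32SI example above):

  #include <cstdint>
  #include <cassert>

  /* Extract sub-mask 'vpart' of a wider integer-mode mask covering
     'nunits' lanes per part: shift the wanted bits down and truncate
     to the narrower mask's width, like vect_get_loop_mask does after
     view-converting the mask to an integer type.  */
  static uint32_t
  extract_submask (uint64_t wide_mask, unsigned nunits, unsigned vpart)
  {
    uint64_t shifted = wide_mask >> (nunits * vpart);
    return (uint32_t) shifted;
  }

  int
  main ()
  {
    /* 64-lane mask with only the low 40 lanes active.  */
    uint64_t mask64 = ~0ULL >> (64 - 40);
    assert (extract_submask (mask64, 32, 0) == 0xffffffffu);
    assert (extract_submask (mask64, 32, 1) == 0xffu);
    return 0;
  }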

We can "fix" this by doing

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..c7a92354578 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11034,24 +11034,24 @@ vect_get_loop_mask (loop_vec_info loop_vinfo,
       bool ok = constant_multiple_p (TYPE_VECTOR_SUBPARTS (rgm->type),
                                     TYPE_VECTOR_SUBPARTS (vectype), &factor);
       gcc_assert (ok);
-      gcc_assert (GET_MODE_CLASS (TYPE_MODE (rgm->type)) == MODE_INT);
       tree mask_type = truth_type_for (vectype);
-      gcc_assert (GET_MODE_CLASS (TYPE_MODE (mask_type)) == MODE_INT);
       unsigned vi = index / factor;
       unsigned vpart = index % factor;
       tree vec = rgm->controls[vi];
       gimple_seq seq = NULL;
       vec = gimple_build (&seq, VIEW_CONVERT_EXPR,
-                         lang_hooks.types.type_for_mode
-                               (TYPE_MODE (rgm->type), 1), vec);
+                         lang_hooks.types.type_for_size
+                           (GET_MODE_BITSIZE (TYPE_MODE (rgm->type))
+                             .to_constant (), 1), vec);
       /* For integer mode masks simply shift the right bits into position.  */
       if (vpart != 0)
        vec = gimple_build (&seq, RSHIFT_EXPR, TREE_TYPE (vec), vec,
                            build_int_cst (integer_type_node,
                                           (TYPE_VECTOR_SUBPARTS (vectype)
                                            * vpart)));
-      vec = gimple_convert (&seq, lang_hooks.types.type_for_mode
-                                   (TYPE_MODE (mask_type), 1), vec);
+      vec = gimple_convert (&seq, lang_hooks.types.type_for_size
+                                   (GET_MODE_BITSIZE (TYPE_MODE (mask_type))
+                                     .to_constant (), 1), vec);
       vec = gimple_build (&seq, VIEW_CONVERT_EXPR, mask_type, vec);
       if (seq)
        gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);

which then generates the "expected" partial vector code.  If you don't
want partial vectors for VLS modes then I guess we could also enhance
the vector_modes "iteration" to allow the target to override
--param vect-partial-vector-usage on a per-mode basis.

Or I can simply not "fix" the code above but instead add an integer-mode
check to vect_verify_full_masking_avx512.  But as said, in principle this
scheme works.  That fix would be

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..0b364ac1c6e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1462,7 +1462,10 @@ vect_verify_full_masking_avx512 (loop_vec_info loop_vinfo)
       if (!mask_type)
        continue;

-      if (TYPE_PRECISION (TREE_TYPE (mask_type)) != 1)
+      /* For now vect_get_loop_mask only supports integer mode masks
+        when we need to split it.  */
+      if (GET_MODE_CLASS (TYPE_MODE (mask_type)) != MODE_INT
+         || TYPE_PRECISION (TREE_TYPE (mask_type)) != 1)
        {
          ok = false;
          break;
