https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118529

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.

#3  0x0000000001acc725 in vectorizable_store (vinfo=0x356c2a0, 
    stmt_info=0x36f91c0, gsi=0x7fffffffce70, vec_stmt=0x7fffffffcd28, 
    slp_node=0x36220f0, cost_vec=0x0)
    at /home/rguenther/src/gcc/gcc/tree-vect-stmts.cc:9734
9734                    vec_oprnd = vec_oprnds[i];
(gdb) p vec_num
$1 = 4
(gdb) p vec_oprnds.m_vec->m_vecpfx
$3 = {m_alloc = 4, m_using_auto_storage = 0, m_num = 1}
(gdb) p debug_generic_expr (vectype)
vector(2) int
(gdb) p memory_access_type
$5 = VMAT_CONTIGUOUS
(gdb) p debug (slp_node)
t.c:4:6: note: node 0x36220f0 (max_nunits=8, refcnt=1) vector(2) int
t.c:4:6: note: op template: MEM[(int *)&d] = _28;
t.c:4:6: note:  stmt 0 MEM[(int *)&d] = _28;
t.c:4:6: note:  stmt 1 MEM[(int *)&d + 4B] = _40;
t.c:4:6: note:  stmt 2 MEM[(int *)&d + 8B] = _52;
t.c:4:6: note:  stmt 3 MEM[(int *)&d + 12B] = _64;
t.c:4:6: note:  stmt 4 MEM[(int *)&d + 16B] = _76;
t.c:4:6: note:  stmt 5 MEM[(int *)&d + 20B] = _88;
t.c:4:6: note:  stmt 6 MEM[(int *)&d + 24B] = _100;
t.c:4:6: note:  stmt 7 MEM[(int *)&d + 28B] = _112;
t.c:4:6: note:  children 0x3622180

t.c:4:6: note: node 0x3622180 (max_nunits=8, refcnt=1) vector(2) int
t.c:4:6: note: op template: patt_9 = patt_15 ? 0 : _20;
t.c:4:6: note:  stmt 0 patt_9 = patt_15 ? 0 : _20;
t.c:4:6: note:  stmt 1 patt_21 = patt_12 ? 1 : _38;
t.c:4:6: note:  stmt 2 patt_31 = patt_30 ? 2 : _50;
t.c:4:6: note:  stmt 3 patt_36 = patt_35 ? 3 : _62;
t.c:4:6: note:  stmt 4 patt_39 = patt_37 ? 4 : _74;
t.c:4:6: note:  stmt 5 patt_43 = patt_42 ? 5 : _86;
t.c:4:6: note:  stmt 6 patt_48 = patt_47 ? 6 : _98;
t.c:4:6: note:  stmt 7 patt_51 = patt_49 ? 7 : _110;
t.c:4:6: note:  children 0x3622210 0x36223c0 0x3622450

and

t.c:4:6: note: node 0x3622210 (max_nunits=8, refcnt=1) vector(8) <signed-boolean:1>
t.c:4:6: note: op template: patt_15 = _24 != 0;
t.c:4:6: note:  stmt 0 patt_15 = _24 != 0;
t.c:4:6: note:  stmt 1 patt_12 = _24 != 0;
t.c:4:6: note:  stmt 2 patt_30 = _24 != 0;
t.c:4:6: note:  stmt 3 patt_35 = _24 != 0;
t.c:4:6: note:  stmt 4 patt_37 = _24 != 0;
t.c:4:6: note:  stmt 5 patt_42 = _24 != 0;
t.c:4:6: note:  stmt 6 patt_47 = _24 != 0;
t.c:4:6: note:  stmt 7 patt_49 = _24 != 0;
t.c:4:6: note:  children 0x36222a0 0x3622330

has only 1 vector def.  Note that its vector type is
vector(8) <signed-boolean:1>, which certainly does not inter-operate
with v2si.  That boolean vector has DImode.
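
To spell the mismatch out in numbers: the SLP group has 8 lanes, so a
vector(2) int node needs 4 vector defs, while a vector(8) mask node needs
only 1.  A minimal standalone sketch of that arithmetic follows;
vectors_needed is a hypothetical helper for illustration, not a GCC function.

```cpp
#include <cassert>

// Hypothetical helper (not part of GCC): number of vector defs needed to
// cover `lanes` scalar lanes with vectors holding `nunits` elements each.
static unsigned vectors_needed(unsigned lanes, unsigned nunits)
{
    assert(nunits != 0);
    // Round up so a partially filled last vector still counts.
    return (lanes + nunits - 1) / nunits;
}
```

For the nodes above, vectors_needed (8, 2) gives 4 and vectors_needed (8, 8)
gives 1, matching vec_num = 4 and m_num = 1 in the gdb session, which is why
indexing vec_oprnds[i] for i > 0 goes out of bounds.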

Oddly enough SPARC does

/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */

static opt_machine_mode
sparc_get_mask_mode (machine_mode)
{
  return Pmode;
}

This means SPARC uses masks with 1-bit elements in Pmode.


On the vectorizer side we could deal with this by properly checking for
compatible "ncopies" in vectorizable_condition.  The following mitigates
the bad mask_mode for this case:

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b5dd1a2e40f..833029fcb00 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12676,8 +12676,9 @@ vectorizable_condition (vec_info *vinfo,

   masked = !COMPARISON_CLASS_P (cond_expr);
   vec_cmp_type = truth_type_for (comp_vectype);
-
-  if (vec_cmp_type == NULL_TREE)
+  if (vec_cmp_type == NULL_TREE
+      || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype),
+                  TYPE_VECTOR_SUBPARTS (vec_cmp_type)))
     return false;

   cond_code = TREE_CODE (cond_expr);
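
The effect of the added guard can be modeled outside GCC as follows.  This
is only a sketch under simplifying assumptions: VecType and
condition_vectorizable are hypothetical stand-ins, and lane counts are plain
integers here, whereas TYPE_VECTOR_SUBPARTS in GCC yields a poly_uint64 that
is compared with maybe_ne.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type; only the lane count is modeled.
struct VecType
{
    unsigned nunits;
};

// Models the patched early-out: reject the condition when there is no mask
// type or when the mask and data vectors have different lane counts.
static bool condition_vectorizable(const VecType *vectype,
                                   const VecType *vec_cmp_type)
{
    if (vec_cmp_type == nullptr || vectype->nunits != vec_cmp_type->nunits)
        return false;
    return true;
}
```

With a 2-lane data vector and an 8-lane mask, as in the dump above, this
model returns false, so the mismatched node would not be vectorized.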


OTOH the initial choice of mask mode for the compare by the vectorizer
is a bit odd.  We get there from vect_recog_bool_pattern handling

  _28 = _24 ? 0 : _20;

which builds

  patt_15 = _24 != 0;

but assigns it a vector truth type based on '_24' rather than on '_28',
which is where it is going to be used.  Using get_mask_type_for_scalar_type
is also more likely to be off.

Note the pattern stmt is problematic in any case since the bool _24 will
get you an incompatible vector type as well.  So in the end the above
vectorizable_condition change is required anyway.

This is the minimal fix I'm going with at this point.
