https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118529

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.

#3  0x0000000001acc725 in vectorizable_store (vinfo=0x356c2a0,
    stmt_info=0x36f91c0, gsi=0x7fffffffce70, vec_stmt=0x7fffffffcd28,
    slp_node=0x36220f0, cost_vec=0x0)
    at /home/rguenther/src/gcc/gcc/tree-vect-stmts.cc:9734
9734                      vec_oprnd = vec_oprnds[i];
(gdb) p vec_num
$1 = 4
(gdb) p vec_oprnds.m_vec->m_vecpfx
$3 = {m_alloc = 4, m_using_auto_storage = 0, m_num = 1}
(gdb) p debug_generic_expr (vectype)
vector(2) int
(gdb) p memory_access_type
$5 = VMAT_CONTIGUOUS
(gdb) p debug (slp_node)
t.c:4:6: note: node 0x36220f0 (max_nunits=8, refcnt=1) vector(2) int
t.c:4:6: note: op template: MEM[(int *)&d] = _28;
t.c:4:6: note:  stmt 0 MEM[(int *)&d] = _28;
t.c:4:6: note:  stmt 1 MEM[(int *)&d + 4B] = _40;
t.c:4:6: note:  stmt 2 MEM[(int *)&d + 8B] = _52;
t.c:4:6: note:  stmt 3 MEM[(int *)&d + 12B] = _64;
t.c:4:6: note:  stmt 4 MEM[(int *)&d + 16B] = _76;
t.c:4:6: note:  stmt 5 MEM[(int *)&d + 20B] = _88;
t.c:4:6: note:  stmt 6 MEM[(int *)&d + 24B] = _100;
t.c:4:6: note:  stmt 7 MEM[(int *)&d + 28B] = _112;
t.c:4:6: note:  children 0x3622180
t.c:4:6: note: node 0x3622180 (max_nunits=8, refcnt=1) vector(2) int
t.c:4:6: note: op template: patt_9 = patt_15 ? 0 : _20;
t.c:4:6: note:  stmt 0 patt_9 = patt_15 ? 0 : _20;
t.c:4:6: note:  stmt 1 patt_21 = patt_12 ? 1 : _38;
t.c:4:6: note:  stmt 2 patt_31 = patt_30 ? 2 : _50;
t.c:4:6: note:  stmt 3 patt_36 = patt_35 ? 3 : _62;
t.c:4:6: note:  stmt 4 patt_39 = patt_37 ? 4 : _74;
t.c:4:6: note:  stmt 5 patt_43 = patt_42 ? 5 : _86;
t.c:4:6: note:  stmt 6 patt_48 = patt_47 ? 6 : _98;
t.c:4:6: note:  stmt 7 patt_51 = patt_49 ? 7 : _110;
t.c:4:6: note:  children 0x3622210 0x36223c0 0x3622450

and

t.c:4:6: note: node 0x3622210 (max_nunits=8, refcnt=1) vector(8) <signed-boolean:1>
t.c:4:6: note: op template: patt_15 = _24 != 0;
t.c:4:6: note:  stmt 0 patt_15 = _24 != 0;
t.c:4:6: note:  stmt 1 patt_12 = _24 != 0;
t.c:4:6: note:  stmt 2 patt_30 = _24 != 0;
t.c:4:6: note:  stmt 3 patt_35 = _24 != 0;
t.c:4:6: note:  stmt 4 patt_37 = _24 != 0;
t.c:4:6: note:  stmt 5 patt_42 = _24 != 0;
t.c:4:6: note:  stmt 6 patt_47 = _24 != 0;
t.c:4:6: note:  stmt 7 patt_49 = _24 != 0;
t.c:4:6: note:  children 0x36222a0 0x3622330

has only 1 vector def. Note how it has a vector(8) <signed-boolean:1>
vector type, one that for sure does not inter-operate with v2si. That
boolean vector has DImode. Oddly enough SPARC does

/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */

static opt_machine_mode
sparc_get_mask_mode (machine_mode)
{
  return Pmode;
}

This means SPARC uses a 1-bit element mask with Pmode.

On the vectorizer side we could deal with this by properly checking for
compatible "ncopies" in vectorizable_condition. The following mitigates
the bad mask_mode for this case:

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b5dd1a2e40f..833029fcb00 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12676,8 +12676,9 @@ vectorizable_condition (vec_info *vinfo,

   masked = !COMPARISON_CLASS_P (cond_expr);
   vec_cmp_type = truth_type_for (comp_vectype);
-
-  if (vec_cmp_type == NULL_TREE)
+  if (vec_cmp_type == NULL_TREE
+      || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype),
+                  TYPE_VECTOR_SUBPARTS (vec_cmp_type)))
     return false;

   cond_code = TREE_CODE (cond_expr);

OTOH the initial choice of mask mode for the compare by the vectorizer
is a bit odd. We get there from vect_recog_bool_pattern handling

  _28 = _24 ? 0 : _20;

which builds

  patt_15 = _24 != 0;

but assigns that a vector truth type based on '_24' rather than on '_28'
which is where it's going to be used. Using get_mask_type_for_scalar_type
is also more likely off.
Note the pattern stmt is problematic in any case since the bool _24 will get you an incompatible vector type as well. So in the end the above vectorizable_condition change is required anyway. This is the minimal fix I'm going with at this point.
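
For illustration only, here is a toy model of the mismatch, not GCC code. It assumes plain fixed lane counts instead of GCC's poly_uint64, and the names num_vector_defs and cond_types_compatible are made up for this sketch. It shows why the 8-lane SLP group needs 4 vector defs as vector(2) int but only 1 as vector(8) <signed-boolean:1>, and what a lane-count check in the spirit of the maybe_ne test above would reject:

```cpp
#include <cassert>

// Hypothetical helper: number of vector defs needed to cover group_size
// scalar lanes with vectors of nunits lanes each (rounded up).
constexpr unsigned num_vector_defs(unsigned group_size, unsigned nunits) {
  return (group_size + nunits - 1) / nunits;
}

// Hypothetical helper modelled on the added check: the value vector type
// and the comparison (mask) vector type must have the same lane count.
// GCC compares poly_uint64 values with maybe_ne; this sketch assumes
// fixed-length vectors, so a plain equality test is enough.
constexpr bool cond_types_compatible(unsigned vectype_nunits,
                                     unsigned cmp_type_nunits) {
  return vectype_nunits == cmp_type_nunits;
}

// Lane counts taken from the dump: 8 scalar stmts per SLP node,
// vector(2) int for the values, vector(8) <signed-boolean:1> for the mask.
constexpr unsigned group_size = 8;
constexpr unsigned v2si_nunits = 2;
constexpr unsigned mask_nunits = 8;

// The store expects 4 vector operands (vec_num = 4) ...
static_assert(num_vector_defs(group_size, v2si_nunits) == 4, "v2si defs");
// ... but the mask node produces a single vector def (m_num = 1).
static_assert(num_vector_defs(group_size, mask_nunits) == 1, "mask defs");
// A lane-count check rejects this combination up front.
static_assert(!cond_types_compatible(v2si_nunits, mask_nunits), "mismatch");
```

With matching lane counts (for example 2 and 2) the toy check passes, which corresponds to vectorizable_condition continuing as before.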