> > > is the exit edge you are looking for without iterating over all loop
> > > exits.
> > >
> > > > +              gimple *tmp_vec_stmt = vec_stmt;
> > > > +              tree tmp_vec_lhs = vec_lhs;
> > > > +              tree tmp_bitstart = bitstart;
> > > > +              /* For early exit where the exit is not in the BB that leads
> > > > +                 to the latch then we're restarting the iteration in the
> > > > +                 scalar loop.  So get the first live value.  */
> > > > +              restart_loop = restart_loop || exit_e != main_e;
> > > > +              if (restart_loop)
> > > > +                {
> > > > +                  tmp_vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
> > > > +                  tmp_vec_lhs = gimple_get_lhs (tmp_vec_stmt);
> > > > +                  tmp_bitstart = build_zero_cst (TREE_TYPE (bitstart));
> > >
> > > Hmm, that gets you the value after the first iteration, not the one
> > > before which would be the last value of the preceding vector iteration?
> > > (but we don't keep those, we'd need a PHI)
> >
> > I don't fully follow.  The comment on top of this hunk under if (loop_vinfo)
> > states that lhs should be pointing to a PHI.
> >
> > When I inspect the statement I see
> >
> > i_14 = PHI <i_11(6), 0(14)>
> >
> > so i_14 is the value at the start of the current iteration.  If we're
> > coming from the header 0, otherwise i_11 which is the value of the
> > previous iteration?
> >
> > The peeling code explicitly leaves i_14 in the merge block and not i_11 for
> > this exact reason.
> > So I'm confused, my understanding is that we're already *at* the right PHI.
> >
> > Is it perhaps that you thought we put i_11 here for the early exits?  In
> > which case yes, I'd agree that that would be wrong, and there we would have
> > had to look at the defs, but i_11 is the def.
> >
> > I already kept this in mind and leveraged peeling to make this part easier.
> > i_11 is used in the main exit and i_14 in the early one.
>
> I think the important detail is that this code is only executed for
> vect_induction_defs which are indeed PHIs and so we're sure the
> value live is before any modification so fine to feed as initial
> value for the PHI in the epilog.
>
> Maybe we can assert the def type here?

We can't assert because until cfg cleanup the dead value is still seen and still
vectorized.  That said I've added a guard here.  We vectorize the non-induction
value as normal now and if it's ever used it'll fail.

>
> > >
> > > Why again do we need (non-induction) live values from the vector loop to
> > > the epilogue loop again?
> >
> > They can appear as the result value of the main exit.
> >
> > e.g. in testcase (vect-early-break_17.c)
> >
> > #define N 1024
> > unsigned vect_a[N];
> > unsigned vect_b[N];
> >
> > unsigned test4(unsigned x)
> > {
> >  unsigned ret = 0;
> >  for (int i = 0; i < N; i++)
> >  {
> >    vect_b[i] = x + i;
> >    if (vect_a[i] > x)
> >      return vect_a[i];
> >    vect_a[i] = x;
> >    ret = vect_a[i] + vect_b[i];
> >  }
> >  return ret;
> > }
> >
> > The only situation they can appear in as an early-break is when
> > we have a case where main exit != latch connected exit.
> >
> > However in these cases they are unused, and only there because
> > normally you would have exited (i.e. there was a return) but the
> > vector loop needs to start over so we ignore it.
> >
> > These happen in testcase vect-early-break_74.c and
> > vect-early-break_78.c
>
> Hmm, so in that case their value is incorrect (but doesn't matter,
> we ignore it)?
>

Correct, they're placed there due to exit redirection, but in these inverted
testcases where we've peeled the vector iteration you can't ever skip the
epilogue.  So they are guaranteed not to be used.

> > > > +              gimple_stmt_iterator exit_gsi;
> > > > +              tree new_tree
> > > > +                = vectorizable_live_operation_1 (loop_vinfo, stmt_info,
> > > > +                                                 exit_e, vectype, ncopies,
> > > > +                                                 slp_node, bitsize,
> > > > +                                                 tmp_bitstart, tmp_vec_lhs,
> > > > +                                                 lhs_type, restart_loop,
> > > > +                                                 &exit_gsi);
> > > > +
> > > > +              /* Use the empty block on the exit to materialize the new stmts
> > > > +                 so we can use update the PHI here.  */
> > > > +              if (gimple_phi_num_args (use_stmt) == 1)
> > > > +                {
> > > > +                  auto gsi = gsi_for_stmt (use_stmt);
> > > > +                  remove_phi_node (&gsi, false);
> > > > +                  tree lhs_phi = gimple_phi_result (use_stmt);
> > > > +                  gimple *copy = gimple_build_assign (lhs_phi, new_tree);
> > > > +                  gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
> > > > +                }
> > > > +              else
> > > > +                SET_PHI_ARG_DEF (use_stmt, dest_e->dest_idx, new_tree);
> > >
> > > if the else case works, why not use it always?
> >
> > Because it doesn't work for main exit.  The early exits have an intermediate
> > block that is used to generate the statements on, so for them we are fine
> > updating the use in place.
> >
> > The main exits don't, and so the existing trick the vectorizer uses is to
> > materialize the statements in the same block and then dissolve the phi node.
> > However you can't do that for the early exit because the phi node isn't
> > singular.
>
> But if the PHI has a single arg you can replace that?  By making a
> copy stmt from it don't you break LC SSA?
>

Yeah, what the existing code is sneakily doing is this:

It has to vectorize

x = PHI <y>

y gets vectorized as z, but

x = PHI <z>
z = ...

would be invalid, so what it does, since it doesn't have a predecessor node to
place stuff in, is

z = ...
x = z

and remove the PHI.  The PHI was only placed there for vectorization so it's
not needed after this point.  It's also for this reason that the code passes
around a gimple_seq, since it needs to make sure it gets the order right when
inserting statements.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * tree-vect-loop.cc (vectorizable_live_operation,
        vectorizable_live_operation_1): Support early exits.
        (can_vectorize_live_stmts): Call vectorizable_live_operation for
        non-live inductions or reductions.
        (find_connected_edge, vect_get_vect_def): New.
        (vect_create_epilog_for_reduction): Support reductions in early break.
        * tree-vect-stmts.cc (perm_mask_for_reverse): Expose.
        (vect_stmt_relevant_p): Mark all inductions when early break as being
        live.
        * tree-vectorizer.h (perm_mask_for_reverse): Expose.

--- inline copy of patch ---

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index f38cc47551488525b15c2be758cac8291dbefb3a..4e48217a31e59318c2ea8e5ab63b06ba19840cbd 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3346,6 +3346,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
           bb_before_epilog->count = single_pred_edge (bb_before_epilog)->count ();
           bb_before_epilog = loop_preheader_edge (epilog)->src;
         }
+
       /* If loop is peeled for non-zero constant times, now niters refers to
          orig_niters - prolog_peeling, it won't overflow even the orig_niters
          overflows.  */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index df5e1d28fac2ce35e71decdec0d8e31fb75557f5..2f922b42f6d567dfd5da9b276b1c9d37bc681876 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -5831,6 +5831,34 @@ vect_create_partial_epilog (tree vec_def, tree vectype, code_helper code,
   return new_temp;
 }

+/* Retrieves the defining statement to be used for a reduction.
+   For MAIN_EXIT_P we use the current VEC_STMTs and otherwise we look at
+   the reduction definitions.  */
+
+tree
+vect_get_vect_def (stmt_vec_info reduc_info, slp_tree slp_node,
+                   slp_instance slp_node_instance, bool main_exit_p, unsigned i,
+                   vec <gimple *> &vec_stmts)
+{
+  tree def;
+
+  if (slp_node)
+    {
+      if (!main_exit_p)
+        slp_node = slp_node_instance->reduc_phis;
+      def = vect_get_slp_vect_def (slp_node, i);
+    }
+  else
+    {
+      if (!main_exit_p)
+        reduc_info = STMT_VINFO_REDUC_DEF (vect_orig_stmt (reduc_info));
+      vec_stmts = STMT_VINFO_VEC_STMTS (reduc_info);
+      def = gimple_get_lhs (vec_stmts[0]);
+    }
+
+  return def;
+}
+
 /* Function vect_create_epilog_for_reduction

    Create code at the loop-epilog to finalize the result of a reduction
@@ -5842,6 +5870,8 @@ vect_create_partial_epilog (tree vec_def, tree vectype, code_helper code,
    SLP_NODE_INSTANCE is the SLP node instance containing SLP_NODE
    REDUC_INDEX says which rhs operand of the STMT_INFO is the reduction phi
      (counting from 0)
+   LOOP_EXIT is the edge to update in the merge block.  In the case of a single
+     exit this edge is always the main loop exit.

    This function:
    1. Completes the reduction def-use cycles.
@@ -5882,7 +5912,8 @@ static void
 vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
                                   stmt_vec_info stmt_info,
                                   slp_tree slp_node,
-                                  slp_instance slp_node_instance)
+                                  slp_instance slp_node_instance,
+                                  edge loop_exit)
 {
   stmt_vec_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
   gcc_assert (reduc_info->is_reduc_info);
@@ -5891,6 +5922,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
      loop-closed PHI of the inner loop which we remember as
      def for the reduction PHI generation.  */
   bool double_reduc = false;
+  bool main_exit_p = LOOP_VINFO_IV_EXIT (loop_vinfo) == loop_exit;
   stmt_vec_info rdef_info = stmt_info;
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
     {
@@ -6053,7 +6085,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
       /* Create an induction variable.  */
       gimple_stmt_iterator incr_gsi;
       bool insert_after;
-      standard_iv_increment_position (loop, &incr_gsi, &insert_after);
+      vect_iv_increment_position (loop_exit, &incr_gsi, &insert_after);
       create_iv (series_vect, PLUS_EXPR, vec_step, NULL_TREE, loop, &incr_gsi,
                  insert_after, &indx_before_incr, &indx_after_incr);

@@ -6132,23 +6164,23 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
          Store them in NEW_PHIS.  */
   if (double_reduc)
     loop = outer_loop;
-  exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
+  /* We need to reduce values in all exits.  */
+  exit_bb = loop_exit->dest;
   exit_gsi = gsi_after_labels (exit_bb);
   reduc_inputs.create (slp_node ? vec_num : ncopies);
+  vec <gimple *> vec_stmts;
   for (unsigned i = 0; i < vec_num; i++)
     {
       gimple_seq stmts = NULL;
-      if (slp_node)
-        def = vect_get_slp_vect_def (slp_node, i);
-      else
-        def = gimple_get_lhs (STMT_VINFO_VEC_STMTS (rdef_info)[0]);
+      def = vect_get_vect_def (rdef_info, slp_node, slp_node_instance,
+                               main_exit_p, i, vec_stmts);
       for (j = 0; j < ncopies; j++)
         {
           tree new_def = copy_ssa_name (def);
           phi = create_phi_node (new_def, exit_bb);
           if (j)
-            def = gimple_get_lhs (STMT_VINFO_VEC_STMTS (rdef_info)[j]);
-          SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT (loop_vinfo)->dest_idx, def);
+            def = gimple_get_lhs (vec_stmts[j]);
+          SET_PHI_ARG_DEF (phi, loop_exit->dest_idx, def);
           new_def = gimple_convert (&stmts, vectype, new_def);
           reduc_inputs.quick_push (new_def);
         }
@@ -10481,17 +10513,18 @@ vectorizable_induction (loop_vec_info loop_vinfo,
   return true;
 }

-
 /* Function vectorizable_live_operation_1.
+
    helper function for vectorizable_live_operation.  */
+
 tree
 vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
-                               stmt_vec_info stmt_info, edge exit_e,
+                               stmt_vec_info stmt_info, basic_block exit_bb,
                                tree vectype, int ncopies, slp_tree slp_node,
                                tree bitsize, tree bitstart, tree vec_lhs,
-                               tree lhs_type, gimple_stmt_iterator *exit_gsi)
+                               tree lhs_type, bool restart_loop,
+                               gimple_stmt_iterator *exit_gsi)
 {
-  basic_block exit_bb = exit_e->dest;
   gcc_assert (single_pred_p (exit_bb) || LOOP_VINFO_EARLY_BREAKS (loop_vinfo));

   tree vec_lhs_phi = copy_ssa_name (vec_lhs);
@@ -10504,7 +10537,9 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
   if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
     {
       /* Emit:
+
          SCALAR_RES = VEC_EXTRACT <VEC_LHS, LEN + BIAS - 1>
+
          where VEC_LHS is the vectorized live-out result and MASK is
          the loop mask for the final iteration.  */
       gcc_assert (ncopies == 1 && !slp_node);
@@ -10513,15 +10548,18 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
       tree len = vect_get_loop_len (loop_vinfo, &gsi,
                                     &LOOP_VINFO_LENS (loop_vinfo),
                                     1, vectype, 0, 0);
+
       /* BIAS - 1.  */
       signed char biasval = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
       tree bias_minus_one
         = int_const_binop (MINUS_EXPR,
                            build_int_cst (TREE_TYPE (len), biasval),
                            build_one_cst (TREE_TYPE (len)));
+
       /* LAST_INDEX = LEN + (BIAS - 1).  */
       tree last_index = gimple_build (&stmts, PLUS_EXPR, TREE_TYPE (len),
                                      len, bias_minus_one);
+
       /* This needs to implement extraction of the first index, but not sure
          how the LEN stuff works.  At the moment we shouldn't get here since
          there's no LEN support for early breaks.  But guard this so there's
@@ -10532,13 +10570,16 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
       tree scalar_res
         = gimple_build (&stmts, CFN_VEC_EXTRACT, TREE_TYPE (vectype),
                         vec_lhs_phi, last_index);
+
       /* Convert the extracted vector element to the scalar type.  */
       new_tree = gimple_convert (&stmts, lhs_type, scalar_res);
     }
   else if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
     {
       /* Emit:
+
          SCALAR_RES = EXTRACT_LAST <VEC_LHS, MASK>
+
          where VEC_LHS is the vectorized live-out result and MASK is
          the loop mask for the final iteration.  */
       gcc_assert (!slp_node);
@@ -10548,10 +10589,38 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
       tree mask = vect_get_loop_mask (loop_vinfo, &gsi,
                                       &LOOP_VINFO_MASKS (loop_vinfo),
                                       1, vectype, 0);
+      tree scalar_res;
+
+      /* For an inverted control flow with early breaks we want EXTRACT_FIRST
+         instead of EXTRACT_LAST.  Emulate by reversing the vector and mask.  */
+      if (restart_loop && LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+        {
+          /* First create the permuted mask.  */
+          tree perm_mask = perm_mask_for_reverse (TREE_TYPE (mask));
+          tree perm_dest = copy_ssa_name (mask);
+          gimple *perm_stmt
+                = gimple_build_assign (perm_dest, VEC_PERM_EXPR, mask,
+                                       mask, perm_mask);
+          vect_finish_stmt_generation (loop_vinfo, stmt_info, perm_stmt,
+                                       &gsi);
+          mask = perm_dest;
+
+          /* Then permute the vector contents.  */
+          tree perm_elem = perm_mask_for_reverse (vectype);
+          perm_dest = copy_ssa_name (vec_lhs_phi);
+          perm_stmt
+                = gimple_build_assign (perm_dest, VEC_PERM_EXPR, vec_lhs_phi,
+                                       vec_lhs_phi, perm_elem);
+          vect_finish_stmt_generation (loop_vinfo, stmt_info, perm_stmt,
+                                       &gsi);
+          vec_lhs_phi = perm_dest;
+        }

       gimple_seq_add_seq (&stmts, tem);
-      tree scalar_res = gimple_build (&stmts, CFN_EXTRACT_LAST, scalar_type,
-                                       mask, vec_lhs_phi);
+
+      scalar_res = gimple_build (&stmts, CFN_EXTRACT_LAST, scalar_type,
+                                 mask, vec_lhs_phi);
+
       /* Convert the extracted vector element to the scalar type.  */
       new_tree = gimple_convert (&stmts, lhs_type, scalar_res);
     }
@@ -10564,12 +10633,26 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
       new_tree = force_gimple_operand (fold_convert (lhs_type, new_tree),
                                        &stmts, true, NULL_TREE);
     }
+
   *exit_gsi = gsi_after_labels (exit_bb);
   if (stmts)
     gsi_insert_seq_before (exit_gsi, stmts, GSI_SAME_STMT);
+
   return new_tree;
 }

+/* Find the edge that's the final one in the path from SRC to DEST and
+   return it.  This edge must exist in at most one forwarder edge between.  */
+
+static edge
+find_connected_edge (edge src, basic_block dest)
+{
+   if (src->dest == dest)
+     return src;
+
+  return find_edge (src->dest, dest);
+}
+
 /* Function vectorizable_live_operation.

    STMT_INFO computes a value that is used outside the loop.  Check if
@@ -10590,11 +10673,13 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
   poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
   int ncopies;
   gimple *use_stmt;
+  use_operand_p use_p;
   auto_vec<tree> vec_oprnds;
   int vec_entry = 0;
   poly_uint64 vec_index = 0;

-  gcc_assert (STMT_VINFO_LIVE_P (stmt_info));
+  gcc_assert (STMT_VINFO_LIVE_P (stmt_info)
+              || LOOP_VINFO_EARLY_BREAKS (loop_vinfo));

   /* If a stmt of a reduction is live, vectorize it via
      vect_create_epilog_for_reduction.  vectorizable_reduction assessed
@@ -10619,8 +10704,25 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
       if (STMT_VINFO_REDUC_TYPE (reduc_info) == FOLD_LEFT_REDUCTION
           || STMT_VINFO_REDUC_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION)
         return true;
+
       vect_create_epilog_for_reduction (loop_vinfo, stmt_info, slp_node,
-                                        slp_node_instance);
+                                        slp_node_instance,
+                                        LOOP_VINFO_IV_EXIT (loop_vinfo));
+
+      /* If early break we only have to materialize the reduction on the merge
+         block, but we have to find an alternate exit first.  */
+      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+        {
+          for (auto exit : get_loop_exit_edges (LOOP_VINFO_LOOP (loop_vinfo)))
+            if (exit != LOOP_VINFO_IV_EXIT (loop_vinfo))
+              {
+                vect_create_epilog_for_reduction (loop_vinfo, stmt_info,
+                                                  slp_node, slp_node_instance,
+                                                  exit);
+                break;
+              }
+        }
+
       return true;
     }

@@ -10772,37 +10874,62 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
            lhs' = new_tree;  */

       class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-      basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
-      gcc_assert (single_pred_p (exit_bb));
-
-      tree vec_lhs_phi = copy_ssa_name (vec_lhs);
-      gimple *phi = create_phi_node (vec_lhs_phi, exit_bb);
-      SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT (loop_vinfo)->dest_idx, vec_lhs);
-
-      gimple_stmt_iterator exit_gsi;
-      tree new_tree
-        = vectorizable_live_operation_1 (loop_vinfo, stmt_info,
-                                         LOOP_VINFO_IV_EXIT (loop_vinfo),
-                                         vectype, ncopies, slp_node, bitsize,
-                                         bitstart, vec_lhs, lhs_type,
-                                         &exit_gsi);
-
-      /* Remove existing phis that copy from lhs and create copies
-         from new_tree.  */
-      gimple_stmt_iterator gsi;
-      for (gsi = gsi_start_phis (exit_bb); !gsi_end_p (gsi);)
-        {
-          gimple *phi = gsi_stmt (gsi);
-          if ((gimple_phi_arg_def (phi, 0) == lhs))
+      /* Check if we have a loop where the chosen exit is not the main exit,
+         in these cases for an early break we restart the iteration the vector code
+         did.  For the live values we want the value at the start of the iteration
+         rather than at the end.  */
+      edge main_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
+      bool restart_loop = LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo);
+      FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
+        if (!is_gimple_debug (use_stmt)
+            && !flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
+          FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
             {
-              remove_phi_node (&gsi, false);
-              tree lhs_phi = gimple_phi_result (phi);
-              gimple *copy = gimple_build_assign (lhs_phi, new_tree);
-              gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
-            }
-          else
-            gsi_next (&gsi);
-        }
+              edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt),
+                                           phi_arg_index_from_use (use_p));
+              bool main_exit_edge = e == main_e
+                                    || find_connected_edge (main_e, e->src);
+
+              /* Early exits have a merge block, we want the merge block itself
+                 so use ->src.  For main exit the merge block is the
+                 destination.  */
+              basic_block dest = main_exit_edge ? main_e->dest : e->src;
+              gimple *tmp_vec_stmt = vec_stmt;
+              tree tmp_vec_lhs = vec_lhs;
+              tree tmp_bitstart = bitstart;
+
+              /* For early exit where the exit is not in the BB that leads
+                 to the latch then we're restarting the iteration in the
+                 scalar loop.  So get the first live value.  */
+              restart_loop = restart_loop || !main_exit_edge;
+              if (restart_loop
+                  && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def)
+                {
+                  tmp_vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
+                  tmp_vec_lhs = gimple_get_lhs (tmp_vec_stmt);
+                  tmp_bitstart = build_zero_cst (TREE_TYPE (bitstart));
+                }
+
+              gimple_stmt_iterator exit_gsi;
+              tree new_tree
+                = vectorizable_live_operation_1 (loop_vinfo, stmt_info,
+                                                 dest, vectype, ncopies,
+                                                 slp_node, bitsize,
+                                                 tmp_bitstart, tmp_vec_lhs,
+                                                 lhs_type, restart_loop,
+                                                 &exit_gsi);
+
+              if (gimple_phi_num_args (use_stmt) == 1)
+                {
+                  auto gsi = gsi_for_stmt (use_stmt);
+                  remove_phi_node (&gsi, false);
+                  tree lhs_phi = gimple_phi_result (use_stmt);
+                  gimple *copy = gimple_build_assign (lhs_phi, new_tree);
+                  gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
+                }
+              else
+                SET_PHI_ARG_DEF (use_stmt, e->dest_idx, new_tree);
+          }

       /* There a no further out-of-loop uses of lhs by LC-SSA construction.  */
       FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b3a09c0a804a38e17ef32b6ce13b98b077459fc7..582c5e678fad802d6e76300fe3c939b9f2978f17 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -342,6 +342,7 @@ is_simple_and_all_uses_invariant (stmt_vec_info stmt_info,
    - it has uses outside the loop.
    - it has vdefs (it alters memory).
    - control stmts in the loop (except for the exit condition).
+   - it is an induction and we have multiple exits.

    CHECKME: what other side effects would the vectorizer allow?  */

@@ -399,6 +400,19 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
         }
     }

+  /* Check if it's an induction and multiple exits.  In this case there will be
+     a usage later on after peeling which is needed for the alternate exit.  */
+  if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+      && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def)
+    {
+      if (dump_enabled_p ())
+          dump_printf_loc (MSG_NOTE, vect_location,
+                           "vec_stmt_relevant_p: induction forced for "
+                           "early break.\n");
+      *live_p = true;
+
+    }
+
   if (*live_p && *relevant == vect_unused_in_scope
       && !is_simple_and_all_uses_invariant (stmt_info, loop_vinfo))
     {
@@ -1774,7 +1788,7 @@ compare_step_with_zero (vec_info *vinfo, stmt_vec_info stmt_info)
 /* If the target supports a permute mask that reverses the elements in
    a vector of type VECTYPE, return that mask, otherwise return null.  */

-static tree
+tree
 perm_mask_for_reverse (tree vectype)
 {
   poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
@@ -12720,20 +12734,27 @@ can_vectorize_live_stmts (vec_info *vinfo, stmt_vec_info stmt_info,
                           bool vec_stmt_p,
                           stmt_vector_for_cost *cost_vec)
 {
+  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
   if (slp_node)
     {
       stmt_vec_info slp_stmt_info;
       unsigned int i;
       FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
         {
-          if (STMT_VINFO_LIVE_P (slp_stmt_info)
+          if ((STMT_VINFO_LIVE_P (slp_stmt_info)
+               || (loop_vinfo
+                   && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+                   && STMT_VINFO_DEF_TYPE (slp_stmt_info)
+                        == vect_induction_def))
               && !vectorizable_live_operation (vinfo, slp_stmt_info, slp_node,
                                                slp_node_instance, i,
                                                vec_stmt_p, cost_vec))
             return false;
         }
     }
-  else if (STMT_VINFO_LIVE_P (stmt_info)
+  else if ((STMT_VINFO_LIVE_P (stmt_info)
+            || (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+                && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def))
            && !vectorizable_live_operation (vinfo, stmt_info,
                                             slp_node, slp_node_instance, -1,
                                             vec_stmt_p, cost_vec))
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 15c7f75b1f3c61ab469f1b1970dae9c6ac1a9f55..974f617d54a14c903894dd20d60098ca259c96f2 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2248,6 +2248,7 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
                                 enum vect_def_type *,
                                 tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
+extern tree perm_mask_for_reverse (tree);
 extern bool supportable_widening_operation (vec_info*, code_helper,
                                             stmt_vec_info, tree, tree,
                                             code_helper*, code_helper*,
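
For illustration only, and not part of the patch: the fully-masked path in vectorizable_live_operation_1 emulates EXTRACT_FIRST by reversing both the mask and the vector and then using EXTRACT_LAST. Below is a minimal scalar model of that idea in plain C++. The names extract_last and extract_first_via_reverse are made up for this sketch and are not GCC functions, and returning std::nullopt for an all-false mask is a choice of the model only, not a statement about the internal function.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Scalar model of EXTRACT_LAST: the value of the last lane whose mask bit
// is set.  Returning std::nullopt when no lane is active is an assumption
// of this sketch.
std::optional<unsigned>
extract_last (const std::vector<bool> &mask, const std::vector<unsigned> &vec)
{
  assert (mask.size () == vec.size ());
  for (std::size_t i = vec.size (); i-- > 0;)
    if (mask[i])
      return vec[i];
  return std::nullopt;
}

// Emulated EXTRACT_FIRST: reverse the mask and the data (the model's stand-in
// for the two VEC_PERM_EXPRs), then take the last active lane.
std::optional<unsigned>
extract_first_via_reverse (const std::vector<bool> &mask,
                           const std::vector<unsigned> &vec)
{
  std::vector<bool> rmask (mask.rbegin (), mask.rend ());
  std::vector<unsigned> rvec (vec.rbegin (), vec.rend ());
  return extract_last (rmask, rvec);
}
```

With mask {0, 1, 1, 1} and data {10, 11, 12, 13}, extract_last yields 13 while extract_first_via_reverse yields 11, i.e. the first active lane.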
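
Again for illustration only: a rough scalar model of why the early exit wants the induction value from the start of the vector iteration (i_14 in the discussion above) rather than the incremented one (i_11). It is loosely based on test4 from vect-early-break_17.c, but works on local arrays. The type arrays and the functions scalar_loop and chunked_loop are hypothetical names for this sketch, and the chunked loop is a simplification of what the vectorizer generates, not a description of it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the vect_a / vect_b globals of the testcase.
struct arrays
{
  std::vector<unsigned> a;
  std::vector<unsigned> b;
};

// Scalar loop of test4, started at iteration START with running result RET.
unsigned
scalar_loop (arrays &s, unsigned x, std::size_t start, unsigned ret)
{
  assert (s.a.size () == s.b.size ());
  for (std::size_t i = start; i < s.a.size (); i++)
    {
      s.b[i] = x + static_cast<unsigned> (i);
      if (s.a[i] > x)
        return s.a[i];
      s.a[i] = x;
      ret = s.a[i] + s.b[i];
    }
  return ret;
}

// Chunked model of the vector loop, VF lanes per iteration.
unsigned
chunked_loop (arrays &s, unsigned x, std::size_t vf)
{
  assert (vf > 0 && s.a.size () == s.b.size ());
  std::size_t i = 0; // plays the role of the induction PHI
  unsigned ret = 0;
  while (i + vf <= s.a.size ())
    {
      // Check the break condition for the whole chunk before any store.
      bool any_break = false;
      for (std::size_t l = 0; l < vf; l++)
        any_break = any_break || s.a[i + l] > x;
      if (any_break)
        // Early exit: redo this chunk in the scalar loop, starting from the
        // value I had at the start of the chunk, not the incremented one.
        return scalar_loop (s, x, i, ret);
      for (std::size_t l = 0; l < vf; l++)
        {
          s.b[i + l] = x + static_cast<unsigned> (i + l);
          s.a[i + l] = x;
          ret = s.a[i + l] + s.b[i + l];
        }
      i += vf;
    }
  // Main exit: the scalar epilogue handles the remaining iterations.
  return scalar_loop (s, x, i, ret);
}
```

Running both loops on the same input with x = 5 and a = {1, 2, 3, 4, 5, 9, 1, 1} gives the same return value and the same array contents; if the early exit passed i + vf instead of i, the iterations of the breaking chunk would be skipped.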