On Wed, Nov 6, 2019 at 1:38 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Biener <richard.guent...@gmail.com> writes: > > On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford > > <richard.sandif...@arm.com> wrote: > >> > >> After previous patches, it's now possible to make the vectoriser > >> support multiple vector sizes in the same vector region, using > >> related_vector_mode to pick the right vector mode for a given > >> element mode. No port yet takes advantage of this, but I have > >> a follow-on patch for AArch64. > >> > >> This patch also seemed like a good opportunity to add some more dump > >> messages: one to make it clear which vector size/mode was being used > >> when analysis passed or failed, and another to say when we've decided > >> to skip a redundant vector size/mode. > > > > OK. > > > > I wonder if, when we requested a specific size previously, we now > > have to verify we got that constraint satisfied after the change. > > Esp. the epilogue vectorization cases want to get V2DI > > from V4DI. > > > > sz /= 2; > > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); > > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE > > (vectype), > > + scalar_type, > > + sz / > > scalar_bytes); > > > > doesn't look like an improvement in readability to me there. > > Yeah, guess it isn't great. > > > Maybe re-formulating the whole code in terms of lanes instead of size > > would make it easier to follow? > > OK, how about this version? It still won't win awards, but it's at > least a bit more readable. > > Tested as before.
OK (and sorry for the delay, looking for leftovers of the series now). Thanks, Richard. > Richard > > > 2019-11-06 Richard Sandiford <richard.sandif...@arm.com> > > gcc/ > * machmode.h (opt_machine_mode::operator==): New function. > (opt_machine_mode::operator!=): Likewise. > * tree-vectorizer.h (vec_info::vector_mode): Update comment. > (get_related_vectype_for_scalar_type): Delete. > (get_vectype_for_scalar_type_and_size): Declare. > * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say > whether analysis passed or failed, and with what vector modes. > Use related_vector_mode to check whether trying a particular > vector mode would be redundant with the autodetected mode, > and print a dump message if we decide to skip it. > * tree-vect-loop.c (vect_analyze_loop): Likewise. > (vect_create_epilog_for_reduction): Use > get_related_vectype_for_scalar_type instead of > get_vectype_for_scalar_type_and_size. > * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace > with... > (get_related_vectype_for_scalar_type): ...this new function. > Take a starting/"prevailing" vector mode rather than a vector size. > Take an optional nunits argument, with the same meaning as for > related_vector_mode. Use related_vector_mode when not > auto-detecting a mode, falling back to mode_for_vector if no > target mode exists. > (get_vectype_for_scalar_type): Update accordingly. > (get_same_sized_vectype): Likewise. > * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. > > Index: gcc/machmode.h > =================================================================== > --- gcc/machmode.h 2019-11-06 12:35:12.460201615 +0000 > +++ gcc/machmode.h 2019-11-06 12:35:27.972093472 +0000 > @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS) > bool exists () const; > template<typename U> bool exists (U *) const; > > + bool operator== (const T &m) const { return m_mode == m; } > + bool operator!= (const T &m) const { return m_mode != m; } > + > private: > machine_mode m_mode; > }; > Index: gcc/tree-vectorizer.h > =================================================================== > --- gcc/tree-vectorizer.h 2019-11-06 12:35:12.764199495 +0000 > +++ gcc/tree-vectorizer.h 2019-11-06 12:35:27.976093444 +0000 > @@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object > /* Cost data used by the target cost model. */ > void *target_cost_data; > > - /* If we've chosen a vector size for this vectorization region, > - this is one mode that has such a size, otherwise it is VOIDmode. */ > + /* The argument we should pass to related_vector_mode when looking up > + the vector mode for a scalar mode, or VOIDmode if we haven't yet > + made any decisions about which vector modes to use. */ > machine_mode vector_mode; > > private: > @@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop > extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code); > > /* In tree-vect-stmts.c. */ > +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, > + poly_uint64 = 0); > extern tree get_vectype_for_scalar_type (vec_info *, tree); > -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); > extern tree get_mask_type_for_scalar_type (vec_info *, tree); > extern tree get_same_sized_vectype (tree, tree); > extern bool vect_get_loop_mask_type (loop_vec_info); > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c 2019-11-06 12:35:12.760199523 +0000 > +++ gcc/tree-vect-slp.c 2019-11-06 12:35:27.972093472 +0000 > @@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator > && dbg_cnt (vect_slp)) > { > if (dump_enabled_p ()) > - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + { > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode" > + " %s\n", GET_MODE_NAME > (bb_vinfo->vector_mode)); > + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + } > > bb_vinfo->shared->check_datarefs (); > vect_schedule_slp (bb_vinfo); > @@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator > > vectorized = true; > } > + else > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (bb_vinfo->vector_mode)); > + } > > if (mode_i == 0) > autodetected_vector_mode = bb_vinfo->vector_mode; > @@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator > delete bb_vinfo; > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P (autodetected_vector_mode) > + && (related_vector_mode (vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (vectorized > || mode_i == vector_modes.length () > Index: gcc/tree-vect-loop.c > =================================================================== > --- gcc/tree-vect-loop.c 2019-11-06 12:35:12.756199552 +0000 > +++ gcc/tree-vect-loop.c 2019-11-06 12:35:27.972093472 +0000 > @@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec > res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); > if (mode_i == 0) > autodetected_vector_mode = loop_vinfo->vector_mode; > + if (dump_enabled_p ()) > + { > + if (res) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + else > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + } > > loop->aux = NULL; > if (res) > @@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec > } > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P (autodetected_vector_mode) > + && (related_vector_mode (vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (mode_i == vector_modes.length () > || autodetected_vector_mode == VOIDmode) > @@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v > in a vector mode of smaller size and first reduce upper/lower > halves against each other. */ > enum machine_mode mode1 = mode; > - unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype)); > - unsigned sz1 = sz; > + unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant (); > + unsigned nunits1 = nunits; > if (!slp_reduc > && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) > - sz1 = GET_MODE_SIZE (mode1).to_constant (); > + nunits1 = GET_MODE_NUNITS (mode1).to_constant (); > > - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, > sz1); > + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE > (vectype), > + scalar_type, > + nunits1); > reduce_with_shift = have_whole_vector_shift (mode1); > if (!VECTOR_MODE_P (mode1)) > reduce_with_shift = false; > @@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v > /* First reduce the vector to the desired vector size we should > do shift reduction on by combining upper and lower halves. */ > new_temp = new_phi_result; > - while (sz > sz1) > + while (nunits > nunits1) > { > gcc_assert (!slp_reduc); > - sz /= 2; > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); > + nunits /= 2; > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + scalar_type, > nunits); > + unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1)); > > /* The target has to make sure we support lowpart/highpart > extraction, either via direct vector extract or through > @@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v > = gimple_build_assign (dst2, BIT_FIELD_REF, > build3 (BIT_FIELD_REF, vectype1, > new_temp, TYPE_SIZE > (vectype1), > - bitsize_int (sz * > BITS_PER_UNIT))); > + bitsize_int (bitsize))); > gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > } > else > { > /* Extract via punning to appropriately sized integer mode > vector. */ > - tree eltype = build_nonstandard_integer_type (sz * > BITS_PER_UNIT, > - 1); > + tree eltype = build_nonstandard_integer_type (bitsize, 1); > tree etype = build_vector_type (eltype, 2); > gcc_assert (convert_optab_handler (vec_extract_optab, > TYPE_MODE (etype), > @@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v > = gimple_build_assign (tem, BIT_FIELD_REF, > build3 (BIT_FIELD_REF, eltype, > new_temp, TYPE_SIZE (eltype), > - bitsize_int (sz * > BITS_PER_UNIT))); > + bitsize_int (bitsize))); > gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > dst2 = make_ssa_name (vectype1); > epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR, > Index: gcc/tree-vect-stmts.c > =================================================================== > --- gcc/tree-vect-stmts.c 2019-11-06 12:35:12.796199272 +0000 > +++ gcc/tree-vect-stmts.c 2019-11-06 12:35:27.976093444 +0000 > @@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_ > } > } > > -/* Function get_vectype_for_scalar_type_and_size. > - > - Returns the vector type corresponding to SCALAR_TYPE and SIZE as > supported > - by the target. */ > +/* If NUNITS is nonzero, return a vector type that contains NUNITS > + elements of type SCALAR_TYPE, or null if the target doesn't support > + such a type. > + > + If NUNITS is zero, return a vector type that contains elements of > + type SCALAR_TYPE, choosing whichever vector size the target prefers. > + > + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode > + for this vectorization region and want to "autodetect" the best choice. > + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE > + and we want the new type to be interoperable with it. PREVAILING_MODE > + in this case can be a scalar integer mode or a vector mode; when it > + is a vector mode, the function acts like a tree-level version of > + related_vector_mode. */ > > tree > -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) > +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, > + tree scalar_type, poly_uint64 nunits) > { > tree orig_scalar_type = scalar_type; > scalar_mode inner_mode; > machine_mode simd_mode; > - poly_uint64 nunits; > tree vectype; > > if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) > @@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr > if (scalar_type == NULL_TREE) > return NULL_TREE; > > - /* If no size was supplied use the mode the target prefers. Otherwise > - lookup a vector mode of the specified size. */ > - if (known_eq (size, 0U)) > + /* If no prevailing mode was supplied, use the mode the target prefers. > + Otherwise lookup a vector mode based on the prevailing mode. */ > + if (prevailing_mode == VOIDmode) > { > + gcc_assert (known_eq (nunits, 0U)); > simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); > if (SCALAR_INT_MODE_P (simd_mode)) > { > @@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr > return NULL_TREE; > } > } > - else if (!multiple_p (size, nbytes, &nunits) > - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > - return NULL_TREE; > + else if (SCALAR_INT_MODE_P (prevailing_mode) > + || !related_vector_mode (prevailing_mode, > + inner_mode, nunits).exists (&simd_mode)) > + { > + /* Fall back to using mode_for_vector, mostly in the hope of being > + able to use an integer mode. */ > + if (known_eq (nunits, 0U) > + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) > + return NULL_TREE; > + > + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > + return NULL_TREE; > + } > > vectype = build_vector_type_for_mode (scalar_type, simd_mode); > > @@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr > tree > get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) > { > - tree vectype; > - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); > - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); > + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, > + scalar_type); > if (vectype && vinfo->vector_mode == VOIDmode) > vinfo->vector_mode = TYPE_MODE (vectype); > return vectype; > @@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type > if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) > return truth_type_for (vector_type); > > - return get_vectype_for_scalar_type_and_size > - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); > + poly_uint64 nunits; > + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), > + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) > + return NULL_TREE; > + > + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), > + scalar_type, nunits); > } > > /* Function vect_is_simple_use. > Index: gcc/tree-vectorizer.c > =================================================================== > --- gcc/tree-vectorizer.c 2019-11-06 12:35:12.764199495 +0000 > +++ gcc/tree-vectorizer.c 2019-11-06 12:35:27.976093444 +0000 > @@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t > poly_uint64 array_size, vector_size; > > tree scalar_type = strip_array_types (type); > - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); > + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); > if (!vectype > || !poly_int_tree_p (TYPE_SIZE (type), &array_size) > || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)