On 06/22/2016 10:09 AM, Ilya Enkovich wrote:
Given the common structure & duplication I can't help but wonder if a single
function should be used for widening/narrowing. Ultimately can't you swap
mask_elems/req_elems and always go narrower to wider (using a different
optab for the two different cases)?
I think we can't always go in the narrower-to-wider direction because widening
uses two optabs and also because of the way insn_data is checked.
OK. Thanks for considering.
I'm guessing Richi's comment about what tree type you're looking at refers
to this and similar instances. Doesn't this give you the type of the number
of iterations rather than the type of the iteration variable itself?
Since I build the vector IV myself and compare it with NITERS, I feel it's
safe to use the type of NITERS. Do you expect the NITERS and IV types to differ?
Since you're comparing to NITERS, it sounds like you've got it right and
that Richi and I have it wrong.
It's less a question of whether or not we expect NITERS and IV to have
different types, but more a realization that there's nothing that
inherently says they have to be the same. They probably are the same
most of the time, but I don't think that's something we can or should
necessarily depend on.
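
To make the type concern concrete, here is a toy sketch in plain C (not GCC code; both function names are hypothetical). It assumes the mask is computed as "IV < NITERS" and shows what can go wrong if the IV lives in a narrower type than NITERS, and why building the IV in the type of NITERS avoids it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration: count how many of STEPS iterations see the
   mask "iv < niters" as true when the IV is kept in a narrow 8-bit type.  */
unsigned
count_active_narrow_iv (uint32_t niters, unsigned steps)
{
  uint8_t iv = 0;
  unsigned active = 0;
  for (unsigned i = 0; i < steps; i++)
    {
      if (iv < niters)
        active++;
      iv++;  /* Wraps back to 0 after 255.  */
    }
  return active;
}

/* Same loop, but the IV is built in the type of NITERS, so the comparison
   is exact for every value NITERS can take.  */
unsigned
count_active_niters_iv (uint32_t niters, unsigned steps)
{
  uint32_t iv = 0;
  unsigned active = 0;
  for (unsigned i = 0; i < steps; i++)
    {
      if (iv < niters)
        active++;
      iv++;
    }
  return active;
}
```

With niters = 300 and 320 steps, the narrow IV never reaches 300, so every step is treated as active, while the IV built in the NITERS type stops at 300. For small trip counts the two agree, which is why such a mismatch can go unnoticed.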
@@ -1791,6 +1870,20 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
&& !useless_type_conversion_p (vectype, rhs_vectype)))
return false;
+ if (LOOP_VINFO_CAN_BE_MASKED (loop_vinfo))
+ {
+ /* Check that mask conjunction is supported. */
+ optab tab;
+ tab = optab_for_tree_code (BIT_AND_EXPR, vectype, optab_default);
+ if (!tab || optab_handler (tab, TYPE_MODE (vectype)) == CODE_FOR_nothing)
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "cannot be masked: unsupported mask operation\n");
+ LOOP_VINFO_CAN_BE_MASKED (loop_vinfo) = false;
+ }
+ }
Should the optab querying be in optabs-query.c?
We always directly call optab_handler for simple operations. There are dozens
of such calls in vectorizer.
OK. I would look favorably on a change to move those queries out into
optabs-query as a separate patch.
We don't embed masking capabilities into vectorizer.
Actually we don't depend on masking capabilities that much. We have to mask
loads and stores, and we use can_mask_load_store for that, which uses an
existing optab query. We also require masking for reductions and use VEC_COND
for that (via the existing expand_vec_cond_expr_p). The other checks verify
that we can build the required masks. So we don't actually expose any new
processor masking capabilities to GIMPLE, i.e. all this works on targets
without rich masking capabilities. E.g. we can mask loops for quite old SSE
targets.
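
As a rough illustration of the scheme described above, the following is a minimal sketch in plain C that emulates a masked loop lane by lane. It is not the patch's implementation: VF, sum_scalar and sum_masked are hypothetical names, and the ternary selects merely stand in for a masked load and a VEC_COND-style select. The mask is assumed to be "IV < NITERS".

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* Hypothetical vectorization factor (lanes per vector).  */

/* Scalar reference: sum the first N elements of A.  */
int
sum_scalar (const int *a, size_t n)
{
  int s = 0;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}

/* Masked "vector" loop emulated lane by lane.  No scalar epilogue is
   needed: lanes past N are switched off by the mask instead.  */
int
sum_masked (const int *a, size_t n)
{
  assert (a != NULL || n == 0);

  int acc[VF] = { 0 };
  for (size_t base = 0; base < n; base += VF)
    for (size_t lane = 0; lane < VF; lane++)
      {
        size_t iv = base + lane;
        int mask = iv < n;             /* mask = IV < NITERS */
        int x = mask ? a[iv] : 0;      /* masked load: inactive lanes read nothing */
        int added = acc[lane] + x;
        /* Select in the style of VEC_COND: keep the old accumulator
           value in inactive lanes.  */
        acc[lane] = mask ? added : acc[lane];
      }

  /* Final reduction across lanes.  */
  int s = 0;
  for (size_t lane = 0; lane < VF; lane++)
    s += acc[lane];
  return s;
}
```

Only the load needs real masking support here; the reduction is handled by a plain select, which matches the point that no new masking capability has to be exposed for it.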
OK. I think the key here is that load/store masking already exists and
the others are either VEC_COND or checking if we can build the mask
rather than whether the operation can be masked. Thanks for clarifying.
jeff