Hi,

As you know, I've been working on removing the code that demotes GIMPLE COND_EXPRs to GENERIC during vect_recog_bool_pattern.
To restate why: the issue we have today is that the mask (the boolean argument of a COND_EXPR) is not always available during pattern matching. This is a problem because it means we can't do mask optimizations early on, and it prevents us from writing patterns that consume these masks, such as my vect_recog_cond_store_pattern.

What's happening here is that normally in GIMPLE we would have this sequence:

  a = <bool expression>
  d = a ? b : c

Here a is expected to be the mask.  vect_recog_mask_conversion_pattern handles this and we can refer to a and get the appropriate mask out.

But there are some cases where in GIMPLE a is not a mask, or not even of boolean type.  An example is when a is an external.  The vectorizer then just sees

  d = a ? b : c

and in order to create a mask lowers this into

  d = a == 0 ? b : c

Two interesting things happen in this case:

1. The operand is no longer a mask, so it can't be used by any optimization.
2. It bypasses most of the validation in the vectorizer.  Specifically, it skips the checks imposed by vect_maybe_update_slp_op_vectype, and vect_create_constant_vectors then creates the right invariant conversions at the end of vectorization.

The goal is to make this all explicit.  That means the mask lowering should stay in GIMPLE and explicitly say how to convert the argument to a mask.  For non-boolean arguments this is simple; essentially we transform

  d = a ? b : c

into

  a1 = a != 0
  d = a1 ? b : c

This works, but it does have a downside: we cost a1, and we vectorize a1.  But it's loop invariant, so we didn't really need to; i.e. we could have done a scalar compare and splat, which is what vect_create_constant_vectors currently does.

The bigger issue is boolean arguments.  Canonically, the way we can turn the boolean into a mask is by transforming

  d = a ? b : c

into

  a1 = a ? -1 : 0
  a2 = a1 != 0
  d = a2 ? b : c

But this introduces a chicken-and-egg problem: it re-introduces a COND_EXPR, and so vectorization would fail as we have no way to vectorize a1.

My current code for this is:

  tree lhs_var = vect_recog_temp_ssa_var (boolean_type_node, NULL);
  tree cond_type = TREE_TYPE (var);

  /* If the condition is already a boolean then manually convert it to a
     mask of the given integer type but don't set a vectype.  */
  if (TREE_CODE (cond_type) == BOOLEAN_TYPE
      && TREE_CODE (var) == SSA_NAME)
    {
      tree lhs_ivar = vect_recog_temp_ssa_var (type, NULL);
      tree true_val = build_all_ones_cst (type);
      tree false_val = build_zero_cst (type);
      pattern_stmt = gimple_build_assign (lhs_ivar, COND_EXPR, var,
                                          true_val, false_val);
      append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt);
      var = lhs_ivar;
      cond_type = type;
    }

  pattern_stmt = gimple_build_assign (lhs_var, NE_EXPR, var,
                                      build_zero_cst (cond_type));

Now, to address this without adding hacks to the vectorizer, would the right approach be to teach the vectorizer to recognize loop-invariant computations?  Such a computation would then not need to be vectorized and could simply be lifted into the loop pre-header, similar to what vect_create_constant_vectors does.

If you agree, what's the best way?  Should I teach the analysis phase about this through a new vect_used_invariant_scope or something similar?  Or should I add a new gimple_seq to VINFO (rough sketch in the P.S. below)?  Or something else?

Thanks,
Tamar
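
P.S. To make the "new gimple_seq on VINFO" option more concrete, here is a rough, untested sketch of what I have in mind.  invariant_def_seq would be a hypothetical new member of loop_vec_info (and this only covers loop vectorization; BB SLP would need its own handling); everything else is existing API.

  /* In the pattern: queue the scalar mask computation on the new per-loop
     sequence instead of the pattern def sequence, so it is neither costed
     nor vectorized.  */
  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
  tree lhs_var = vect_recog_temp_ssa_var (boolean_type_node, NULL);
  gimple *mask_stmt
    = gimple_build_assign (lhs_var, NE_EXPR, var,
                           build_zero_cst (cond_type));
  gimple_seq_add_stmt (&loop_vinfo->invariant_def_seq, mask_stmt);

  /* At transform time: materialize the queued statements once on the loop
     pre-header edge, similar to how vect_create_constant_vectors handles
     invariant operands.  */
  if (!gimple_seq_empty_p (loop_vinfo->invariant_def_seq))
    gsi_insert_seq_on_edge_immediate
      (loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo)),
       loop_vinfo->invariant_def_seq);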