Hi,

As you know, I've been working on removing the code that demotes GIMPLE COND_EXPRs to GENERIC during vect_recog_bool_pattern.
To restate why: the issue we have today is that the mask (the boolean argument of a COND_EXPR) is not always available during pattern matching. This is a problem because it means we can't do mask optimizations early on, and it prevents us from writing patterns that consume these masks, such as my vect_recog_cond_store_pattern.

What's happening here is that normally in GIMPLE we would have this sequence:

  a = <bool expression>
  d = a ? b : c

Here a is expected to be the mask.  vect_recog_mask_conversion_pattern handles this and we can refer to a and get the appropriate mask out.

But there are some cases where in GIMPLE a is not a mask, or not even of boolean type.  An example is when a is an external.  The vectorizer then just sees

  d = a ? b : c

and in order to create a mask lowers this into

  d = a == 0 ? b : c

Two interesting things happen in this case:

1. The operand is no longer a mask, so it can't be used by any optimization.
2. It bypasses most of the validation in the vectorizer.  Specifically, it skips the checks imposed by vect_maybe_update_slp_op_vectype, and vect_create_constant_vectors then creates the right invariant conversions at the end of vectorization.

The goal is to make this all explicit.  That means the mask lowering should stay in GIMPLE and explicitly say how to convert the argument to a mask.  For non-boolean arguments this is simple; essentially we transform

  d = a ? b : c

into

  a1 = a != 0
  d = a1 ? b : c

This works, but it does have a downside: we cost a1, and we vectorize a1.  But it's loop invariant, so we didn't really need to; i.e. we could have done a scalar compare and splat, which is what vect_create_constant_vectors currently does.

The bigger issue is boolean arguments.  Canonically, the way we can turn the boolean into a mask is by transforming

  d = a ? b : c

into

  a1 = a ? -1 : 0
  a2 = a1 != 0
  d = a2 ? b : c

But this introduces a chicken-and-egg problem: it re-introduces a COND_EXPR, and so vectorization would fail as we have no way to vectorize a1.

My current code for this is:

  tree lhs_var = vect_recog_temp_ssa_var (boolean_type_node, NULL);
  tree cond_type = TREE_TYPE (var);

  /* If the condition is already a boolean then manually convert it to a
     mask of the given integer type but don't set a vectype.  */
  if (TREE_CODE (cond_type) == BOOLEAN_TYPE
      && TREE_CODE (var) == SSA_NAME)
    {
      tree lhs_ivar = vect_recog_temp_ssa_var (type, NULL);
      tree true_val = build_all_ones_cst (type);
      tree false_val = build_zero_cst (type);
      pattern_stmt = gimple_build_assign (lhs_ivar, COND_EXPR, var,
                                          true_val, false_val);
      append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt);
      var = lhs_ivar;
      cond_type = type;
    }

  pattern_stmt = gimple_build_assign (lhs_var, NE_EXPR, var,
                                      build_zero_cst (cond_type));

Now, to address this without adding hacks to the vectorizer, would the right approach be to teach the vectorizer to recognize loop-invariant computations?  Such a computation would then not need to be vectorized and could simply be lifted into the loop pre-header, similar to what vect_create_constant_vectors does.

If you agree, what's the best way?  Should I teach the analysis phase about this through a new vect_used_invariant_scope or something similar?  Or should I add a new gimple_seq to VINFO (rough sketch in the P.S. below)?  Or something else?

Thanks,
Tamar
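
P.S. To make the "new gimple_seq on VINFO" option more concrete, here is a rough, untested sketch of what I have in mind.  invariant_def_seq would be a hypothetical new member of loop_vec_info (and this only covers loop vectorization; BB SLP would need its own handling); everything else is existing API.

  /* In the pattern: queue the scalar mask computation on the new per-loop
     sequence instead of the pattern def sequence, so it is neither costed
     nor vectorized.  */
  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
  tree lhs_var = vect_recog_temp_ssa_var (boolean_type_node, NULL);
  gimple *mask_stmt
    = gimple_build_assign (lhs_var, NE_EXPR, var,
                           build_zero_cst (cond_type));
  gimple_seq_add_stmt (&loop_vinfo->invariant_def_seq, mask_stmt);

  /* At transform time: materialize the queued statements once on the loop
     pre-header edge, similar to how vect_create_constant_vectors handles
     invariant operands.  */
  if (!gimple_seq_empty_p (loop_vinfo->invariant_def_seq))
    gsi_insert_seq_on_edge_immediate
      (loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo)),
       loop_vinfo->invariant_def_seq);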