Re: [PATCH]middle-end: Lower all gconds during vector pattern matching [PR117176]

Richard Biener Fri, 25 Oct 2024 04:30:59 -0700

On Wed, 23 Oct 2024, Tamar Christina wrote:

> Hi All,
> 
> I have been taking a look at boolean handing once more in the vectorizer.
> 
> There are two situation to consider:
> 
>   1. when the boolean being created are created from comparing data inputs 
> then
>      for the resulting vector boolean we need to know the vector type and the
>      precision.  In this case, when we have an operation such as NOT on the 
> data
>      element, this has to be lowered to XOR because the truncation to the 
> vector
>      precision needs to be explicit.
>   2. when the boolean being created comes from another boolean operation, then
>      we don't need to lower NOT, as the precision doesn't change.  We don't do
>      any lowering for these (as denoted in check_bool_pattern) and instead the
>      precision is copied from the element feeding the boolean statement during
>      VF analysis.
> 
> For early break gcond lowering in order to correctly handle the second 
> scenario
> above we punted the lowering of VECT_SCALAR_BOOLEAN_TYPE_P comparisons that 
> were
> already in the right shape.  e.g. e != 0 where e is a boolean does not need 
> any
> lowering.
> 
> The issue however is that the statement feeding e may need to be lowered in 
> the
> case where it's a data expression.
> 
> This patch changes a bit how we do the lowering.  We now always emit an
> additional compare. e.g. if the input is;
> 
>   if (e != 0)
> 
> where is a boolean we would punt on thi before, but now we generate
> 
>   f = e != 0
>   if (f != 0)
> 
> We then use the same infrastructre as recog_bool to ask it to lower f, and in
> doing so handle and boolean conversions that need to be lowered.
> 
> Because we now guarantee that f is an internal def we can also simplify the
> SLP building code.
> 
> When e is a boolean, the precision we build for f needs to reflect the 
> precision
> of the operation feeding e.  To get this value we use integer_type_for_mask 
> the
> same way recog_bool does, and if it's defined (e.g. we have a data conversions
> somewhere) we pass that precision on instead.  This gets us the correct VF
> on the newly lowered boolean expressions.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?


OK.

Thanks,
Richard.

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       PR tree-optimization/117176
>       * tree-vect-patterns.cc (vect_recog_gcond_pattern): Lower all gconds.
>       * tree-vect-slp.cc (vect_analyze_slp): No longer check for in vect def.
> 
> gcc/testsuite/ChangeLog:
> 
>       PR tree-optimization/117176
>       * gcc.dg/vect/vect-early-break_130-pr117176.c: New test.
> 
> ---
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..841dcce284dd7cff0c4f0648e6dc57082c8756c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +struct ColorSpace {
> +  int componentCt;
> +};
> +
> +struct Psnr {
> +  double psnr[3];
> +};
> +
> +int f(struct Psnr psnr, struct ColorSpace colorSpace) {
> +  int i, hitsTarget = 1;
> +
> +  for (i = 1; i < colorSpace.componentCt && hitsTarget; ++i)
> +    hitsTarget = !(psnr.psnr[i] < 1);
> +
> +  return hitsTarget;
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 
> 746f100a0842090953571b535ff05375f46033c0..dc188f7412770bcb2344808d31bfb7e6c981d777
>  100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -5944,21 +5944,31 @@ vect_recog_gcond_pattern (vec_info *vinfo,
>    if (VECTOR_TYPE_P (scalar_type))
>      return NULL;
>  
> -  if (code == NE_EXPR
> -      && zerop (rhs)
> -      && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
> -    return NULL;
> +  /* If the input is a boolean then try to figure out the precision that the
> +     vector type should use.  We cannot use the scalar precision as this 
> would
> +     later mismatch.  This is similar to what recog_bool does.  */
> +  if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
> +    {
> +      if (tree stype = integer_type_for_mask (lhs, vinfo))
> +     scalar_type = stype;
> +    }
>  
> -  tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
> -  if (vecitype == NULL_TREE)
> +  tree vectype = get_mask_type_for_scalar_type (vinfo, scalar_type);
> +  if (vectype == NULL_TREE)
>      return NULL;
>  
> -  tree vectype = truth_type_for (vecitype);
> -
>    tree new_lhs = vect_recog_temp_ssa_var (boolean_type_node, NULL);
>    gimple *new_stmt = gimple_build_assign (new_lhs, code, lhs, rhs);
>    append_pattern_def_seq (vinfo, stmt_vinfo, new_stmt, vectype, scalar_type);
>  
> +  hash_set<gimple *> bool_stmts;
> +  /* This will only be entered if we have a data comparison in the gcond.  */
> +  if (check_bool_pattern (new_lhs, vinfo, bool_stmts))
> +    {
> +      scalar_type = TREE_TYPE (new_lhs);
> +      new_lhs = adjust_bool_stmts (vinfo, bool_stmts, scalar_type, 
> stmt_vinfo);
> +    }
> +
>    gimple *pattern_stmt
>      = gimple_build_cond (NE_EXPR, new_lhs,
>                        build_int_cst (TREE_TYPE (new_lhs), 0),
> @@ -6208,7 +6218,6 @@ vect_recog_bool_pattern (vec_info *vinfo,
>      return NULL;
>  }
>  
> -
>  /* A helper for vect_recog_mask_conversion_pattern.  Build
>     conversion of MASK to a type suitable for masking VECTYPE.
>     Built statement gets required vectype and is appended to
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 
> d35c2ea02dcef1c295bf97adc6717a7294d3f61e..ebd3ae16cf5cb5c1e85c9b2dced23fa1a1d76645
>  100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -4858,37 +4858,16 @@ vect_analyze_slp (vec_info *vinfo, unsigned 
> max_tree_size,
>            from them.  It's highly likely that the resulting SLP tree here if 
> both
>            arguments have a def will be incompatible, but we rely on it being 
> split
>            later on.  */
> -       if (auto varg = loop_vinfo->lookup_def (args0))
> -         {
> -           vec<stmt_vec_info> stmts;
> -           vec<tree> remain = vNULL;
> -           stmts.create (1);
> -           stmts.quick_push (vect_stmt_to_vectorize (varg));
> +       auto varg = loop_vinfo->lookup_def (args0);
> +       vec<stmt_vec_info> stmts;
> +       vec<tree> remain = vNULL;
> +       stmts.create (1);
> +       stmts.quick_push (vect_stmt_to_vectorize (varg));
>  
> -           vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
> -                                    stmts, roots, remain,
> -                                    max_tree_size, &limit,
> -                                    bst_map, NULL, force_single_lane);
> -         }
> -       else
> -         {
> -           /* Create a new SLP instance.  */
> -           slp_instance new_instance = XNEW (class _slp_instance);
> -           vec<tree> ops;
> -           ops.create (1);
> -           ops.quick_push (args0);
> -           slp_tree invnode = vect_create_new_slp_node (ops);
> -           SLP_TREE_DEF_TYPE (invnode) = vect_external_def;
> -           SLP_INSTANCE_TREE (new_instance) = invnode;
> -           SLP_INSTANCE_LOADS (new_instance) = vNULL;
> -           SLP_INSTANCE_ROOT_STMTS (new_instance) = roots;
> -           SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL;
> -           SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond;
> -           new_instance->reduc_phis = NULL;
> -           new_instance->cost_vec = vNULL;
> -           new_instance->subgraph_entries = vNULL;
> -           vinfo->slp_instances.safe_push (new_instance);
> -         }
> +       vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
> +                                stmts, roots, remain,
> +                                max_tree_size, &limit,
> +                                bst_map, NULL, force_single_lane);
>       }
>  
>       /* Find and create slp instances for inductions that have been forced
> 
> 
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH]middle-end: Lower all gconds during vector pattern matching [PR117176]

Reply via email to