On Wed, 23 Oct 2024, Tamar Christina wrote: > Hi All, > > I have been taking a look at boolean handing once more in the vectorizer. > > There are two situation to consider: > > 1. when the boolean being created are created from comparing data inputs > then > for the resulting vector boolean we need to know the vector type and the > precision. In this case, when we have an operation such as NOT on the > data > element, this has to be lowered to XOR because the truncation to the > vector > precision needs to be explicit. > 2. when the boolean being created comes from another boolean operation, then > we don't need to lower NOT, as the precision doesn't change. We don't do > any lowering for these (as denoted in check_bool_pattern) and instead the > precision is copied from the element feeding the boolean statement during > VF analysis. > > For early break gcond lowering in order to correctly handle the second > scenario > above we punted the lowering of VECT_SCALAR_BOOLEAN_TYPE_P comparisons that > were > already in the right shape. e.g. e != 0 where e is a boolean does not need > any > lowering. > > The issue however is that the statement feeding e may need to be lowered in > the > case where it's a data expression. > > This patch changes a bit how we do the lowering. We now always emit an > additional compare. e.g. if the input is; > > if (e != 0) > > where is a boolean we would punt on thi before, but now we generate > > f = e != 0 > if (f != 0) > > We then use the same infrastructre as recog_bool to ask it to lower f, and in > doing so handle and boolean conversions that need to be lowered. > > Because we now guarantee that f is an internal def we can also simplify the > SLP building code. > > When e is a boolean, the precision we build for f needs to reflect the > precision > of the operation feeding e. To get this value we use integer_type_for_mask > the > same way recog_bool does, and if it's defined (e.g. we have a data conversions > somewhere) we pass that precision on instead. This gets us the correct VF > on the newly lowered boolean expressions. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master?
OK. Thanks, Richard. > > Thanks, > Tamar > > gcc/ChangeLog: > > PR tree-optimization/117176 > * tree-vect-patterns.cc (vect_recog_gcond_pattern): Lower all gconds. > * tree-vect-slp.cc (vect_analyze_slp): No longer check for in vect def. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/117176 > * gcc.dg/vect/vect-early-break_130-pr117176.c: New test. > > --- > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c > b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..841dcce284dd7cff0c4f0648e6dc57082c8756c1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-add-options vect_early_break } */ > +/* { dg-require-effective-target vect_early_break } */ > +/* { dg-require-effective-target vect_int } */ > + > +struct ColorSpace { > + int componentCt; > +}; > + > +struct Psnr { > + double psnr[3]; > +}; > + > +int f(struct Psnr psnr, struct ColorSpace colorSpace) { > + int i, hitsTarget = 1; > + > + for (i = 1; i < colorSpace.componentCt && hitsTarget; ++i) > + hitsTarget = !(psnr.psnr[i] < 1); > + > + return hitsTarget; > +} > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc > index > 746f100a0842090953571b535ff05375f46033c0..dc188f7412770bcb2344808d31bfb7e6c981d777 > 100644 > --- a/gcc/tree-vect-patterns.cc > +++ b/gcc/tree-vect-patterns.cc > @@ -5944,21 +5944,31 @@ vect_recog_gcond_pattern (vec_info *vinfo, > if (VECTOR_TYPE_P (scalar_type)) > return NULL; > > - if (code == NE_EXPR > - && zerop (rhs) > - && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) > - return NULL; > + /* If the input is a boolean then try to figure out the precision that the > + vector type should use. We cannot use the scalar precision as this > would > + later mismatch. This is similar to what recog_bool does. */ > + if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) > + { > + if (tree stype = integer_type_for_mask (lhs, vinfo)) > + scalar_type = stype; > + } > > - tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type); > - if (vecitype == NULL_TREE) > + tree vectype = get_mask_type_for_scalar_type (vinfo, scalar_type); > + if (vectype == NULL_TREE) > return NULL; > > - tree vectype = truth_type_for (vecitype); > - > tree new_lhs = vect_recog_temp_ssa_var (boolean_type_node, NULL); > gimple *new_stmt = gimple_build_assign (new_lhs, code, lhs, rhs); > append_pattern_def_seq (vinfo, stmt_vinfo, new_stmt, vectype, scalar_type); > > + hash_set<gimple *> bool_stmts; > + /* This will only be entered if we have a data comparison in the gcond. */ > + if (check_bool_pattern (new_lhs, vinfo, bool_stmts)) > + { > + scalar_type = TREE_TYPE (new_lhs); > + new_lhs = adjust_bool_stmts (vinfo, bool_stmts, scalar_type, > stmt_vinfo); > + } > + > gimple *pattern_stmt > = gimple_build_cond (NE_EXPR, new_lhs, > build_int_cst (TREE_TYPE (new_lhs), 0), > @@ -6208,7 +6218,6 @@ vect_recog_bool_pattern (vec_info *vinfo, > return NULL; > } > > - > /* A helper for vect_recog_mask_conversion_pattern. Build > conversion of MASK to a type suitable for masking VECTYPE. > Built statement gets required vectype and is appended to > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index > d35c2ea02dcef1c295bf97adc6717a7294d3f61e..ebd3ae16cf5cb5c1e85c9b2dced23fa1a1d76645 > 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp.cc > @@ -4858,37 +4858,16 @@ vect_analyze_slp (vec_info *vinfo, unsigned > max_tree_size, > from them. It's highly likely that the resulting SLP tree here if > both > arguments have a def will be incompatible, but we rely on it being > split > later on. */ > - if (auto varg = loop_vinfo->lookup_def (args0)) > - { > - vec<stmt_vec_info> stmts; > - vec<tree> remain = vNULL; > - stmts.create (1); > - stmts.quick_push (vect_stmt_to_vectorize (varg)); > + auto varg = loop_vinfo->lookup_def (args0); > + vec<stmt_vec_info> stmts; > + vec<tree> remain = vNULL; > + stmts.create (1); > + stmts.quick_push (vect_stmt_to_vectorize (varg)); > > - vect_build_slp_instance (vinfo, slp_inst_kind_gcond, > - stmts, roots, remain, > - max_tree_size, &limit, > - bst_map, NULL, force_single_lane); > - } > - else > - { > - /* Create a new SLP instance. */ > - slp_instance new_instance = XNEW (class _slp_instance); > - vec<tree> ops; > - ops.create (1); > - ops.quick_push (args0); > - slp_tree invnode = vect_create_new_slp_node (ops); > - SLP_TREE_DEF_TYPE (invnode) = vect_external_def; > - SLP_INSTANCE_TREE (new_instance) = invnode; > - SLP_INSTANCE_LOADS (new_instance) = vNULL; > - SLP_INSTANCE_ROOT_STMTS (new_instance) = roots; > - SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL; > - SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond; > - new_instance->reduc_phis = NULL; > - new_instance->cost_vec = vNULL; > - new_instance->subgraph_entries = vNULL; > - vinfo->slp_instances.safe_push (new_instance); > - } > + vect_build_slp_instance (vinfo, slp_inst_kind_gcond, > + stmts, roots, remain, > + max_tree_size, &limit, > + bst_map, NULL, force_single_lane); > } > > /* Find and create slp instances for inductions that have been forced > > > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)