On Thu, May 06, 2021 at 04:21:40PM +0200, Tobias Burnus wrote:
> 	* omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if
> 	a truth_value_p reduction variable is nonintegral.
> 	(lower_rec_input_clauses): Also handle SIMT part
> 	for complex/float recution with && and ||.

s/recution/reduction/

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -4389,14 +4389,28 @@ lower_rec_simd_input_clauses (tree new_var,
> omp_context *ctx,
>        {
>          for (tree c = gimple_omp_for_clauses (ctx->stmt); c;
>               c = OMP_CLAUSE_CHAIN (c))
> -          if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
> -              && OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
> -            {
> -              /* UDR reductions are not supported yet for SIMT, disable
> -                 SIMT. */
> -              sctx->max_vf = 1;
> -              break;
> +          {
> +            if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_REDUCTION)
> +              continue;
> +
> +            if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
> +              {
> +                /* UDR reductions are not supported yet for SIMT, disable
> +                   SIMT. */
> +                sctx->max_vf = 1;
> +                break;
> +              }
> +
> +            if (truth_value_p (OMP_CLAUSE_REDUCTION_CODE (c))
> +                && !INTEGRAL_TYPE_P (TREE_TYPE (new_var)))
> +              {
> +                /* Doing boolean operations on non-boolean types is
> +                   for conformance only, it's not worth supporting this
> +                   for SIMT. */

This comment needs to be adjusted to talk about non-integral types.

> +                sctx->max_vf = 1;
> +                break;
>                }
> +          }
>        }
>    if (maybe_gt (sctx->max_vf, 1U))
>      {
> @@ -6432,28 +6446,34 @@ lower_rec_input_clauses (tree clauses, gimple_seq
> *ilist, gimple_seq *dlist,
>
>                    gimplify_assign (unshare_expr (ivar), x, &llist[0]);
>
> -                  if (sctx.is_simt)
> -                    {
> -                      if (!simt_lane)
> -                        simt_lane = create_tmp_var (unsigned_type_node);
> -                      x = build_call_expr_internal_loc
> -                        (UNKNOWN_LOCATION, IFN_GOMP_SIMT_XCHG_BFLY,
> -                         TREE_TYPE (ivar), 2, ivar, simt_lane);
> -                      x = build2 (code, TREE_TYPE (ivar), ivar, x);
> -                      gimplify_assign (ivar, x, &llist[2]);
> -                    }
>                    tree ivar2 = ivar;
>                    tree ref2 = ref;
> +                  tree zero = NULL_TREE;
>                    if (is_fp_and_or)
>                      {
> -                      tree zero = build_zero_cst (TREE_TYPE (ivar));
> +                      zero = build_zero_cst (TREE_TYPE (ivar));
>                        ivar2 = fold_build2_loc (clause_loc, NE_EXPR,
>                                                 integer_type_node, ivar,
>                                                 zero);
>                        ref2 = fold_build2_loc (clause_loc, NE_EXPR,
>                                                integer_type_node, ref, zero);
>                      }
> -                  x = build2 (code, TREE_TYPE (ref), ref2, ivar2);
> +                  if (sctx.is_simt)
> +                    {
> +                      if (!simt_lane)
> +                        simt_lane = create_tmp_var (unsigned_type_node);
> +                      x = build_call_expr_internal_loc
> +                        (UNKNOWN_LOCATION, IFN_GOMP_SIMT_XCHG_BFLY,
> +                         TREE_TYPE (ivar), 2, ivar, simt_lane);
> +                      if (is_fp_and_or)
> +                        x = fold_build2_loc (clause_loc, NE_EXPR,
> +                                             integer_type_node, x, zero);
> +                      x = build2 (code, TREE_TYPE (ivar2), ivar2, x);
> +                      if (is_fp_and_or)
> +                        x = fold_convert (TREE_TYPE (ivar), x);
> +                      gimplify_assign (ivar, x, &llist[2]);
> +                    }
> +                  x = build2 (code, TREE_TYPE (ref2), ref2, ivar2);
>                    if (is_fp_and_or)
>                      x = fold_convert (TREE_TYPE (ref), x);
>                    ref = build_outer_var_ref (var, ctx);

Is this hunk still needed when the first hunk is in?
I mean, this is in code guarded with
is_simd && lower_rec_simd_input_clauses (...)
and that function will return false for
  if (known_eq (sctx->max_vf, 1U))
which the first hunk ensures.
So sctx.is_simt && is_fp_and_or shouldn't be true in that code.

	Jakub
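
The reachability argument above can be sketched as a toy C model. This is
not GCC code: struct simd_ctx and both model_* functions are hypothetical
stand-ins, the max_vf override is applied unconditionally, and every
non-integral type is treated as float/complex. Under those assumptions it
shows why the guarded region would never see sctx.is_simt together with
is_fp_and_or once the first hunk is applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SIMD lowering context; only the two
   fields the argument depends on.  */
struct simd_ctx
{
  unsigned max_vf;
  bool is_simt;
};

/* Models lower_rec_simd_input_clauses: the first hunk forces max_vf to 1
   for a truth-value reduction on a non-integral type, and the function
   reports failure whenever max_vf is 1.  */
bool
model_lower_rec_simd_input_clauses (struct simd_ctx *sctx,
                                    bool truth_value_code,
                                    bool integral_type)
{
  if (truth_value_code && !integral_type)
    sctx->max_vf = 1;
  if (sctx->max_vf == 1)
    return false;
  return true;
}

/* Models the guarded region in lower_rec_input_clauses.  Returns true
   only if the SIMT exchange code would run for a non-integral && / ||
   reduction.  The context is passed by value, so the caller's copy is
   left unchanged.  */
bool
model_reaches_simt_fp_and_or (bool is_simd, struct simd_ctx sctx,
                              bool truth_value_code, bool integral_type)
{
  /* Simplification: treat every non-integral type as float/complex.  */
  bool is_fp_and_or = truth_value_code && !integral_type;

  if (is_simd
      && model_lower_rec_simd_input_clauses (&sctx, truth_value_code,
                                             integral_type))
    {
      /* The claim under discussion: this combination cannot occur.  */
      assert (!(sctx.is_simt && is_fp_and_or));
      return sctx.is_simt && is_fp_and_or;
    }
  return false;
}
```

In this model, calling model_reaches_simt_fp_and_or with a SIMT context and
a non-integral truth-value reduction returns false, because the helper has
already forced max_vf to 1 and reported failure. Whether the real code has
other paths into that region is not something the sketch can settle.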