On Thu, 29 Feb 2024, Richard Biener wrote: > The following amends the PR114070 fix to optimistically allow > the folding when we cannot expand the current vec_cond using > vcond_mask and we're still before vector lowering. This leaves > a small window between vectorization and lowering where we could > break vec_conds that can be expanded via vcond{,u,eq}, most > susceptible is the loop unrolling pass which applies VN and thus > possibly folding to the unrolled body of a vectorized loop. > > This gets back the folding for targets that cannot do vectorization. > It doesn't get back the folding for x86 with AVX512 for example > since that can handle the original IL but not the folded since > it misses some vcond_mask expanders. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > As said for stage1 I want to move vector lowering before vectorization. > While I'm not entirely happy with this patch it forces us into the > correct direction, getting vcond_mask and vcmp{,u,eq} patterns > implemented. We could use canonicalize_math_p () to close the > vectorizer -> vector lowering gap but this only works when that > pass is run (not with -Og or when disabled). We could add a new > PROP_vectorizer_il and disable the folding if the vectorizer ran. > > Or we could simply live with the regression. > > Any preferences?
I've tried moving vector lowering, first try to after the first forwprop after IPA. That exposes (at least) invariant motion creating unsupported COND_EXPRs - we hoist a vector PHI as _2 ? _3 : _6 and that might lead to unsupported BLKmode moves. I think there's some latent issues to be fixed in passes. A more conservative move is to duplicate vector lowering into the loop/non-loop sections and put it right before vectorization (but there's invariant motion after it, so the above issue will prevail). Since the vectorizer currently cannot handle existing vector code "re-vectorization" is best done on lowered code (after that got some cleanup). Also SLP can interface with existing vectors, but currently only non-BLKmode ones, so that would benefit as well. That said, the quick experiment shows this isn't anything for stage4. Richard. > Thanks, > Richard. > > PR middle-end/114070 > * match.pd ((c ? a : b) op d --> c ? (a op d) : (b op d)): > Allow the folding if before lowering and the current IL > isn't supported with vcond_mask. > --- > gcc/match.pd | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/gcc/match.pd b/gcc/match.pd > index f3fffd8dec2..4edba7c84fb 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4)) > (if (TREE_CODE_CLASS (op) != tcc_comparison > || types_match (type, TREE_TYPE (@1)) > - || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)) > + || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK) > + || (optimize_vectors_before_lowering_p () > + /* The following is optimistic on the side of non-support, we are > + missing the legacy vcond{,u,eq} cases. Do this only when > + lowering will be able to fixup.. */ > + && !expand_vec_cond_expr_p (TREE_TYPE (@1), > + TREE_TYPE (@0), ERROR_MARK))) > (vec_cond @0 (op! @1 @3) (op! @2 @4)))) > > /* (c ? a : b) op d --> c ? (a op d) : (b op d) */ > @@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (op (vec_cond:s @0 @1 @2) @3) > (if (TREE_CODE_CLASS (op) != tcc_comparison > || types_match (type, TREE_TYPE (@1)) > - || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)) > + || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK) > + || (optimize_vectors_before_lowering_p () > + && !expand_vec_cond_expr_p (TREE_TYPE (@1), > + TREE_TYPE (@0), ERROR_MARK))) > (vec_cond @0 (op! @1 @3) (op! @2 @3)))) > (simplify > (op @3 (vec_cond:s @0 @1 @2)) > (if (TREE_CODE_CLASS (op) != tcc_comparison > || types_match (type, TREE_TYPE (@1)) > - || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)) > + || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK) > + || (optimize_vectors_before_lowering_p () > + && !expand_vec_cond_expr_p (TREE_TYPE (@1), > + TREE_TYPE (@0), ERROR_MARK))) > (vec_cond @0 (op! @3 @1) (op! @3 @2))))) > > #if GIMPLE > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)