On Tue, Aug 31, 2021 at 12:18 PM Hongtao Liu <crazy...@gmail.com> wrote: > > On Mon, Aug 30, 2021 at 8:25 PM Richard Biener via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Fri, Aug 27, 2021 at 8:53 AM liuhongt <hongtao....@intel.com> wrote: > > > > > > When gimple simplifcation try to combine op and vec_cond_expr to > > > cond_op, > > > it doesn't check if mask type matches. It causes an ICE when expand > > > cond_op > > > with mismatched mode. > > > This patch add a function named cond_vectorized_internal_fn_supported_p > > > to additionally check mask type than vectorized_internal_fn_supported_p. > > > > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > > Ok for trunk? > > > > > > gcc/ChangeLog: > > > > > > PR middle-end/102080 > > > * internal-fn.c (cond_vectorized_internal_fn_supported_p): New > > > functions. > > > * internal-fn.h (cond_vectorized_internal_fn_supported_p): New > > > declaration. > > > * match.pd: Check the type of mask while generating cond_op in > > > gimple simplication. > > > > > > gcc/testsuite/ChangeLog: > > > > > > PR middle-end/102080 > > > * gcc.target/i386/pr102080.c: New test. > > > --- > > > gcc/internal-fn.c | 22 ++++++++++++++++++++++ > > > gcc/internal-fn.h | 1 + > > > gcc/match.pd | 24 ++++++++++++++++-------- > > > gcc/testsuite/gcc.target/i386/pr102080.c | 16 ++++++++++++++++ > > > 4 files changed, 55 insertions(+), 8 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr102080.c > > > > > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > > > index 1360a00f0b9..8b2b65db1a7 100644 > > > --- a/gcc/internal-fn.c > > > +++ b/gcc/internal-fn.c > > > @@ -4102,6 +4102,28 @@ expand_internal_call (gcall *stmt) > > > expand_internal_call (gimple_call_internal_fn (stmt), stmt); > > > } > > > > > > +/* Check cond_op for vector modes since > > > vectorized_internal_fn_supported_p > > > + doesn't check if mask type matches. */ > > > +bool > > > +cond_vectorized_internal_fn_supported_p (internal_fn ifn, tree type, > > > + tree mask_type) > > > +{ > > > + if (!vectorized_internal_fn_supported_p (ifn, type)) > > > + return false; > > > + > > > + machine_mode mask_mode; > > > + machine_mode vmode = TYPE_MODE (type); > > > + int size1, size2; > > > + if (VECTOR_MODE_P (vmode) > > > + && targetm.vectorize.get_mask_mode (vmode).exists(&mask_mode) > > > + && GET_MODE_SIZE (mask_mode).is_constant (&size1) > > > + && GET_MODE_SIZE (TYPE_MODE (mask_type)).is_constant (&size2) > > > + && size1 != size2) > > > > Why do we check for equal size rather than just mode equality which > I originally thought TYPE_MODE of vector(8) <signed-boolean:1> was > not QImode, Changed the patch to check mode equality. > Update patch.
Looking at all this it seems the match.pd patterns should have not used vectorized_internal_fn_supported_p but direct_internal_fn_supported_p which is equivalent here because we're always working with vector modes? And then shouldn't we look at the actual optab whether the mask mode matches the expectation rather than going around via the target hook which may not have enough context to decide which mask mode to use? In any case if the approach of the patch is correct shouldn't it do if (VECTOR_MODE_P (vmode) && (!targetm.vectorize.get_mask_mode (vmode).exists(&mask_mode) || mask_mode != TYPE_MODE (mask_type))) return false; that is, not return true if there's no mask mode for the data mode? Given the first observation should we call the function direct_cond_internal_fn_supported_p () instead and as to the second observation, look at the optab operands mode? Richard. > > I think would work for non-constant sized modes as well? And when > > using sizes you'd instead use maybe_ne (GET_MODE_SIZE (mask_mode), > > GET_MODE_SIZE (TYPE_MODE (mask_type))) > > > > Thanks, > > Richard. > > > > > + return false; > > > + > > > + return true; > > > +} > > > + > > > /* If TYPE is a vector type, return true if IFN is a direct internal > > > function that is supported for that type. If TYPE is a scalar type, > > > return true if IFN is a direct internal function that is supported for > > > diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h > > > index 19d0f849a5a..f0aea00103c 100644 > > > --- a/gcc/internal-fn.h > > > +++ b/gcc/internal-fn.h > > > @@ -236,5 +236,6 @@ extern void expand_PHI (internal_fn, gcall *); > > > extern void expand_SHUFFLEVECTOR (internal_fn, gcall *); > > > > > > extern bool vectorized_internal_fn_supported_p (internal_fn, tree); > > > +extern bool cond_vectorized_internal_fn_supported_p (internal_fn, tree, > > > tree); > > > > > > #endif > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > index e5bbb123a6a..72b1bc674db 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -6987,14 +6987,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > cond_op (COND_BINARY) > > > (simplify > > > (vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3) > > > - (with { tree op_type = TREE_TYPE (@4); } > > > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > > > op_type) > > > + (with { tree op_type = TREE_TYPE (@4); > > > + tree mask_type = TREE_TYPE (@0); } > > > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn > > > (cond_op), > > > + op_type, mask_type) > > > && element_precision (type) == element_precision (op_type)) > > > (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3)))))) > > > (simplify > > > (vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3))) > > > - (with { tree op_type = TREE_TYPE (@4); } > > > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > > > op_type) > > > + (with { tree op_type = TREE_TYPE (@4); > > > + tree mask_type = TREE_TYPE (@0); } > > > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn > > > (cond_op), > > > + op_type, mask_type) > > > && element_precision (type) == element_precision (op_type)) > > > (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type > > > @1))))))) > > > > > > @@ -7003,14 +7007,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > cond_op (COND_TERNARY) > > > (simplify > > > (vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4) > > > - (with { tree op_type = TREE_TYPE (@5); } > > > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > > > op_type) > > > + (with { tree op_type = TREE_TYPE (@5); > > > + tree mask_type = TREE_TYPE (@0); } > > > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn > > > (cond_op), > > > + op_type, mask_type) > > > && element_precision (type) == element_precision (op_type)) > > > (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4)))))) > > > (simplify > > > (vec_cond @0 @1 (view_convert? (uncond_op@5 @2 @3 @4))) > > > - (with { tree op_type = TREE_TYPE (@5); } > > > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > > > op_type) > > > + (with { tree op_type = TREE_TYPE (@5); > > > + tree mask_type = TREE_TYPE (@0); } > > > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn > > > (cond_op), > > > + op_type, mask_type) > > > && element_precision (type) == element_precision (op_type)) > > > (view_convert (cond_op (bit_not @0) @2 @3 @4 > > > (view_convert:op_type @1))))))) > > > diff --git a/gcc/testsuite/gcc.target/i386/pr102080.c > > > b/gcc/testsuite/gcc.target/i386/pr102080.c > > > new file mode 100644 > > > index 00000000000..6a40a75e1c5 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.target/i386/pr102080.c > > > @@ -0,0 +1,16 @@ > > > +#include<immintrin.h> > > > +typedef float __m256 __attribute__((__vector_size__(32))); > > > +__m256 _mm256_blendv_ps___Y, _mm256_blendv_ps___M, _mm256_mul_ps___A, > > > + _mm256_mul_ps___B, IfThenElse___trans_tmp_9; > > > + > > > +void > > > +__attribute__ ((target("avx"))) > > > +IfThenElse (__m256 no) { > > > + IfThenElse___trans_tmp_9 = _mm256_blendv_ps (no, _mm256_blendv_ps___Y, > > > _mm256_blendv_ps___M); > > > +} > > > +void > > > +__attribute__ ((target("avx512vl"))) > > > +EncodedFromDisplay() { > > > + __m256 __trans_tmp_11 = _mm256_mul_ps___A * _mm256_mul_ps___B; > > > + IfThenElse(__trans_tmp_11); > > > +} > > > -- > > > 2.18.1 > > > > > > > -- > BR, > Hongtao