On Wed, Sep 1, 2021 at 2:52 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Wed, Sep 1, 2021 at 8:28 AM Hongtao Liu <crazy...@gmail.com> wrote:
> >>
> >> On Tue, Aug 31, 2021 at 7:56 PM Richard Biener
> >> <richard.guent...@gmail.com> wrote:
> >> >
> >> > On Tue, Aug 31, 2021 at 12:18 PM Hongtao Liu <crazy...@gmail.com> wrote:
> >> > >
> >> > > On Mon, Aug 30, 2021 at 8:25 PM Richard Biener via Gcc-patches
> >> > > <gcc-patches@gcc.gnu.org> wrote:
> >> > > >
> >> > > > On Fri, Aug 27, 2021 at 8:53 AM liuhongt <hongtao....@intel.com> 
> >> > > > wrote:
> >> > > > >
> >> > > > >   When gimple simplifcation try to combine op and vec_cond_expr to 
> >> > > > > cond_op,
> >> > > > > it doesn't check if mask type matches. It causes an ICE when 
> >> > > > > expand cond_op
> >> > > > > with mismatched mode.
> >> > > > >   This patch add a function named 
> >> > > > > cond_vectorized_internal_fn_supported_p
> >> > > > >  to additionally check mask type than 
> >> > > > > vectorized_internal_fn_supported_p.
> >> > > > >
> >> > > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >> > > > >   Ok for trunk?
> >> > > > >
> >> > > > > gcc/ChangeLog:
> >> > > > >
> >> > > > >         PR middle-end/102080
> >> > > > >         * internal-fn.c (cond_vectorized_internal_fn_supported_p): 
> >> > > > > New functions.
> >> > > > >         * internal-fn.h (cond_vectorized_internal_fn_supported_p): 
> >> > > > > New declaration.
> >> > > > >         * match.pd: Check the type of mask while generating 
> >> > > > > cond_op in
> >> > > > >         gimple simplication.
> >> > > > >
> >> > > > > gcc/testsuite/ChangeLog:
> >> > > > >
> >> > > > >         PR middle-end/102080
> >> > > > >         * gcc.target/i386/pr102080.c: New test.
> >> > > > > ---
> >> > > > >  gcc/internal-fn.c                        | 22 
> >> > > > > ++++++++++++++++++++++
> >> > > > >  gcc/internal-fn.h                        |  1 +
> >> > > > >  gcc/match.pd                             | 24 
> >> > > > > ++++++++++++++++--------
> >> > > > >  gcc/testsuite/gcc.target/i386/pr102080.c | 16 ++++++++++++++++
> >> > > > >  4 files changed, 55 insertions(+), 8 deletions(-)
> >> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr102080.c
> >> > > > >
> >> > > > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> >> > > > > index 1360a00f0b9..8b2b65db1a7 100644
> >> > > > > --- a/gcc/internal-fn.c
> >> > > > > +++ b/gcc/internal-fn.c
> >> > > > > @@ -4102,6 +4102,28 @@ expand_internal_call (gcall *stmt)
> >> > > > >    expand_internal_call (gimple_call_internal_fn (stmt), stmt);
> >> > > > >  }
> >> > > > >
> >> > > > > +/* Check cond_op for vector modes since 
> >> > > > > vectorized_internal_fn_supported_p
> >> > > > > +   doesn't check if mask type matches.  */
> >> > > > > +bool
> >> > > > > +cond_vectorized_internal_fn_supported_p (internal_fn ifn, tree 
> >> > > > > type,
> >> > > > > +                                        tree mask_type)
> >> > > > > +{
> >> > > > > +  if (!vectorized_internal_fn_supported_p (ifn, type))
> >> > > > > +    return false;
> >> > > > > +
> >> > > > > +  machine_mode mask_mode;
> >> > > > > +  machine_mode vmode = TYPE_MODE (type);
> >> > > > > +  int size1, size2;
> >> > > > > +  if (VECTOR_MODE_P (vmode)
> >> > > > > +      && targetm.vectorize.get_mask_mode 
> >> > > > > (vmode).exists(&mask_mode)
> >> > > > > +      && GET_MODE_SIZE (mask_mode).is_constant (&size1)
> >> > > > > +      && GET_MODE_SIZE (TYPE_MODE (mask_type)).is_constant 
> >> > > > > (&size2)
> >> > > > > +      && size1 != size2)
> >> > > >
> >> > > > Why do we check for equal size rather than just mode equality which
> >> > > I originally thought  TYPE_MODE of vector(8) <signed-boolean:1> was
> >> > > not QImode, Changed the patch to check mode equality.
> >> > > Update patch.
> >> >
> >> > Looking at all this it seems the match.pd patterns should have not
> >> > used vectorized_internal_fn_supported_p but 
> >> > direct_internal_fn_supported_p
> >> > which is equivalent here because we're always working with vector modes?
>
> Yeah, looks like it.
>
> >> > And then shouldn't we look at the actual optab whether the mask mode 
> >> > matches
> >> > the expectation rather than going around via the target hook which may 
> >> > not have
> >> > enough context to decide which mask mode to use?
> >> How about this?
> >>
> >> +/* Return true if target supports cond_op with data TYPE and
> >> +   mask MASK_TYPE.  */
> >> +bool
> >> +cond_internal_fn_supported_p (internal_fn ifn, tree type,
> >> +       tree mask_type)
> >> +{
> >> +  tree_pair types = tree_pair (type, type);
> >> +  optab tmp = direct_internal_fn_optab (ifn, types);
> >> +  machine_mode vmode = TYPE_MODE (type);
> >> +  insn_code icode = direct_optab_handler (tmp, vmode);
> >> +  if (icode == CODE_FOR_nothing)
> >> +    return false;
> >> +
> >> +  machine_mode mask_mode = TYPE_MODE (mask_type);
> >> +  /* Can't create rtx and use insn_operand_matches here.  */
> >> +  return insn_data[icode].operand[0].mode == vmode
> >> +    && insn_data[icode].operand[1].mode == mask_mode;
> >> +}
> >> +
> >
> > Yeah, sth like that, though the operand[0].mode test should be
> > redudnant.  I think we should assert or have a whiltelist
> > for the internal function we support to be queried this way.
> > Not sure if we can directly access the 'cond_binary/cond_ternary'
> > classification used in internal-fn.def, that would be best.
> >
> > Richard, what are your thoughts about all this?
>
> IMO using get_mask_mode was right.  The optab documentation says:
>
>   Operands 0, 2, 3 and 4 all have mode @var{m}.  Operand 1 is a scalar
>   integer if @var{m} is scalar, otherwise it has the mode returned by
>   @code{TARGET_VECTORIZE_GET_MASK_MODE}.
>
> Allowing targets to use optabs to enforce different mask modes for
> different operations would open up a mess of combinations.

Meh!  And I thought this was the way out of my AVX512 vs AVX2 mask
mess with mask_gather_load ...

> In other words, I think cond_vectorized_internal_fn_supported_p
> is really testing two things:
>
> (1) is the mask type/vector type combination well-formed?

But can we always determine this without looking at the actual
operation?  That is, do we force targets to be consistent here
when writing their machine description and supporting more
than one mask variant?

> (2) is the internal function supported for the vector type?
>
> where (1) is a gimple question and (2) is a target question.
>
> I guess there's an argument that (1) should be part of the match.pd
> condition instead, alongside the element_precision check.  That would
> add to the cut-&-paste though. :-(
>
> Alternatively, I guess we would add:
>
>   bool is_truth_type_for (tree type, tree truth_type);
>
> to return true if truth_type is equal to truth_type_for (type)
> (but without having to call truth_type_for).  We could then use:
>
>   is_truth_type_for (op_type, TREE_TYPE (@0))
>
> instead of:
>
>   element_precision (type) == element_precision (op_type)
>
> since it should be a strictly stronger condition.

I guess that would work, yes.  Won't exactly help me with
mask_gather_load but for the cond_* IFNs it's certainly good.

Thanks,
Richard.

>
> Thanks,
> Richard

Reply via email to