> The new optabs need to be documented in doc/md.texi.

Done.

> “Long” is a bit of an architecture-specific term.  Maybe just:
> 
>    Try to find the following ABsolute Difference (ABD) or
>    widening ABD (WIDEN_ABD) pattern:

Change made.

> >> -     VTYPE x, y, out;
> >> +     VTYPE x, y;
> >> +     WTYPE out;
> >>       type diff;
> >>     loop i in range:
> >>       S1 diff = x[i] - y[i]
> >>       S2 out[i] = ABS_EXPR <diff>;
> >>  
> >> -   where 'type' is a integer and 'VTYPE' is a vector of integers
> >> -   the same size as 'type'
> >> +   where 'VTYPE' and 'WTYPE' are vectors of integers.
> >> +   'WTYPE' may be wider than 'VTYPE'.
> >> +   'type' is as wide as 'WTYPE'.
> >
> > I don't think the existing comment is right about the types.  What we're
> > matching is scalar code, so VTYPE and (now) WTYPE are integers rather
> > than vectors of integers.
> 
> Gah, sorry, I realise now that the point was that VTYPE and WTYPE
> are sequences rather than scalars.  But patterns are used for SLP
> as well as loops, and the inputs and outputs might not be memory
> objects.  So:
> 
> > I think it would be clearer to write:
> >
> >        S1 diff = (type) x[i] - (type) y[i]
> >        S2 out[i] = ABS_EXPR <(WTYPE) diff>;
> >
> > since the promotions happen on the operands.
> >
> > It'd be good to keep the part about 'type' being an integer.
> >
> > Rather than:
> >
> >      'WTYPE' may be wider than 'VTYPE'.
> >      'type' is as wide as 'WTYPE'.
> >
> > maybe:
> >
> >      'type' is no narrower than 'VTYPE' (but may be wider)
> >      'WTYPE' is no narrower than 'type' (but may be wider)
> 
> ...how about:
> 
>   TYPE1 x;
>   TYPE2 y;
>   TYPE3 x_cast = (TYPE3) x;              // widening or no-op
>   TYPE3 y_cast = (TYPE3) y;              // widening or no-op
>   TYPE3 diff = x_cast - y_cast;
>   TYPE4 diff_cast = (TYPE4) diff;        // widening or no-op
>   TYPE5 abs = ABS(U)_EXPR <diff_cast>;
> 
> (based on the comment above vect_recog_widen_op_pattern).

Done.

> WTYPE can't be narrower than VTYPE though.  I think with the changes
> suggested above, the text before this block describes the conditions
> in enough detail, and so we can just say:
> 
>    WIDEN_ABD exists to optimize the case where WTYPE is at least twice as
>    wide as VTYPE.

Change made.

> SABD_EXPR/UABD_EXPR should be IFN_ABD
> SABDL_EXPR/UABDL_EXPR should be IFN_WIDEN_ABD

Change made.

> Maybe it would be easier to remove this comment, since I think the
> comment above the function says enough.

Done.

> Rather than have the "extend" variable, how about:
> 
> >  
> > -  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
> > +  tree vectype_in = get_vectype_for_scalar_type (vinfo, abd_in_type);
> > +  tree vectype_out = get_vectype_for_scalar_type (vinfo, abd_out_type);
> > +  if (!vectype_in || !vectype_out)
> > +    return NULL;
> >  
> > -  if (!vectype
> > -      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
> > +  if (ifn == IFN_VEC_WIDEN_ABD)
> > +    {
> > +      code_helper dummy_code;
> > +      int dummy_int;
> > +      auto_vec<tree> dummy_vec;
> > +      if (!supportable_widening_operation (vinfo, ifn, stmt_vinfo,
> > +                                      vectype_out, vectype_in,
> > +                                      &dummy_code, &dummy_code,
> > +                                      &dummy_int, &dummy_vec))
> > +   {
> > +     /* There are architectures that have the ABD instructions
> > +        but not the ABDL instructions.  If we just return NULL here
> > +        we will miss an occasion where we should have used ABD.
> > +        So we change back to ABD and try again.  */
> > +     ifn = IFN_ABD;
> > +     abd_out_type = abd_in_type;
> > +     extend = true;
> > +   }
> > +    }
> 
> making this:
> 
>   if (TYPE_PRECISION (out_type) >= TYPE_PRECISION (abd_in_type) * 2)
>     {
>       tree mid_type
>       = build_nonstandard_integer_type (TYPE_PRECISION (abd_in_type) * 2,
>                                         TYPE_UNSIGNED (abd_in_type));
>       tree mid_vectype = get_vectype_for_scalar_type (vinfo, mid_vectype);
>       code_helper dummy_code;
>       int dummy_int;
>       auto_vec<tree> dummy_vec;
>       if (mid_vectype
>         && supportable_widening_operation (vinfo, IFN_WIDEN_ABD, stmt_vinfo,
>                                            mid_vectype, vectype_in,
>                                            &dummy_code, &dummy_code,
>                                            &dummy_int, &dummy_vec))
>       {
>         ifn = IFN_WIDEN_ABD;
>         abd_out_type = mid_type;
>         vectype_out = mid_vectype;
>       }
>     }
> 
> The idea is to keep the assumption that we're using IFN_ABD
> until we've proven conclusively otherwise.
> 
> I think the later:
> 
>   if (!extend)
>     return abd_stmt;
> 
> should then be deleted, since we should still use vect_convert_output
> if abd_out_type has a different sign from out_type.
> 
> .....
> 
> And this condition would then be:
> 
>   if (TYPE_PRECISION (abd_out_type) == TYPE_PRECISION (abd_in_type)
>       && TYPE_PRECISION (abd_out_type) < TYPE_PRECISION (out_type)
>       && !TYPE_UNSIGNED (abd_out_type))
> 
> since:
> 
> (a) we don't need to force the type to unsigned if the final
>     vect_convert_output is just a sign change
> 
> (b) we don't need to force the type to unsigned if abd_out_type
>     is large enough to hold the resu

Done.

> I think it'd be more natural to test this on TREE_TYPE (last_rhs)
> rather than TREE_TYPE (abd_oprnd0), and do it after the assignment
> to last_rhs above.

Done.

> > +  tree abdl_result = vect_recog_temp_ssa_var (out_type, NULL);
> > +  gcall *abdl_stmt = gimple_build_call_internal (IFN_VEC_WIDEN_ABD, 2,
> > +                                            abd_oprnd0, abd_oprnd1);
> > +  gimple_call_set_lhs (abdl_stmt, abdl_result);
> > +  gimple_set_location (abdl_stmt, gimple_location (last_stmt));
> > +  return abdl_stmt;
> 
> I think “widen_abd” would be better than “abdl” here

Done.

Reply via email to