> The new optabs need to be documented in doc/md.texi.
Done.
> “Long” is a bit of an architecture-specific term. Maybe just:
>
> Try to find the following ABsolute Difference (ABD) or
> widening ABD (WIDEN_ABD) pattern:
Change made.
> >> - VTYPE x, y, out;
> >> + VTYPE x, y;
> >> + WTYPE out;
> >> type diff;
> >> loop i in range:
> >> S1 diff = x[i] - y[i]
> >> S2 out[i] = ABS_EXPR <diff>;
> >>
> >> - where 'type' is a integer and 'VTYPE' is a vector of integers
> >> - the same size as 'type'
> >> + where 'VTYPE' and 'WTYPE' are vectors of integers.
> >> + 'WTYPE' may be wider than 'VTYPE'.
> >> + 'type' is as wide as 'WTYPE'.
> >
> > I don't think the existing comment is right about the types. What we're
> > matching is scalar code, so VTYPE and (now) WTYPE are integers rather
> > than vectors of integers.
>
> Gah, sorry, I realise now that the point was that VTYPE and WTYPE
> are sequences rather than scalars. But patterns are used for SLP
> as well as loops, and the inputs and outputs might not be memory
> objects. So:
>
> > I think it would be clearer to write:
> >
> > S1 diff = (type) x[i] - (type) y[i]
> > S2 out[i] = ABS_EXPR <(WTYPE) diff>;
> >
> > since the promotions happen on the operands.
> >
> > It'd be good to keep the part about 'type' being an integer.
> >
> > Rather than:
> >
> > 'WTYPE' may be wider than 'VTYPE'.
> > 'type' is as wide as 'WTYPE'.
> >
> > maybe:
> >
> > 'type' is no narrower than 'VTYPE' (but may be wider)
> > 'WTYPE' is no narrower than 'type' (but may be wider)
>
> ...how about:
>
> TYPE1 x;
> TYPE2 y;
> TYPE3 x_cast = (TYPE3) x; // widening or no-op
> TYPE3 y_cast = (TYPE3) y; // widening or no-op
> TYPE3 diff = x_cast - y_cast;
> TYPE4 diff_cast = (TYPE4) diff; // widening or no-op
> TYPE5 abs = ABS(U)_EXPR <diff_cast>;
>
> (based on the comment above vect_recog_widen_op_pattern).
Done.
> WTYPE can't be narrower than VTYPE though. I think with the changes
> suggested above, the text before this block describes the conditions
> in enough detail, and so we can just say:
>
> WIDEN_ABD exists to optimize the case where WTYPE is at least twice as
> wide as VTYPE.
Change made.
> SABD_EXPR/UABD_EXPR should be IFN_ABD
> SABDL_EXPR/UABDL_EXPR should be IFN_WIDEN_ABD
Change made.
> Maybe it would be easier to remove this comment, since I think the
> comment above the function says enough.
Done.
> Rather than have the "extend" variable, how about:
>
> >
> > - vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
> > + tree vectype_in = get_vectype_for_scalar_type (vinfo, abd_in_type);
> > + tree vectype_out = get_vectype_for_scalar_type (vinfo, abd_out_type);
> > + if (!vectype_in || !vectype_out)
> > + return NULL;
> >
> > - if (!vectype
> > - || !direct_internal_fn_supported_p (IFN_ABD, vectype,
> > + if (ifn == IFN_VEC_WIDEN_ABD)
> > + {
> > + code_helper dummy_code;
> > + int dummy_int;
> > + auto_vec<tree> dummy_vec;
> > + if (!supportable_widening_operation (vinfo, ifn, stmt_vinfo,
> > + vectype_out, vectype_in,
> > + &dummy_code, &dummy_code,
> > + &dummy_int, &dummy_vec))
> > + {
> > + /* There are architectures that have the ABD instructions
> > + but not the ABDL instructions. If we just return NULL here
> > + we will miss an occasion where we should have used ABD.
> > + So we change back to ABD and try again. */
> > + ifn = IFN_ABD;
> > + abd_out_type = abd_in_type;
> > + extend = true;
> > + }
> > + }
>
> making this:
>
> if (TYPE_PRECISION (out_type) >= TYPE_PRECISION (abd_in_type) * 2)
> {
> tree mid_type
> = build_nonstandard_integer_type (TYPE_PRECISION (abd_in_type) * 2,
> TYPE_UNSIGNED (abd_in_type));
> tree mid_vectype = get_vectype_for_scalar_type (vinfo, mid_vectype);
> code_helper dummy_code;
> int dummy_int;
> auto_vec<tree> dummy_vec;
> if (mid_vectype
> && supportable_widening_operation (vinfo, IFN_WIDEN_ABD, stmt_vinfo,
> mid_vectype, vectype_in,
> &dummy_code, &dummy_code,
> &dummy_int, &dummy_vec))
> {
> ifn = IFN_WIDEN_ABD;
> abd_out_type = mid_type;
> vectype_out = mid_vectype;
> }
> }
>
> The idea is to keep the assumption that we're using IFN_ABD
> until we've proven conclusively otherwise.
>
> I think the later:
>
> if (!extend)
> return abd_stmt;
>
> should then be deleted, since we should still use vect_convert_output
> if abd_out_type has a different sign from out_type.
>
> .....
>
> And this condition would then be:
>
> if (TYPE_PRECISION (abd_out_type) == TYPE_PRECISION (abd_in_type)
> && TYPE_PRECISION (abd_out_type) < TYPE_PRECISION (out_type)
> && !TYPE_UNSIGNED (abd_out_type))
>
> since:
>
> (a) we don't need to force the type to unsigned if the final
> vect_convert_output is just a sign change
>
> (b) we don't need to force the type to unsigned if abd_out_type
> is large enough to hold the resu
Done.
> I think it'd be more natural to test this on TREE_TYPE (last_rhs)
> rather than TREE_TYPE (abd_oprnd0), and do it after the assignment
> to last_rhs above.
Done.
> > + tree abdl_result = vect_recog_temp_ssa_var (out_type, NULL);
> > + gcall *abdl_stmt = gimple_build_call_internal (IFN_VEC_WIDEN_ABD, 2,
> > + abd_oprnd0, abd_oprnd1);
> > + gimple_call_set_lhs (abdl_stmt, abdl_result);
> > + gimple_set_location (abdl_stmt, gimple_location (last_stmt));
> > + return abdl_stmt;
>
> I think “widen_abd” would be better than “abdl” here
Done.