> The new optabs need to be documented in doc/md.texi. Done.
> “Long” is a bit of an architecture-specific term. Maybe just: > > Try to find the following ABsolute Difference (ABD) or > widening ABD (WIDEN_ABD) pattern: Change made. > >> - VTYPE x, y, out; > >> + VTYPE x, y; > >> + WTYPE out; > >> type diff; > >> loop i in range: > >> S1 diff = x[i] - y[i] > >> S2 out[i] = ABS_EXPR <diff>; > >> > >> - where 'type' is a integer and 'VTYPE' is a vector of integers > >> - the same size as 'type' > >> + where 'VTYPE' and 'WTYPE' are vectors of integers. > >> + 'WTYPE' may be wider than 'VTYPE'. > >> + 'type' is as wide as 'WTYPE'. > > > > I don't think the existing comment is right about the types. What we're > > matching is scalar code, so VTYPE and (now) WTYPE are integers rather > > than vectors of integers. > > Gah, sorry, I realise now that the point was that VTYPE and WTYPE > are sequences rather than scalars. But patterns are used for SLP > as well as loops, and the inputs and outputs might not be memory > objects. So: > > > I think it would be clearer to write: > > > > S1 diff = (type) x[i] - (type) y[i] > > S2 out[i] = ABS_EXPR <(WTYPE) diff>; > > > > since the promotions happen on the operands. > > > > It'd be good to keep the part about 'type' being an integer. > > > > Rather than: > > > > 'WTYPE' may be wider than 'VTYPE'. > > 'type' is as wide as 'WTYPE'. > > > > maybe: > > > > 'type' is no narrower than 'VTYPE' (but may be wider) > > 'WTYPE' is no narrower than 'type' (but may be wider) > > ...how about: > > TYPE1 x; > TYPE2 y; > TYPE3 x_cast = (TYPE3) x; // widening or no-op > TYPE3 y_cast = (TYPE3) y; // widening or no-op > TYPE3 diff = x_cast - y_cast; > TYPE4 diff_cast = (TYPE4) diff; // widening or no-op > TYPE5 abs = ABS(U)_EXPR <diff_cast>; > > (based on the comment above vect_recog_widen_op_pattern). Done. > WTYPE can't be narrower than VTYPE though. I think with the changes > suggested above, the text before this block describes the conditions > in enough detail, and so we can just say: > > WIDEN_ABD exists to optimize the case where WTYPE is at least twice as > wide as VTYPE. Change made. > SABD_EXPR/UABD_EXPR should be IFN_ABD > SABDL_EXPR/UABDL_EXPR should be IFN_WIDEN_ABD Change made. > Maybe it would be easier to remove this comment, since I think the > comment above the function says enough. Done. > Rather than have the "extend" variable, how about: > > > > > - vect_pattern_detected ("vect_recog_abd_pattern", last_stmt); > > + tree vectype_in = get_vectype_for_scalar_type (vinfo, abd_in_type); > > + tree vectype_out = get_vectype_for_scalar_type (vinfo, abd_out_type); > > + if (!vectype_in || !vectype_out) > > + return NULL; > > > > - if (!vectype > > - || !direct_internal_fn_supported_p (IFN_ABD, vectype, > > + if (ifn == IFN_VEC_WIDEN_ABD) > > + { > > + code_helper dummy_code; > > + int dummy_int; > > + auto_vec<tree> dummy_vec; > > + if (!supportable_widening_operation (vinfo, ifn, stmt_vinfo, > > + vectype_out, vectype_in, > > + &dummy_code, &dummy_code, > > + &dummy_int, &dummy_vec)) > > + { > > + /* There are architectures that have the ABD instructions > > + but not the ABDL instructions. If we just return NULL here > > + we will miss an occasion where we should have used ABD. > > + So we change back to ABD and try again. */ > > + ifn = IFN_ABD; > > + abd_out_type = abd_in_type; > > + extend = true; > > + } > > + } > > making this: > > if (TYPE_PRECISION (out_type) >= TYPE_PRECISION (abd_in_type) * 2) > { > tree mid_type > = build_nonstandard_integer_type (TYPE_PRECISION (abd_in_type) * 2, > TYPE_UNSIGNED (abd_in_type)); > tree mid_vectype = get_vectype_for_scalar_type (vinfo, mid_vectype); > code_helper dummy_code; > int dummy_int; > auto_vec<tree> dummy_vec; > if (mid_vectype > && supportable_widening_operation (vinfo, IFN_WIDEN_ABD, stmt_vinfo, > mid_vectype, vectype_in, > &dummy_code, &dummy_code, > &dummy_int, &dummy_vec)) > { > ifn = IFN_WIDEN_ABD; > abd_out_type = mid_type; > vectype_out = mid_vectype; > } > } > > The idea is to keep the assumption that we're using IFN_ABD > until we've proven conclusively otherwise. > > I think the later: > > if (!extend) > return abd_stmt; > > should then be deleted, since we should still use vect_convert_output > if abd_out_type has a different sign from out_type. > > ..... > > And this condition would then be: > > if (TYPE_PRECISION (abd_out_type) == TYPE_PRECISION (abd_in_type) > && TYPE_PRECISION (abd_out_type) < TYPE_PRECISION (out_type) > && !TYPE_UNSIGNED (abd_out_type)) > > since: > > (a) we don't need to force the type to unsigned if the final > vect_convert_output is just a sign change > > (b) we don't need to force the type to unsigned if abd_out_type > is large enough to hold the resu Done. > I think it'd be more natural to test this on TREE_TYPE (last_rhs) > rather than TREE_TYPE (abd_oprnd0), and do it after the assignment > to last_rhs above. Done. > > + tree abdl_result = vect_recog_temp_ssa_var (out_type, NULL); > > + gcall *abdl_stmt = gimple_build_call_internal (IFN_VEC_WIDEN_ABD, 2, > > + abd_oprnd0, abd_oprnd1); > > + gimple_call_set_lhs (abdl_stmt, abdl_result); > > + gimple_set_location (abdl_stmt, gimple_location (last_stmt)); > > + return abdl_stmt; > > I think “widen_abd” would be better than “abdl” here Done.