On Thu, Oct 2, 2025 at 11:14 AM Andrew Stubbs <[email protected]> wrote:
>
> On 02/10/2025 10:58, Richard Biener wrote:
> > On Fri, Sep 26, 2025 at 1:27 PM Andrew Stubbs <[email protected]> wrote:
> >>
> >> When testcases (such as libgomp.c-c++-common/for-11.c) use unsigned types 
> >> for
> >> loop counters then the vectorizer will also use unsigned types for the 
> >> offsets
> >> passed to gather/scatter instructions.  This is a problem if the loop step 
> >> is
> >> negative and the offset type is narrower than a pointer.
> >
> > Hmm, but why did we chose an unsigned type in the first place then?  So,
> > isn't this an issue in vect_truncate_gather_scatter_offset or
> > vect_use_strided_gather_scatters_p?
>
> Here's the reproducer from PR121393:
>
> float a[100];
>
> int f()
> {
>    unsigned int i;                  <----- unsigned comes from here
>
> #pragma omp target simd
>    for (i=100; i != 0; i-=2)
>      a[i] = a[i] + 5.5;
> }
>
> int main()
> {
>    f();
>    return 0;
> }
>
>
> The patch is not needed if "i" is either signed or long long.

So given other targets/code-paths are fine it seems like the gimple IL is
mis-interpreted somehow.  Checking risc-v (also using stride load/store)
does

  _27 = .SELECT_VL (ivtmp_25, POLY_INT_CST [4, 4]);
  _8 = _27 * 18446744073709551608;
  vect__1.7_17 = .MASK_LEN_STRIDED_LOAD (vectp_a.5_14, -8, { 0.0, ...
}, { -1, ... }, _16(D), _27, 0);
  vect__2.8_19 = .COND_LEN_ADD ({ -1, ... }, vect__1.7_17, { 5.5e+0,
... }, _18(D), _27, 0);
  _1 = a[i_10];
  _2 = _1 + 5.5e+0;
  .MASK_LEN_STRIDED_STORE (vectp_a.9_22, -8, vect__2.8_19, { -1, ... }, _27, 0);
  i_7 = i_10 + 4294967294;

and in vect_use_strided_gather_scatters_p we end up with unsigned vector([4,4].

Your patch only changes how VEC_SERIES_EXPR is handled, you
make that signed.  It seems to me that GCN eventually cannot do
a VnSImode series but you always build a VnDImode one and fail to
truncate?  Or you claim you can handle VnSImode offset vectors for
gather/scatter but in reality require sign-/zero-extended VnDImode?
Note IIRC the gather/scatter optabs handle signed vs. unsigned
(aka zero vs. sign extension) via predicates.  See
internal_gather_scatter_fn_supported_p.

So with unsigned we expect the target to zero-extend.  Which might be
indeed an issue for negative step when we handle gather as
&a[100] + offset instead of &a[100 + index-offset].  The place you fix
is IMO still wrong.

For risc-v we end up using &a as base and we have (sizetype)i * 4 as offset,
so we shouldn't get to problematic values and instead have a series like
{396, 392, ... 0, -4U, ... } with the negative values being only operating on
masked lanes?

Is that what you see on GCN as well?

Richard.

> I have not observed issues in the other places, but I don't claim a full
> understanding of this code.
>
> Andrew
>
>
> >> This commit ensures that negative offsets are labelled as signed.
> >>
> >> gcc/ChangeLog:
> >>
> >>          PR target/121393
> >>          * tree-vect-stmts.cc (vect_get_strided_load_store_ops): Use signed
> >>          offsets for negative steps.
> >> ---
> >>
> >> This fixes a problem observed on amdgcn, but there doesn't seem to be
> >> anything about it that is target-specific.
> >>
> >> OK for mainline?
> >>
> >> Andrew
> >>
> >>
> >>   gcc/tree-vect-stmts.cc | 8 ++++++++
> >>   1 file changed, 8 insertions(+)
> >>
> >> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> >> index cfc4f323a22..254fe6e6c0f 100644
> >> --- a/gcc/tree-vect-stmts.cc
> >> +++ b/gcc/tree-vect-stmts.cc
> >> @@ -2969,6 +2969,14 @@ vect_get_strided_load_store_ops (stmt_vec_info 
> >> stmt_info, slp_tree node,
> >>        type of the vector instead.  */
> >>     tree offset_type = TREE_TYPE (offset_vectype);
> >>
> >> +  if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST
> >> +      && compare_step_with_zero (loop_vinfo, stmt_info) < 0
> >> +      && TYPE_SIGN (offset_type) == UNSIGNED)
> >> +    {
> >> +      offset_type = signed_type_for (offset_type);
> >> +      offset_vectype = signed_type_for (offset_vectype);
> >> +    }
> >> +
> >>     /* Calculate X = DR_STEP / SCALE and convert it to the appropriate 
> >> type.  */
> >>     tree step = size_binop (EXACT_DIV_EXPR, unshare_expr (DR_STEP (dr)),
> >>                            ssize_int (SLP_TREE_GS_SCALE (node)));
> >> --
> >> 2.51.0
> >>
>

Reply via email to