On Mon, Nov 04, 2024 at 10:21:58AM +0000, Andrew Stubbs wrote:
> @@ -999,6 +1000,18 @@ omp_max_vf (void)
>         && OPTION_SET_P (flag_tree_loop_vectorize)))
>      return 1;
>  
> +  if (ENABLE_OFFLOADING && offload)
> +    {
> +      for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c;)
> +     {
> +       if (startswith (c, "amdgcn"))
> +         return 64;
> +       else if ((c = strchr (c, ':')))
> +         c++;
> +     }
> +      /* Otherwise, fall through to host VF.  */
> +    }

This assumes that the host can't have max_vf more than 64.
Because the offload code isn't compiled just for a single offload target,
but perhaps multiple and the host as well.
I think it should be max (64, omp_max_vf (false)) in that case.
Though SVE/RISC-V can complicate that by omp_max_vf returning a poly_uint64.
Maybe upper_bound is the function to call?

> --- a/gcc/omp-expand.cc
> +++ b/gcc/omp-expand.cc
> @@ -229,7 +229,15 @@ omp_adjust_chunk_size (tree chunk_size, bool 
> simd_schedule, bool offload)
>    if (!simd_schedule || integer_zerop (chunk_size))
>      return chunk_size;
>  
> -  poly_uint64 vf = omp_max_vf (offload);
> +  if (offload)
> +    {
> +      cfun->curr_properties &= ~PROP_gimple_lomp_dev;
> +      tree vf = build_call_expr_internal_loc (UNKNOWN_LOCATION, 
> IFN_GOMP_MAX_VF,
> +                                           unsigned_type_node, 0);
> +      return fold_convert (TREE_TYPE (chunk_size), vf);

This is incorrect.
The code below that doesn't return the vf returned by omp_max_vf, but
instead it returns (chunk_size + vf - 1) & -vf.

So, question is if we can rely omp_max_vf to always return a power of
two, clearly we rely on it now already in the existing code.

And, either it should build that
(tmp = IFN_GOMP_MAX_VF (), (chunk_size + tmp - 1) & -tmp)
expression as trees, or IFN_GOMP_MAX_VF should take an argument,
the chunk_size, and be folded to (chunk_size + vf - 1) & -vf
later.

        Jakub

Reply via email to