On Tue, Jul 12, 2022 at 03:16:35PM +0100, Andrew Stubbs wrote:
> --- a/gcc/gimple-loop-versioning.cc
> +++ b/gcc/gimple-loop-versioning.cc
> @@ -555,7 +555,10 @@ loop_versioning::loop_versioning (function *fn)
>       unvectorizable code, since it is the largest size that can be
>       handled efficiently by scalar code.  omp_max_vf calculates the
>       maximum number of bytes in a vector, when such a value is relevant
> -     to loop optimization.  */
> +     to loop optimization.
> +     FIXME: this probably needs to use omp_max_simd_vf when in a target
> +     region, but how to tell? (And MAX_FIXED_MODE_SIZE is large enough that
> +     it doesn't actually matter.)  */
>    m_maximum_scale = estimated_poly_value (omp_max_vf ());
>    m_maximum_scale = MAX (m_maximum_scale, MAX_FIXED_MODE_SIZE);

I think this shouldn't have the comment added, the use here actually isn't
much OpenMP related, it just uses the function because it implements
what it wants.

> --- a/gcc/omp-general.cc
> +++ b/gcc/omp-general.cc
> @@ -994,6 +994,24 @@ omp_max_simt_vf (void)
>    return 0;
>  }
>  
> +/* Return maximum SIMD width if offloading may target SIMD hardware.  */
> +
> +int
> +omp_max_simd_vf (void)

The name is just confusing.
omp_max_vf is about the SIMD maximum VF, so if you really want,
rename omp_max_vf to omp_max_simd_vf.

For the offloading related stuff, IMHO either we put it into
that single omp-general.cc function and add a bool argument to it
whether it is or might be in offloading region (ordered_maximum
from the returned value and the offloading one, but only after the
initialy return 1; conditions and adjust callers), or have this
separate function, but then IMHO the if (!optimize) return 0;
initial test should be
  if (!optimize
      || optimize_debug
      || !flag_tree_loop_optimize
      || (!flag_tree_loop_vectorize
          && OPTION_SET_P (flag_tree_loop_vectorize)))
    return 1;
because without that nothing is vectorized, on host nor on offloading
targets, and the function should be called omp_max_target_vf or
omp_max_target_simd_vf.

> +{
> +  if (!optimize)
> +    return 0;

> --- a/gcc/omp-low.cc
> +++ b/gcc/omp-low.cc
> @@ -4646,7 +4646,14 @@ lower_rec_simd_input_clauses (tree new_var, 
> omp_context *ctx,
>  {
>    if (known_eq (sctx->max_vf, 0U))
>      {
> -      sctx->max_vf = sctx->is_simt ? omp_max_simt_vf () : omp_max_vf ();
> +      /* If we are compiling for multiple devices choose the largest VF.  */
> +      sctx->max_vf = omp_max_vf ();
> +      if (omp_maybe_offloaded_ctx (ctx))
> +     {
> +       if (sctx->is_simt)
> +         sctx->max_vf = ordered_max (sctx->max_vf, omp_max_simt_vf ());
> +       sctx->max_vf = ordered_max (sctx->max_vf, omp_max_simd_vf ());
> +     }

This is wrong.  If sctx->is_simt, we know it is the SIMT version.
So we want to use omp_max_simt_vf (), not maximum of that and something
unrelated.
Only if !sctx->is_simt, we want to use maximum of omp_max_vf and if
omp_maybe_offloaded_ctx also omp_max_target_vf or how it is called (or
pass that as argument to omp_max_vf).

We have another omp_max_vf () call though, in
omp-expand.cc (omp_adjust_chunk_size).
That is for schedule (simd: dynamic, 32)
and similar, though unlike the omp-low.cc case (where using a larger VF
in that case doesn't hurt, it is used for sizing of the maxic arrays that
are afterwards resized to the actual size), using too large values in
that case is harmful.
So dunno if it should take into account offloading vf or not.
Maybe if maybe offloading maybe not it should fold to some internal fn call
dependent expression that folds to omp_max_vf of the actual target after
IPA.

        Jakub

Reply via email to