On Wed, Nov 06, 2024 at 03:27:21PM +0000, Andrew Stubbs wrote:
> Delay omp_max_vf call until after the host and device compilers have diverged
> so that the max_vf value can be tuned exactly right on both variants.
> 
> This change means that the ompdevlow pass must be enabled for functions that
> use OpenMP directives with both "simd" and "schedule" enabled.
> 
> gcc/ChangeLog:
> 
>       * internal-fn.cc (expand_GOMP_MAX_VF): New function.
>       * internal-fn.def (GOMP_MAX_VF): New internal function.
>       * omp-expand.cc (omp_adjust_chunk_size): Emit IFN_GOMP_MAX_VF when
>       called in offload context, otherwise assume host context.
>       * omp-offload.cc (execute_omp_device_lower): Expand IFN_GOMP_MAX_VF.

> +  tree vf_minus_one = fold_build2 (MINUS_EXPR, type, vf,
> +                                build_int_cst (type, 1));
> +  tree negative_vf = fold_build1 (NEGATE_EXPR, type, vf);

This could invoke UB if vf is LONG_MIN (or the minimum value of whatever
signed type is in use), since negating that value overflows, but I think at
least for now we can expect omp_max_vf to return reasonably small values whose
negation is well defined even in signed types.

> +  chunk_size = fold_build2 (PLUS_EXPR, type, chunk_size, vf_minus_one);
> +  return fold_build2 (BIT_AND_EXPR, type, chunk_size, negative_vf);

So ok.
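
For reference, the two quoted statements amount to rounding chunk_size up to
a multiple of vf.  A minimal standalone sketch in plain C, assuming vf is a
positive power of two and a two's complement target (round_up_to_vf is a
hypothetical name used only for illustration, it is not part of the patch):

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: round CHUNK_SIZE up to the
   next multiple of VF using the same add-then-mask idiom as the quoted
   trees.  Assumes VF is a positive power of two.  */
static long
round_up_to_vf (long chunk_size, long vf)
{
  /* The mask trick only works for powers of two.  */
  assert (vf > 0 && (vf & (vf - 1)) == 0);

  long vf_minus_one = vf - 1;
  /* Negation is well defined here because VF is positive, so it cannot be
     LONG_MIN.  In two's complement, -VF has all bits set except the low
     log2 (VF) bits.  */
  long negative_vf = -vf;

  return (chunk_size + vf_minus_one) & negative_vf;
}
```

With vf == 8, a chunk_size of 10 becomes 16, while 16 stays 16; with vf == 1
the value is returned unchanged.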

        Jakub
