On Wed, Nov 06, 2024 at 03:27:21PM +0000, Andrew Stubbs wrote:
> Delay omp_max_vf call until after the host and device compilers have diverged
> so that the max_vf value can be tuned exactly right on both variants.
>
> This change means that the ompdevlow pass must be enabled for functions that
> use OpenMP directives with both "simd" and "schedule" enabled.
>
> gcc/ChangeLog:
>
> 	* internal-fn.cc (expand_GOMP_MAX_VF): New function.
> 	* internal-fn.def (GOMP_MAX_VF): New internal function.
> 	* omp-expand.cc (omp_adjust_chunk_size): Emit IFN_GOMP_MAX_VF when
> 	called in offload context, otherwise assume host context.
> 	* omp-offload.cc (execute_omp_device_lower): Expand IFN_GOMP_MAX_VF.

> +  tree vf_minus_one = fold_build2 (MINUS_EXPR, type, vf,
> +				    build_int_cst (type, 1));
> +  tree negative_vf = fold_build1 (NEGATE_EXPR, type, vf);

This could invoke UB if vf is LONG_MIN or similar, but I think at least now
we can expect omp_max_vf to return reasonably small values whose negation is
well defined even in signed types.

> +  chunk_size = fold_build2 (PLUS_EXPR, type, chunk_size, vf_minus_one);
> +  return fold_build2 (BIT_AND_EXPR, type, chunk_size, negative_vf);

So ok.

	Jakub
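
For readers following the arithmetic in the quoted hunk: the expression
(chunk_size + (vf - 1)) & -vf rounds chunk_size up to the next multiple of
vf. Below is a minimal plain-C sketch of the same computation, not the GCC
tree code itself. The function name round_up_to_vf and the use of long are
hypothetical, and the sketch assumes vf is a positive power of two and a
two's complement representation; the patch does not state either, but the
mask trick is only correct under those conditions.

```c
#include <assert.h>

/* Round chunk_size up to the next multiple of vf.
   Assumption (not stated in the patch): vf is a positive power of two,
   so -vf in two's complement is a mask that clears the low bits.  */
static long
round_up_to_vf (long chunk_size, long vf)
{
  /* Reject values for which the mask trick does not apply.  This also
     excludes LONG_MIN, whose negation would overflow (the UB mentioned
     in the review).  */
  assert (vf > 0 && (vf & (vf - 1)) == 0);

  long vf_minus_one = vf - 1;
  long negative_vf = -vf;

  /* Bump past the next multiple, then clear the low bits.  */
  return (chunk_size + vf_minus_one) & negative_vf;
}
```

With vf = 8, a chunk_size of 10 becomes 16 and a chunk_size of 16 stays 16.
The addition can still overflow for chunk sizes close to LONG_MAX; the sketch
does not guard against that.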