On Mon, Nov 04, 2024 at 10:21:58AM +0000, Andrew Stubbs wrote: > @@ -999,6 +1000,18 @@ omp_max_vf (void) > && OPTION_SET_P (flag_tree_loop_vectorize))) > return 1; > > + if (ENABLE_OFFLOADING && offload) > + { > + for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c;) > + { > + if (startswith (c, "amdgcn")) > + return 64; > + else if ((c = strchr (c, ':'))) > + c++; > + } > + /* Otherwise, fall through to host VF. */ > + }
This assumes that the host can't have max_vf more than 64. Because the offload code isn't compiled just for a single offload target, but perhaps multiple and the host as well. I think it should be max (64, omp_max_vf (false)) in that case. Though SVE/RISC-V can complicate that by omp_max_vf returning a poly_uint64. Maybe upper_bound is the function to call? > --- a/gcc/omp-expand.cc > +++ b/gcc/omp-expand.cc > @@ -229,7 +229,15 @@ omp_adjust_chunk_size (tree chunk_size, bool > simd_schedule, bool offload) > if (!simd_schedule || integer_zerop (chunk_size)) > return chunk_size; > > - poly_uint64 vf = omp_max_vf (offload); > + if (offload) > + { > + cfun->curr_properties &= ~PROP_gimple_lomp_dev; > + tree vf = build_call_expr_internal_loc (UNKNOWN_LOCATION, > IFN_GOMP_MAX_VF, > + unsigned_type_node, 0); > + return fold_convert (TREE_TYPE (chunk_size), vf); This is incorrect. The code below that doesn't return the vf returned by omp_max_vf, but instead it returns (chunk_size + vf - 1) & -vf. So, question is if we can rely omp_max_vf to always return a power of two, clearly we rely on it now already in the existing code. And, either it should build that (tmp = IFN_GOMP_MAX_VF (), (chunk_size + tmp - 1) & -tmp) expression as trees, or IFN_GOMP_MAX_VF should take an argument, the chunk_size, and be folded to (chunk_size + vf - 1) & -vf later. Jakub