On 10/20/22 08:07, Jakub Jelinek wrote:
> Thus, IMHO it is exactly the pass_omp_simd_clone pass where you want to
> implement this auto-simdization discovery, guarded with
> #ifdef ACCEL_COMPILER and the new option (which means it will be done
> only for gcn and not on the host right now).

I'm running into a practical difficulty with making this controlled by a static #ifdef: namely, testing.

One of my test cases examines the .s output to make sure that the clones are emitted as local symbols and not global. I have not been able to find the symbol linkage information in any of the dump files, and I have also not been able to figure out how to get a .s file from the offload compiler even outside of the DejaGnu test harness. (It's possible I am just an extreme dummy about the latter problem, but so far none of my colleagues here has been able to give me a recipe either.)

On top of that, I worry that this should be tested more broadly than for the one target we're presently focusing on (AMD GCN), and we'll get much more regular test coverage if it's also enabled for the x86_64 target, which has the necessary compute_vecsize_and_simdlen target hook.

I remember Carlos O'Donnell used to have a favorite mantra, "design for test". So, maybe generalize the new -fopenmp-target-simd-clone option to take a parameter to force clones to be generated on the OpenMP host for test purposes? The "declare target" directive already has a clause

device_type(host|nohost|any)

that defaults to "any"; maybe we could use that syntax like
-fopenmp-target-simd-clone=any
and use the intersection of the two sets to determine what to auto-generate clones for?
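
To make the intersection idea concrete, here is a rough sketch in C. It is only an illustration of the set logic, not code from GCC: the enum, the function names, and the bitmask encoding are all hypothetical, and it assumes the option accepts the same three values as the device_type clause.

```c
#include <stdbool.h>

/* Hypothetical encoding: each device_type value is treated as the set
   of places a function may be compiled for, stored as a bitmask.  */
enum device_set {
  DEV_NONE   = 0,
  DEV_HOST   = 1 << 0,
  DEV_NOHOST = 1 << 1,
  DEV_ANY    = DEV_HOST | DEV_NOHOST
};

/* Intersection of the function's device_type set and the set named by
   the (assumed) -fopenmp-target-simd-clone=<value> argument.  */
static enum device_set
clone_targets (enum device_set decl_device_type, enum device_set option_value)
{
  return (enum device_set) (decl_device_type & option_value);
}

/* Should the compiler that is currently running auto-generate clones?
   The host compiler stands for DEV_HOST and the offload (accel)
   compiler for DEV_NOHOST.  */
static bool
want_clones_here (enum device_set decl_device_type,
                  enum device_set option_value,
                  bool is_accel_compiler)
{
  enum device_set here = is_accel_compiler ? DEV_NOHOST : DEV_HOST;
  return (clone_targets (decl_device_type, option_value) & here) != 0;
}
```

Under this sketch, a device_type(any) function compiled with -fopenmp-target-simd-clone=any would get clones in both the host and the offload compiler, which is what the host-side tests would need, while =nohost would restrict clones to the offload compiler only.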

-Sandra
