tra added a comment. Enforcing explicit GPU target makes sense.
However, I think that singling out a `__global__` as the trigger is not sufficient for the intended purpose. If we can't generate a usable GPU-side binary, then we should produce an error if we need to generate *anything* during GPU-side compilation. Using `__global__` as a proxy would not catch some use cases and, possibly, will produce false positives in others. E.g. what if I have a TU which only has a `__device__ int var = 42;` combined with a host-side code to memcpy to/from it? It would still be a valid, if not very useful code, but it would still suffer from runtime being unable to load it on a GPU unless that variable is in a GPU binary compiled with a valid target. `__device__` functions in TUs compiled with `-fgpu-rdc` would have a similar problem. They would eventually be linked into a GPU binary which will be useless if it's not compiled for correct GPU. Granted, `__device__` functions will eventually need to be called from a kernel, so we will error out on a `__global__` *somewhere*, but it will miss the problem when such TU does not get to the linking stage (e.g. maybe the user wants to link them at runtime). ================ Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:8260 +def err_hip_kern_without_gpu : Error< + "compiling a HIP kernel without specifying an offload arch is not allowed">, + DefaultFatal; ---------------- How about compiling a file with `__device__` functions with `-fgpu-rdc`? If a kernel with no-arch is an error, then this should be an error, too. ================ Comment at: clang/lib/Sema/SemaDeclAttr.cpp:4431 } + if (S.getASTContext().getTargetInfo().getTargetOpts().CPU.empty() && + S.getLangOpts().HIP && S.getLangOpts().CUDAIsDevice) { ---------------- Will this fire if we have an uninstantiated kernel template? ================ Comment at: clang/test/SemaCUDA/kernel-no-gpu.cu:7 + +__global__ void kern1() {} +// hip-error@-1 {{compiling a HIP kernel without specifying an offload arch is not allowed}} ---------------- We'll need few more test cases. E.g. these should be fine. ``` template <typename T> __global__ void kernel(T arg ) {}; __global__ void kernel(T arg ); ``` CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100552/new/ https://reviews.llvm.org/D100552 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits