tra added a comment.

In D75811#1919368 <https://reviews.llvm.org/D75811#1919368>, @tambre wrote:

> After some work on my CMake changes, Clang detection as a CUDA compiler works 
> and I can compile CUDA code.


\o/ Nice! Having CMake support Clang as a CUDA compiler out of the box would 
be great.

> However code using separable compilation doesn't compile. What is the Clang 
> equivalent of NVCC's `-dc` (`--device-c`) option for this case?

Ah, `-rdc` compilation is somewhat tricky. NVCC does quite a bit of extra work 
under the hood that would be rather hard to implement in Clang's driver, so it 
falls on the build system.
Clang will generate relocatable GPU code if you pass `-fcuda-rdc`, but that's 
only part of the story. Something (in practice, the build system) still has to 
perform the final linking step, and there's additional initialization glue to 
handle as well.
Here's how it's implemented in Bazel for TensorFlow: 
https://github.com/tensorflow/tensorflow/blob/ed371aa5d266222c799a7192e438cdd8c00464fe/third_party/nccl/build_defs.bzl.tpl
The file has a fairly detailed description of what needs to be done.
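
To make the compile half concrete, here is a rough sketch of how a build system might assemble the per-file Clang command line. It is an illustration only: the helper name `rdc_compile_command`, the file names, and the `sm_70` architecture are all made up for the example, and it deliberately stops before the final linking step and the initialization glue, which the build system still has to provide.

```python
def rdc_compile_command(source, output, gpu_arch="sm_70", clang="clang++"):
    """Return a Clang command line (as a list) that compiles one CUDA
    source file with relocatable device code enabled.

    Hypothetical helper for illustration; gpu_arch is an assumed default.
    """
    return [
        clang,
        "-x", "cuda",                   # treat the input as CUDA source
        "--cuda-gpu-arch=" + gpu_arch,  # assumed target GPU architecture
        "-fcuda-rdc",                   # generate relocatable GPU code
        "-c", source,                   # compile only, no host link
        "-o", output,
    ]


if __name__ == "__main__":
    # Print one compile command per translation unit. The final linking
    # step and the initialization glue are intentionally not covered here.
    for src in ["a.cu", "b.cu"]:
        obj = src.rsplit(".", 1)[0] + ".o"
        print(" ".join(rdc_compile_command(src, obj)))
```

The objects this produces are not usable on their own; they still need the linking and registration steps described in the Bazel file above.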

> The CMake code review for CUDA Clang support is here 
> <https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4442>.

I'll take a look.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75811/new/

https://reviews.llvm.org/D75811


