ivanradanov wrote:

passing `-fopenmp --offload-arch=sm_80` above, so 
```
clang -fopenmp --offload-arch=sm_80 --verbose -foffload-via-llvm \
  --cuda-path=/usr/local/cuda input.o -o a.out
```

gives us the appropriate flags. That means the CUDA toolchain was created,
correct?

I wonder if we need a step in clang that scans all the .o files for sections
that need device linking, concatenates the archs, and reinvokes itself with
`--offload-arch=<all_collected_arches>`. (Parsing the .o files for that is
clang-linker-wrapper's job, so it is a bit weird to have clang do it.) In
theory the appropriate toolchains would then be created. Perhaps it should only
kick in when `-foffload-via-llvm` is on but no `--offload-arch` is specified,
i.e. when we are asking clang to figure out the appropriate offload archs.
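
A rough sketch of what such a detection step might look like, written in Python only for brevity. It assumes the per-object text comes from something like `llvm-objdump --offloading`, with one `arch <name>` line per embedded image; that output shape is an assumption here, and `collect_archs` and `reinvoke_command` are hypothetical helper names, not anything in clang.

```python
import re

# Assumed dump format: each embedded device image has a line "arch <name>".
ARCH_LINE = re.compile(r"^[ \t]*arch[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def collect_archs(dumps):
    """Return the unique arch names found across all dump texts, first-seen order."""
    seen = []
    for text in dumps:
        for arch in ARCH_LINE.findall(text):
            if arch not in seen:
                seen.append(arch)
    return seen


def reinvoke_command(args, archs):
    """Build the second clang invocation, one --offload-arch flag per arch."""
    arch_flags = ["--offload-arch=" + arch for arch in archs]
    return ["clang", *arch_flags, *args]
```

Given dumps for two objects built for `sm_80` and `sm_90`, this would produce a command line carrying both `--offload-arch=` flags ahead of the original arguments.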

That step could actually be handled by clang-linker-wrapper. You would get:

```
clang -foffload-via-llvm <args>
  -> clang-linker-wrapper --detect-archs-and-exec=clang <args>
    -> clang -foffload-via-llvm --offload-arch=<detected_archs> <args>
      -> clang-linker-wrapper (same as until now)
```
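
The dispatch decision in that chain can be sketched as a small function. This is only an illustration under assumptions: `next_step` is a hypothetical name, and `--detect-archs-and-exec` is the flag proposed above, not an existing clang-linker-wrapper option.

```python
def next_step(args):
    """Pick the command the driver would run next for the given arguments."""
    via_llvm = "-foffload-via-llvm" in args
    has_arch = any(a.startswith("--offload-arch=") for a in args)
    if via_llvm and not has_arch:
        # No archs given: ask the wrapper to detect them and re-run clang.
        # (--detect-archs-and-exec is hypothetical, taken from the proposal.)
        return ["clang-linker-wrapper", "--detect-archs-and-exec=clang", *args]
    # Archs are known (or LLVM offloading is off): link as before.
    return ["clang-linker-wrapper", *args]
```

Because the re-invoked clang carries `--offload-arch=`, the second pass takes the normal branch. If detection found no archs at all, the re-invocation would need some other marker to avoid looping.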

Pretty convoluted, so I don't know if it's appropriate.

https://github.com/llvm/llvm-project/pull/149107
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
