jhuber6 wrote:

> From the discourse post and everything I've found reading about the SYCL 
> tooling, it seems to me like this should really just all be integrated into 
> LLD and performed with the linking phase. It seems like a huge waste of IO to 
> read objects, rip out device-specific bits, process that separately, then 
> read the objects again in another process to link the host bits, then in a 
> third process read the linked host and device bits and combine them...

Doing all of this in `lld` is something I floated at the inception of the
offload handling, but I decided it was more trouble than it was worth. You can
think of this flow as a sort of linker plugin, where we preprocess a bunch of
input files and then give a result back. In this case the result is an object
with the embedded device code and the runtime calls necessary to register it. I
didn't go the `lld` route for two reasons. First, it forces users to change
their host linker, which people don't like. Second, we would need to go to
great lengths to make the NVPTX target work, because it doesn't use `ld.lld`
and NVIDIA's toolchain is proprietary. I doubted that making `ld.lld` a
frontend for `nvlink` would be a very popular option, but we could go that
route if so inclined. It would be the same thing as another tool I wrote to
work around NVIDIA's linker being subpar.
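
To make the plugin analogy concrete, here is a toy Python model of the flow
described above. It is only a sketch and not the actual implementation: every
name in it (`ObjectFile`, `wrap_link`, `register(...)`, `offload_wrapper.o`) is
hypothetical, and plain strings stand in for real host and device code.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical stand-in for an object file with embedded device code."""
    name: str
    host_code: str
    device_code: dict = field(default_factory=dict)  # target name -> code


def extract_device_code(inputs):
    """Group the embedded device code by target; the inputs are not modified."""
    by_target = {}
    for obj in inputs:
        for target, code in obj.device_code.items():
            by_target.setdefault(target, []).append(code)
    return by_target


def link_device(by_target):
    """Stand-in for the per-target device link step."""
    return {target: "+".join(parts) for target, parts in by_target.items()}


def make_registration_object(images):
    """Build one extra object holding the linked images and registration calls."""
    calls = ";".join(f"register({target})" for target in sorted(images))
    return ObjectFile("offload_wrapper.o", host_code=calls, device_code=images)


def wrap_link(inputs):
    """Return the input list that the unmodified host linker would receive."""
    images = link_device(extract_device_code(inputs))
    return inputs + [make_registration_object(images)]
```

The point of this shape is that the host linker only ever sees one additional
object, so users can keep whichever host linker they already use.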

https://github.com/llvm/llvm-project/pull/112245
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
