[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

Shilei Tian via cfe-commits Sat, 03 May 2025 11:25:05 -0700

shiltian wrote:

I understand that you are trying to make it generic. That's why I didn’t roll 
back to using builtin bitcode as we did before. However, there is one 
limitation that we can't really work around: we don't support ABI linking. This 
is not a new topic at all, and whether to support it or how to do it is 
completely off topic here. Even if we supported it, function calls are so 
expensive to set up that it would arguably still be better to let the 
optimization pass decide whether to import them or not.


The way ThinLTO works is that it compiles all input bitcode files in parallel 
after function import. The input here includes both the user code and the 
device runtime. We don't need to compile the device runtime at all, since all 
users of it have already imported the functions they need into their own 
modules. It is not that "bad" to compile it, as it will not cause any 
correctness issues. It is, however, a waste of time and will make the final 
code object a little bit larger. The issue here is that, because we can’t 
properly lower local memory in non-kernel functions at the moment, the backend 
will emit some warnings, which is not ideal at all. After we internalize the 
entire device runtime, it will become a roughly empty module after 
optimization, so the backend will be happy and the confusing warnings will not 
be emitted.
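
To illustrate why internalizing the runtime leaves a roughly empty module, here 
is a toy model in Python. It is only a sketch and not how LLVM implements 
internalization or dead global elimination: the `Function` class, the 
`internalize` and `global_dce` helpers, and the `rt_*` names are all 
hypothetical. The assumption is that a function survives only if it is 
externally visible or reachable from something that is.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """Hypothetical stand-in for a function in a bitcode module."""
    name: str
    linkage: str = "external"  # "external" or "internal"
    callees: set = field(default_factory=set)


def internalize(module, preserve=frozenset()):
    """Mark every function internal unless its name is in `preserve`."""
    for fn in module.values():
        if fn.name not in preserve:
            fn.linkage = "internal"


def global_dce(module):
    """Keep only functions reachable from externally visible ones."""
    live = set()
    worklist = [fn.name for fn in module.values() if fn.linkage == "external"]
    while worklist:
        name = worklist.pop()
        if name in live or name not in module:
            continue
        live.add(name)
        worklist.extend(module[name].callees)
    return {name: fn for name, fn in module.items() if name in live}


def make_runtime():
    """A made-up device runtime module with three functions."""
    return {
        "rt_alloc": Function("rt_alloc", callees={"rt_helper"}),
        "rt_helper": Function("rt_helper"),
        "rt_barrier": Function("rt_barrier"),
    }


# Without internalization every runtime function is externally visible,
# so all of them are kept and would be compiled by the backend.
kept_without = global_dce(make_runtime())

# With internalization nothing in the runtime module is a root any more
# (its users already imported their own copies), so everything is dropped.
runtime = make_runtime()
internalize(runtime)
kept_with = global_dce(runtime)

print(sorted(kept_without))
print(sorted(kept_with))
```

In this model the runtime module keeps all three functions when left as is and 
keeps none after internalization, which is the "roughly empty module" case 
described above: nothing remains for the backend to lower or warn about.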

https://github.com/llvm/llvm-project/pull/138365
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
