[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-10 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG80072fde61d4: [CUDA][HIP] Allow comdat for kernels (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANG

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-10 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment. I think the key is the self-reference in the LEA instruction: > ; foo > .seh_proc "??$foo@H@@YAXH@Z" > ... > leaq"??$foo@H@@YAXH@Z"(%rip), %rcx > ... > ; foo > .seh_proc "??$foo@M@@YAXM@Z" > ... > leaq"??$foo@M@@YAXM@Z"(%rip), %rcx >

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I did an experiment regarding the ICF issue and it seems not to affect kernel stub. #include "hip/hip_runtime.h" template void bar(T x) { } template __global__ void foo(T x) {} int main() { foo<<<1,1>>>(1); printf("%p\n", foo); printf("%p\

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4290-4293 - // Do not set COMDAT attribute for CUDA/HIP stub functions to prevent - // them being "merged" by the COMDAT Folding linker optimization. - if (D.hasAttr()) -return false; ---

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a comment. In D112492#3119892 , @tra wrote: > Yes, we do need to merge identical functions with **identical names** for > templates. > > The comdat-folding issue is different. IIUIC, it allows merging two functions > with identical code and **

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Yes, we do need to merge identical functions with **identical names** for templates. The comdat-folding issue is different. IIUIC, it allows merging two functions with identical code and **different names**, into one function with two names. That will break CUDA as we do n

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I think probably it is necessary to merge linkonce_odr symbols for them to work properly. Consider the following testcase: // a.cu template __global__ void foo(T x) {} void test1() { foo<<<1,1>>>(1); } // b.cu template __global__ void foo(T x)

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4290-4293 - // Do not set COMDAT attribute for CUDA/HIP stub functions to prevent - // them being "merged" by the COMDAT Folding linker optimization. - if (D.hasAttr()) -return false; --

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Reid Kleckner via Phabricator via cfe-commits
rnk added a subscriber: zequanwu. rnk added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4290-4293 - // Do not set COMDAT attribute for CUDA/HIP stub functions to prevent - // them being "merged" by the COMDAT Folding linker optimization. - if (D.hasAttr())

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added subscribers: kpyzhov, rnk. tra added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4290-4293 - // Do not set COMDAT attribute for CUDA/HIP stub functions to prevent - // them being "merged" by the COMDAT Folding linker optimization. - if (D.hasAttr

[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

2021-11-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 385812. yaxunl retitled this revision from "[HIP] Do not use kernel handle for MSVC target" to "[CUDA][HIP] Allow comdat for kernels". yaxunl edited the summary of this revision. yaxunl added a comment. fix comdat instead CHANGES SINCE LAST ACTION https://