[PATCH] D122734: [CUDA][HIP] Fix mangling number for local struct

Yaxun Liu via Phabricator via cfe-commits Fri, 22 Apr 2022 20:26:36 -0700

yaxunl marked 7 inline comments as done.
yaxunl added inline comments.


================
Comment at: clang/include/clang/AST/ASTContext.h:682
+    /// Current name mangling is for device name in host compilation.
+    bool MangleDeviceNameInHostCompilation = false;
+  } CUDANameMangleCtx;
----------------
rnk wrote:
> It doesn't feel right to store mutable state here in ASTContext. You are 
> effectively using this as a global variable to thread a boolean through to 
> `ASTContext::getManglingNumber`, which is called from the mangler. The 
> boolean indicates of host or device mangling numbers are desired for this 
> particular mangling. Is that more or less correct?
> 
> This makes me think that it was perhaps a mistake to store mangling numbers 
> in the AST Context in the first place. If we stored them in the mangling 
> context instead, I think we'd get the right behavior out of the box. Is that 
> correct? It it feasible to make that change?
> It doesn't feel right to store mutable state here in ASTContext. You are 
> effectively using this as a global variable to thread a boolean through to 
> `ASTContext::getManglingNumber`, which is called from the mangler. The 
> boolean indicates of host or device mangling numbers are desired for this 
> particular mangling. Is that more or less correct?
> 
> This makes me think that it was perhaps a mistake to store mangling numbers 
> in the AST Context in the first place. If we stored them in the mangling 
> context instead, I think we'd get the right behavior out of the box. Is that 
> correct? It it feasible to make that change?

It is possible but I feel it is too invasive. Mangling numbers need to be 
calculated during construction of AST since it is context dependent. Mangle 
contexts are created in ABI of codegen. To store mangling numbers in mangling 
contexts, we need to move mangling contexts from ABI to ASTContext. We also 
need to add device mangling context in ASTContext. Then for each use of get/set 
mangling number API, we need to add get/set device mangling number API. That 
would need lots of changes.

I would see the state added to ASTContext by my approach from the perspective 
of single source language. The program contains both device code and host code. 
Ideally the compiler should be able to compile the device code with device 
target and ABI and compile the host code with host target ABI and switch the 
target freely during codegen. When we try to get the device side mangled name 
in host compilation, what we trying to do is switch to device target 
momoentorily during the mangling. This needs a state in ASTContext indicating 
that we are doing mangling for device instead of host, therefore 
getManglingNumber should return device mangling number instead of host mangling 
number.


================
Comment at: clang/include/clang/AST/ASTContext.h:684
+  } CUDANameMangleCtx;
+  struct CUDANameMangleContextRAII {
+    ASTContext &Ctx;
----------------
rnk wrote:
> By the way, this looks like `SaveAndRestore<bool>`. Maybe you can use that in 
> the future.
Yes. SaveAndRestore serves the purpose. will use it.


================
Comment at: clang/lib/AST/ASTContext.cpp:11760
+
+  auto Cutoff = [](unsigned V) { return V > 1 ? V : 1; };
+  if (CUDANameMangleCtx.MangleDeviceNameInHostCompilation)
----------------
tra wrote:
> rnk wrote:
> > rnk wrote:
> > > tra wrote:
> > > > I think we should change the `MangleNumbers` instead to use uin64_t or, 
> > > > perhaps `unsigned[2]` or even `struct{ unsigned host; unsigned 
> > > > device;}`.
> > > > Reducing the max number of anything to just 64K will be prone to 
> > > > failing sooner or later. 
> > > IMO it would be simpler to set up a ternary instead of a lambda and two 
> > > returns:
> > >   Res = Cond ? (Res >> 16) : (Res & 0xFFFF);
> > >   return Res > 1 ? Res : 1;
> > I suspect we could live with a 64K mangling number limit, but that is low 
> > enough that we need to put in a check for overflow somewhere.
> If we'd produce a reasonable diagnostic for that, it might work. 
> 
> 
will do


================
Comment at: clang/lib/AST/ASTContext.cpp:11760-11763
+  auto Cutoff = [](unsigned V) { return V > 1 ? V : 1; };
+  if (CUDANameMangleCtx.MangleDeviceNameInHostCompilation)
+    return Cutoff(Res >> 16);
+  return Cutoff(Res & 0xffff);
----------------
yaxunl wrote:
> tra wrote:
> > rnk wrote:
> > > rnk wrote:
> > > > tra wrote:
> > > > > I think we should change the `MangleNumbers` instead to use uin64_t 
> > > > > or, perhaps `unsigned[2]` or even `struct{ unsigned host; unsigned 
> > > > > device;}`.
> > > > > Reducing the max number of anything to just 64K will be prone to 
> > > > > failing sooner or later. 
> > > > IMO it would be simpler to set up a ternary instead of a lambda and two 
> > > > returns:
> > > >   Res = Cond ? (Res >> 16) : (Res & 0xFFFF);
> > > >   return Res > 1 ? Res : 1;
> > > I suspect we could live with a 64K mangling number limit, but that is low 
> > > enough that we need to put in a check for overflow somewhere.
> > If we'd produce a reasonable diagnostic for that, it might work. 
> > 
> > 
> will do
mangling number is member of some AST nodes. Increasing its size may increase 
overall AST memory usage and PCH size. I will add diagnostic for mangling 
number exceeding limit. If it happens in real applications I will revisit this.


================
Comment at: clang/test/CodeGenCUDA/struct-mangling-number.cu:25
+// CHECK: define amdgpu_kernel void 
@[[KERN:_Z6kernelIZN4TestIiE3runEvE2OpEvv]](
+// CHECK: @{{.*}} = {{.*}}c"[[KERN]]\00"
+
----------------
tra wrote:
> Using HOST/DEV prefixes would work better, IMO to tell what's expected to 
> happen where.
will do


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122734/new/

https://reviews.llvm.org/D122734

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122734: [CUDA][HIP] Fix mangling number for local struct

Reply via email to