[PATCH] D44747: Set calling convention for CUDA kernel

Artem Belevich via Phabricator via cfe-commits Tue, 03 Apr 2018 10:52:20 -0700

tra added inline comments.


================
Comment at: lib/Sema/SemaType.cpp:3319-3330
+  // Attribute AT_CUDAGlobal affects the calling convention for AMDGPU targets.
+  // This is the simplest place to infer calling convention for CUDA kernels.
+  if (S.getLangOpts().CUDA && S.getLangOpts().CUDAIsDevice) {
+    for (const AttributeList *Attr = D.getDeclSpec().getAttributes().getList();
+         Attr; Attr = Attr->getNext()) {
+      if (Attr->getKind() == AttributeList::AT_CUDAGlobal) {
+        CC = CC_CUDAKernel;
----------------
tra wrote:
> This apparently breaks compilation of some CUDA code in our internal tests. 
> I'm working on minimizing a reproduction case. Should this code be enabled 
> for AMD GPUs only?
Here's a small snippet of code that used to compile and work:

```
template <typename T>
__global__ void EmptyKernel(void) { }

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)();
  EmptyKernelPtr Empty() { return EmptyKernel<void>; }
};
```
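AFAICT, it's currently impossible to apply __global__ to function pointer
types, so there's no way to make the code above work with this patch applied.
Presumably the kernel's type now carries CC_CUDAKernel, while the typedef'd
pointer type keeps the default calling convention, so the return statement in
Empty() no longer converts.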
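
To illustrate the suspected mismatch, here is a toy model in plain C++. It is only a sketch under stated assumptions: `CallConv`, `FnPtr`, `PatchedKernelAddr`, `kernelConvertsToTypedef` and `checkModel` are hypothetical names invented for this example, not Clang types or APIs, and the real diagnostics may differ. The idea is that once the calling convention is part of the function type, a pointer type that cannot spell that convention is an unrelated type.

```
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a calling-convention tag on a function type.
enum class CallConv { Default, CUDAKernel };

// Toy function-pointer wrapper: the calling convention is part of the type.
template <CallConv CC>
struct FnPtr {
  void (*raw)();
};

// What the typedef in the snippet can express: no __global__ on a pointer
// type, so it is stuck with the default convention.
using EmptyKernelPtr = FnPtr<CallConv::Default>;

// What the address of the kernel would look like in this model once the
// kernel is given the kernel calling convention.
using PatchedKernelAddr = FnPtr<CallConv::CUDAKernel>;

// True only if a kernel address could be returned as an EmptyKernelPtr.
constexpr bool kernelConvertsToTypedef() {
  return std::is_convertible<PatchedKernelAddr, EmptyKernelPtr>::value;
}

// In this model the two pointer types are unrelated, so the conversion
// needed by `return EmptyKernel<void>;` does not exist.
static_assert(!kernelConvertsToTypedef(),
              "differently tagged pointer types must not convert");

// Runtime restatement of the same checks, for use from a test driver.
void checkModel() {
  assert(!kernelConvertsToTypedef());
  assert((std::is_same<EmptyKernelPtr, FnPtr<CallConv::Default>>::value));
}
```

In this model the only ways out are to let the pointer type name the convention or to keep the convention out of the type on targets that don't need it, which is the question raised in the earlier comment about enabling this for AMD GPUs only.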


Repository:
  rL LLVM

https://reviews.llvm.org/D44747



