https://llvm.org/bugs/show_bug.cgi?id=27276

            Bug ID: 27276
           Summary: global functions in anonymous namespace cause runtime
                    cuda error
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: CUDA
          Assignee: unassignedclangb...@nondot.org
          Reporter: crtr...@sandia.gov
                CC: llvm-bugs@lists.llvm.org
    Classification: Unclassified

Created attachment 16185
  --> https://llvm.org/bugs/attachment.cgi?id=16185&action=edit
Reproducer

When a __global__ function is defined in an anonymous namespace, launching it
from the same compilation unit fails at runtime with an "invalid device
function" error.

In simplified form, the following does not work:
namespace {
__global__ void foo() {}

void bar() {
   foo<<<1,1>>>();
}
}

Attached is a reproducer. The output with NVCC is:
////
[crtrott@apollo test_kernel]$ ./a.out 
Error PreKernel: no error
Error PostKernel: no error
Error PostSync: no error
Arch: 350
////

The output with clang is:
////
[crtrott@apollo test_kernel]$ ./a.out 
Error PreKernel: no error
Error PostKernel: invalid device function
Error PostSync: no error
Arch: 9
////

To reproduce, adjust the paths in the build script and compile with ./build_clang.
To build with the workaround (the anonymous namespace removed), compile with
./build_clang -DWORKAROUND
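
For readers without access to the attachment, here is a minimal sketch of what
such a reproducer might look like. It is not the attached file: the helper
report(), the kernel argument, and the way the "Arch" value is obtained are
assumptions made for illustration. Only the WORKAROUND macro and the output
labels are taken from this report.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Building with -DWORKAROUND drops the anonymous namespace, so the kernel
// has external linkage instead of internal linkage.
#ifndef WORKAROUND
namespace {
#endif

// Assumption: the kernel records the architecture it was compiled for.
__global__ void foo(int *arch) {
#ifdef __CUDA_ARCH__
  *arch = __CUDA_ARCH__;
#endif
}

#ifndef WORKAROUND
}  // anonymous namespace
#endif

// Hypothetical helper: print and clear the last CUDA runtime error.
static void report(const char *label) {
  std::printf("Error %s: %s\n", label,
              cudaGetErrorString(cudaGetLastError()));
}

int main() {
  int *d_arch = nullptr;
  int h_arch = 0;
  cudaMalloc(&d_arch, sizeof(int));
  cudaMemcpy(d_arch, &h_arch, sizeof(int), cudaMemcpyHostToDevice);

  report("PreKernel");
  foo<<<1, 1>>>(d_arch);    // launch from the same compilation unit
  report("PostKernel");     // launch errors are reported here
  cudaDeviceSynchronize();
  report("PostSync");

  cudaMemcpy(&h_arch, d_arch, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("Arch: %d\n", h_arch);

  cudaFree(d_arch);
  return 0;
}
```

The error is checked immediately after the launch because a failure to find
the kernel is a launch error, which cudaGetLastError() returns without waiting
for cudaDeviceSynchronize().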
