https://llvm.org/bugs/show_bug.cgi?id=26483
Bug ID: 26483
Summary: CUDA __device__ lambdas generating incorrect code
Product: clang
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: unassignedclangb...@nondot.org
Reporter: justin.le...@gmail.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

The following CUDA code, which passes a __device__ functor to a kernel,
compiles fine but does not give the correct output. When I compile and run it,
I get the following output:

Error 3: invalid device function
Error: 0 2883584
Error: 1 0
Error: 2 2883584
Error: 3 0
Error: 4 3105024
Error: 5 0
Error: 6 0
Error: 7 0
Error: 8 0
Error: 9 0
Error: 10 0
Error: 11 0
Error: 12 0
Error: 13 0
Error: 14 0
Error: 15 0

The "error 3 invalid device function" is probably the root of the problem; it
seems like we may not be running any device code at all.

#include <cuda_runtime.h>
#include <cuda.h>
#include <cstdio>

template<class Functor>
__global__ void foo(const Functor f) {
  f(blockIdx.x*blockDim.x+threadIdx.x);
}

template<class Functor>
void run(const Functor& f) {
  foo<Functor> <<<256,256>>> (f);
}

int main() {
  int* h_a = new int[256*256];
  int d_c = 9;
  int* d_a;
  cudaMalloc(&d_a,256*256*sizeof(int));

  run([=] (int i) __device__ {
    d_a[i] = d_c;
  });

  cudaMemcpy(h_a,d_a,256*256*sizeof(int),cudaMemcpyDeviceToHost);
  printf("Error 3: %s\n",cudaGetErrorString(cudaGetLastError()));
  for(int i=0;i<16;i++)
    if(h_a[i]!=d_c) printf("Error: %i %i\n",i,h_a[i]);
}
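One way to test the hypothesis that no device code runs at all is to query the error state immediately after the launch, and to use a named functor as a control in place of the lambda. The following is only a sketch and not part of the original reproducer: the `check` helper and the `Store` functor are hypothetical names introduced here for illustration. If the control launches cleanly while the lambda version reports "invalid device function", that would point at how the lambda's kernel instantiation is emitted rather than at the launch configuration.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper (not in the original report): report a failed CUDA
// runtime call and tell the caller whether it succeeded.
static bool check(cudaError_t err, const char* what) {
  if (err != cudaSuccess) {
    printf("%s failed: %s\n", what, cudaGetErrorString(err));
    return false;
  }
  return true;
}

// Hypothetical named functor used as a control instead of the
// __device__ lambda from the report.
struct Store {
  int* out;
  int value;
  __device__ void operator()(int i) const { out[i] = value; }
};

// Same kernel shape as in the report.
template <class Functor>
__global__ void foo(const Functor f) {
  f(blockIdx.x * blockDim.x + threadIdx.x);
}

int main() {
  const int n = 256 * 256;
  int* d_a = nullptr;
  if (!check(cudaMalloc(&d_a, n * sizeof(int)), "cudaMalloc")) return 1;

  foo<Store><<<256, 256>>>(Store{d_a, 9});

  // Launch failures such as "invalid device function" are reported here,
  // before any copy back to the host.
  bool ok = check(cudaGetLastError(), "kernel launch");

  // Failures that occur while the kernel is running are reported at the
  // next synchronization point.
  ok = check(cudaDeviceSynchronize(), "kernel execution") && ok;

  cudaFree(d_a);
  return ok ? 0 : 1;
}
```

Checking the launch separately from the later cudaMemcpy makes it clear whether the garbage values in h_a are simply uninitialized device memory that the kernel never wrote.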