| Field | Value |
| --- | --- |
| Issue | 120836 |
| Summary | [Clang] [NVPTX] [Windows] Compilation errors with CUDA NVPTX backend and MSVC headers |
| Labels | clang |
| Assignees | |
| Reporter | blinkfrog |

**Describe the bug**
Compiling CUDA code with Clang's NVPTX backend on Windows against the MSVC headers fails with multiple errors: `__builtin_va_list` cannot be bound to MSVC's `va_list` (`char *`). Clang and the MSVC headers appear to disagree on the `va_list` type when compiling CUDA device code.
**To Reproduce**
Steps to reproduce the behavior:
1. Use the following simple CUDA program (`test.cu`):
```
#include <iostream>
#include <cuda_runtime.h>

__global__ void addKernel(int *c, const int *a, const int *b) {
    int i = threadIdx.x;
    c[i] = a[i] + b[i];
}

int main() {
    const int arraySize = 5;
    int a[arraySize] = {1, 2, 3, 4, 5};
    int b[arraySize] = {10, 20, 30, 40, 50};
    int c[arraySize] = {0};
    int *dev_a = nullptr, *dev_b = nullptr, *dev_c = nullptr;

    cudaMalloc((void**)&dev_a, arraySize * sizeof(int));
    cudaMalloc((void**)&dev_b, arraySize * sizeof(int));
    cudaMalloc((void**)&dev_c, arraySize * sizeof(int));

    cudaMemcpy(dev_a, a, arraySize * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, b, arraySize * sizeof(int), cudaMemcpyHostToDevice);

    addKernel<<<1, arraySize>>>(dev_c, dev_a, dev_b);

    cudaMemcpy(c, dev_c, arraySize * sizeof(int), cudaMemcpyDeviceToHost);

    std::cout << "Results: ";
    for (int i = 0; i < arraySize; ++i) {
        std::cout << c[i] << " ";
    }
    std::cout << std::endl;

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return 0;
}
```
2. Compile the code using Clang with the following command (adjust paths according to your environment):
```
clang++ -std=c++14 --cuda-gpu-arch=sm_75 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4" -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\lib\x64" -lcudart_static -ldl -lrt -pthread test.cu -o test.exe
```
3. Observe the following errors (truncated):
```
In file included from <built-in>:1:
In file included from C:\llvm\lib\clang\19\include\__clang_cuda_runtime_wrapper.h:472:
In file included from C:\llvm\lib\clang\19\include\__clang_cuda_cmath.h:16:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\limits:12:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\cwchar:11:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\cstdio:11:
In file included from C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt\stdio.h:13:
C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt\corecrt_wstdio.h:486:24: error: non-const lvalue reference to type '__builtin_va_list' cannot bind to a value of unrelated type 'va_list' (aka 'char *')
486 | __crt_va_start(_ArgList, _Locale);
| ^~~~~~~~
C:\llvm\lib\clang\19\include\vadefs.h:39:54: note: expanded from macro '__crt_va_start'
39 | #define __crt_va_start(ap, param) __builtin_va_start(ap, param)
| ^~
In file included from <built-in>:1:
In file included from C:\llvm\lib\clang\19\include\__clang_cuda_runtime_wrapper.h:472:
In file included from C:\llvm\lib\clang\19\include\__clang_cuda_cmath.h:16:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\limits:12:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\cwchar:11:
In file included from C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\include\cstdio:11:
In file included from C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt\stdio.h:13:
C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt\corecrt_wstdio.h:488:22: error: non-const lvalue reference to type '__builtin_va_list' cannot bind to a value of unrelated type 'va_list' (aka 'char *')
488 | __crt_va_end(_ArgList);
| ^~~~~~~~
(truncated)
```
**Expected behavior**
The program should compile without errors related to standard library headers.
**Observed behavior**
Compilation fails with the above `va_list` errors, indicating a mismatch between Clang’s built-in types and MSVC’s definitions in CUDA device mode.
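To illustrate the class of error in isolation, here is a minimal sketch in plain C++ that needs neither CUDA nor the MSVC headers. The names `builtin_va_list_like`, `msvc_va_list_like`, `va_start_like`, and `can_bind` are hypothetical stand-ins, and using `void *` for the device-side type is an assumption; the diagnostic only confirms that `va_list` is `char *` and that `__builtin_va_list` is an unrelated type.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins, not the real compiler or CRT types.
using builtin_va_list_like = void*;  // ASSUMPTION: device-side __builtin_va_list
using msvc_va_list_like    = char*;  // from the diagnostic: 'va_list' (aka 'char *')

// Models a builtin that takes its argument by non-const lvalue reference,
// as __builtin_va_start does in the error message.
void va_start_like(builtin_va_list_like& ap) { ap = nullptr; }

// Detects at compile time whether va_start_like accepts an lvalue of type T.
template <typename T, typename = void>
struct can_bind : std::false_type {};

template <typename T>
struct can_bind<T, decltype(va_start_like(std::declval<T&>()))>
    : std::true_type {};

// An lvalue of the same type binds to the reference parameter.
static_assert(can_bind<builtin_va_list_like>::value,
              "matching type binds");

// A char* lvalue would need a conversion to void*, which yields a temporary;
// a non-const lvalue reference cannot bind to it, so the call is ill-formed.
static_assert(!can_bind<msvc_va_list_like>::value,
              "unrelated pointer type does not bind");
```

If this model is accurate, the failure does not depend on the contents of `test.cu`: any device-side translation unit that reaches `corecrt_wstdio.h` with the two types out of sync would hit the same diagnostic.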
**Environment:**
- LLVM Clang version: 19.1.6
- CUDA toolkit version: 12.4
- Visual Studio version: 2022 Community Edition 17.12.3
- Windows SDK version: 10.0.22621.0

Command used for compilation:
```
clang++ -std=c++14 --cuda-gpu-arch=sm_75 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4" -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\lib\x64" -lcudart_static -ldl -lrt -pthread test.cu -o test.exe
```
**Additional context**
These errors suggest a compatibility issue between Clang’s NVPTX backend and the MSVC headers when used in device compilation. This blocks our development with Clang + CUDA on Windows. Any guidance or fixes would be greatly appreciated.
Thank you for looking into this issue.