tra added a comment.

In D129536#3663957 <https://reviews.llvm.org/D129536#3663957>, @jdoerfert wrote:

> @tra, unsure about the crash. For me this passes fine (no gpu), is anything
> missing?

The tests in the patch are running with `-emit-llvm`, so they are not actually lowering to NVPTX, and that's where the failure happens.
https://godbolt.org/z/cchaWxrhn


================
Comment at: clang/lib/Headers/__clang_cuda_intrinsics.h:237-238
-inline __device__ unsigned int
-__match64_any_sync(unsigned int mask, unsigned long long value) {
+inline __device__ unsigned int __match64_any_sync(unsigned int mask,
+                                                  unsigned long long value) {
   return __nvvm_match_any_sync_i64(mask, value);
----------------
Nit: this change is irrelevant to the patch and can be removed.

================
Comment at: clang/test/CodeGenCUDA/shuffle_long_long.cu:9
+#undef __CUDA_ARCH__
+#define __CUDA_ARCH__ 300
+
----------------
This macro should not be set. If you do need something to be compiled for sm_30, you should specify it via `-target-cpu sm_30`.

================
Comment at: clang/test/CodeGenCUDA/shuffle_long_long.cu:14
+#define warpSize 32
+#include "__clang_cuda_intrinsics.h"
+
----------------
Nit: this should be `<...>` as we want the include to be found in the compiler's include paths.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129536/new/

https://reviews.llvm.org/D129536

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits