[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/129927 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/129927

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should drop the nvvm reflect here. Really shouldn't have any subtarget dependent code here. Injecting implementation details into the source program is part of the fundamental issue with device lib linking https://github.com/llvm/llvm-project/pull/129927

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread Joseph Huber via cfe-commits
@@ -179,8 +179,10 @@ __gpu_shuffle_idx_u64(uint64_t __lane_mask, uint32_t __idx, uint64_t __x,
 _DEFAULT_FN_ATTRS static __inline__ uint64_t
 __gpu_match_any_u32(uint64_t __lane_mask, uint32_t __x) {
   // Newer targets can use the dedicated CUDA support.
-  if (__CUDA_ARCH__ >=

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread Artem Belevich via cfe-commits
@@ -179,8 +179,10 @@ __gpu_shuffle_idx_u64(uint64_t __lane_mask, uint32_t __idx, uint64_t __x,
 _DEFAULT_FN_ATTRS static __inline__ uint64_t
 __gpu_match_any_u32(uint64_t __lane_mask, uint32_t __x) {
   // Newer targets can use the dedicated CUDA support.
-  if (__CUDA_ARCH__ >=

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread Artem Belevich via cfe-commits
@@ -179,8 +179,10 @@ __gpu_shuffle_idx_u64(uint64_t __lane_mask, uint32_t __idx, uint64_t __x,
 _DEFAULT_FN_ATTRS static __inline__ uint64_t
 __gpu_match_any_u32(uint64_t __lane_mask, uint32_t __x) {
   // Newer targets can use the dedicated CUDA support.
-  if (__CUDA_ARCH__ >=

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread via cfe-commits
llvmbot wrote:
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-x86
Author: Joseph Huber (jhuber6)
Changes
Summary: These require `+ptx` features to be set even though they're guarded by the `__nvvm_reflect`. Rather than figure out how to hack around that with the `target` attribute

[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

2025-03-05 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/129927 Summary: These require `+ptx` features to be set even though they're guarded by the `__nvvm_reflect`. Rather than figure out how to hack around that with the `target` attribute I'm just going to disable it for 'g
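
As a rough illustration of the general problem (not the code from this patch): a runtime `if` that names an architecture macro only compiles when that macro is defined, whereas a preprocessor guard drops the arch-specific path entirely when no arch is set. The sketch below assumes a hypothetical macro `DEMO_ARCH` standing in for something like `__CUDA_ARCH__`, and a hypothetical helper `demo_uses_fast_path`; neither name comes from the PR.

```c
/* DEMO_ARCH is a hypothetical stand-in for an architecture macro such as
 * __CUDA_ARCH__. It is intentionally left undefined here, which mirrors
 * including the header with no arch set. */

/* Returns 1 when the arch-specific path was compiled in, 0 otherwise.
 * Because the check happens in the preprocessor, the arch-specific branch
 * is never seen by the compiler when DEMO_ARCH is missing or too old. */
static int demo_uses_fast_path(void) {
#if defined(DEMO_ARCH) && DEMO_ARCH >= 700
  return 1; /* an arch-specific builtin call would live here */
#else
  return 0; /* generic fallback that needs no target features */
#endif
}
```

Compiling with `-DDEMO_ARCH=700` selects the first branch; with no definition the fallback is used. Whether the actual patch uses a guard of this shape is not visible from the truncated summary above.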