tra added inline comments.
================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:460-468
+TARGET_BUILTIN(__nvvm_redux_sync_add_s32, "SiSii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_min_s32, "SiSii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_max_s32, "SiSii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_add_u32, "UiUii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_min_u32, "UiUii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_max_u32, "UiUii", "", SM_80)
+TARGET_BUILTIN(__nvvm_redux_sync_and_b32, "iii", "", SM_80)
----------------
Instead of creating one builtin per integer variant, can we use a more generic builtin `__nvvm_redux_sync_add_i`, similar to how we handle `__nvvm_atom_add_gen_i`?

================
Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:4103
+// redux.sync.add.u32 dst, src, membermask;
+def int_nvvm_redux_sync_add_u32 : GCCBuiltin<"__nvvm_redux_sync_add_u32">,
+  Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
----------------
This could also be consolidated into an overloaded intrinsic operating on `llvm_anyint_ty`.

================
Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:4105
+  Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
+            [IntrConvergent, IntrNoMem]>;
+
----------------
Similar to `shfl`, these intrinsics probably need `IntrInaccessibleMemOnly` as they exchange data with other threads.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100124/new/

https://reviews.llvm.org/D100124

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
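
For reference on the `IntrInaccessibleMemOnly` point above: the result of `redux.sync.add.u32` in one lane depends on the `src` values held by the other lanes named in `membermask`, which is why it exchanges data across threads. Below is a rough host-side model of that reduction, not device code and not part of the patch. The helper name `model_redux_sync_add_u32`, the constant `kWarpSize`, and the array that stands in for per-lane registers are all assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>

// Assumption: a warp has 32 lanes, matching the 32-bit membermask operand.
constexpr int kWarpSize = 32;

// Hypothetical host-side model (not the real builtin): lane_values[i] stands
// in for the `src` operand held by lane i. Every participating lane would
// receive the same returned value.
std::uint32_t model_redux_sync_add_u32(
    const std::array<std::uint32_t, kWarpSize>& lane_values,
    std::uint32_t membermask) {
  std::uint32_t sum = 0;
  for (int lane = 0; lane < kWarpSize; ++lane) {
    // Only lanes whose bit is set in membermask contribute to the result.
    if (membermask & (1u << lane)) {
      sum += lane_values[lane];  // unsigned addition wraps modulo 2^32
    }
  }
  return sum;
}
```

With lane values 1 through 32, a full mask gives 528 and a mask of `0xF` gives 10, so the value seen by any one lane changes with what the other lanes supply.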