tra added inline comments.
================ Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:460-468 +TARGET_BUILTIN(__nvvm_redux_sync_add_s32, "SiSii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_min_s32, "SiSii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_max_s32, "SiSii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_add_u32, "UiUii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_min_u32, "UiUii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_max_u32, "UiUii", "", SM_80) +TARGET_BUILTIN(__nvvm_redux_sync_and_b32, "iii", "", SM_80) ---------------- steffenlarsen wrote: > tra wrote: > > Instead of creating one builtin per integer variant, can we use a more > > generic builtin `__nvvm_redux_sync_add_i`, similar to how we handle > > `__nvvm_atom_add_gen_i` ? > > > What gives me pause is that a for atomic minimum there are both > `__nvvm_atom_min_gen_i` and `__nvvm_atom_min_gen_ui` to distinguish between > signed and unsigned. What makes the difference? > > That noted, I'll happily rename the builtins to be more in line with the > other builtins. `__nvvm_redux_sync_*_i` and `__nvvm_redux_sync_*_ui` maybe? > What gives me pause is that a for atomic minimum there are both > __nvvm_atom_min_gen_i and __nvvm_atom_min_gen_ui to distinguish between > signed and unsigned. What makes the difference? Good point. We do not need unsigned variant for `add`. We do need explicit signed and unsigned variants ad LLVM IR integer types do not take signedness into account, and the underlying min/max instructions do. Maybe, rename min_i/min_ui -> min/umin as LLVM does with atomics? We may skip the `_i` suffix on logical ops as they only apply to integers anyways. ================ Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:4103 +// redux.sync.add.u32 dst, src, membermask; +def int_nvvm_redux_sync_add_u32 : GCCBuiltin<"__nvvm_redux_sync_add_u32">, + Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], ---------------- steffenlarsen wrote: > tra wrote: > > This could also be consolidated into an overloaded intrinsic operating on > > `llvm_anyint_ty` > I am having a look at other uses of this, but I'm having difficulty wrapping > my head around how to map these overloads to the PTX instructions in > llvm/lib/Target/NVPTX/NVPTXIntrinsics.td. Though it would be nice, it just > seems overly complicated to get a signed and an unsigned 32-bit integer > version of each of these intrinsics. Considering that `redux` only supports 32-bit integers, we do not need to get into it. `llvm_i32_ty` will do for now. We'll cross the bridge if/when we get to support multiple integer types. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100124/new/ https://reviews.llvm.org/D100124 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits