tra added a comment. Looks good overall. Please do check that the generated PTX does get assembled by ptxas.
There are few newer variants of these instructions that appear to be missing. E.g. `{min/max}.xorsign.abs`. If you only intended to add instructions available in PTX-7.0, which, based on the constraints used in the patch, appears to be the case, I'd mention that in the commit log. ================ Comment at: clang/test/CodeGen/builtins-nvptx.c:822 + // CHECK_PTX70_SM80: call i16 @llvm.nvvm.fmin.bf16 + __nvvm_fmin_bf16(0x1234, 0x7FBF); + // CHECK_PTX70_SM80: call i16 @llvm.nvvm.fmin.nan.bf16 ---------------- I'd `#define` the magic values to give them sensible names. ================ Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:580 - def int_nvvm_fmin_d : GCCBuiltin<"__nvvm_fmin_d">, - DefaultAttrsIntrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty], - [IntrNoMem, IntrSpeculatable, Commutative]>; - def int_nvvm_fmax_d : GCCBuiltin<"__nvvm_fmax_d">, - DefaultAttrsIntrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty], - [IntrNoMem, IntrSpeculatable, Commutative]>; + foreach capability = ["_f16", "_ftz_f16", "_nan_f16", "_ftz_nan_f16"] in { + def int_nvvm_f # operation # capability : ---------------- Nit: `variant` might work better here and below. ================ Comment at: llvm/lib/Target/NVPTX/NVPTXIntrinsics.td:620 +class MIN_MAX_TUPLE<string C, Intrinsic I, NVPTXRegClass RC> { + string Capacity = C; + Intrinsic Intr = I; ---------------- Ditto. Capacity->Variant. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D117887/new/ https://reviews.llvm.org/D117887 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits