JackAKirk marked 2 inline comments as done. JackAKirk added a comment. ping @tra
I thought I should let you know that I do not have commit access to land the patch. I'm also happy to wait a little longer in case you think other interested parties might still chime in. Thanks ================ Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:405 +TARGET_BUILTIN(__nvvm_ff2v2bf_rn, "ZUiff", "", AND(SM_80,PTX70)) +TARGET_BUILTIN(__nvvm_ff2v2bf_rn_relu, "ZUiff", "", AND(SM_80,PTX70)) ---------------- tra wrote: > Nit: `ff2v2bf` is a bit hard to parse. I initially tried to interpret it as > "convert ff2v to bf" and was confused about what exactly does `2v` part mean > -- we already have `ff` to denote two floats. > > Perhaps `ff2bf16x2` would be a bit easier to read and understand. It would > also work consistently for `f16` and `tf32` variants below. Thanks for the comment. I also think your suggestion is better. I've now switched to this new naming convention. ================ Comment at: clang/test/CodeGen/builtins-nvptx.c:760 +// CHECK-LABEL: nvvm_cvt_sm80 +__device__ void nvvm_cvt_sm80() { +#if __CUDA_ARCH__ >= 800 ---------------- tra wrote: > Can you try compiling this test file all the way to .o so we're sure that > ptxas does accept the PTX we end up generating. > Sure, I've attached the output ptx for the convert portion of the test file, along with the test: I added store operations in order for the convert instructions to not be optimised away. {F21499889} {F21499888} Also please be aware that there appears to be a bug in the existing builtins-nvptx.c test file that is apparent when trying to compile down to ptx using the shfl part of the test file: I get: ``` fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.shfl.idx.f32 ``` CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116673/new/ https://reviews.llvm.org/D116673 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits