[PATCH] D116673: [Clang][NVPTX]Add NVPTX intrinsics and builtins for CUDA PTX cvt sm80 instructions

Jack Kirk via Phabricator via cfe-commits Wed, 12 Jan 2022 06:15:57 -0800

JackAKirk marked 2 inline comments as done.
JackAKirk added a comment.

ping @tra


I thought I should let you know that I do not have commit access to land the 
patch.  I'm also happy to wait a little longer in case you think other 
interested parties might still chime in.

Thanks



================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:405
 
+TARGET_BUILTIN(__nvvm_ff2v2bf_rn, "ZUiff", "", AND(SM_80,PTX70))
+TARGET_BUILTIN(__nvvm_ff2v2bf_rn_relu, "ZUiff", "", AND(SM_80,PTX70))
----------------
tra wrote:
> Nit: `ff2v2bf` is a bit hard to parse. I initially tried to interpret it as 
> "convert ff2v to bf" and was confused about  what exactly does `2v` part mean 
> -- we already have `ff` to denote two floats.
> 
> Perhaps `ff2bf16x2` would be a bit easier to read and understand. It would 
> also work consistently for `f16` and `tf32` variants below.
Thanks for the comment.  I also think your suggestion is better.  I've now 
switched to this new naming convention.


================
Comment at: clang/test/CodeGen/builtins-nvptx.c:760
+// CHECK-LABEL: nvvm_cvt_sm80
+__device__ void nvvm_cvt_sm80() {
+#if __CUDA_ARCH__ >= 800
----------------
tra wrote:
> Can you try compiling this test file all the way to .o so we're sure that 
> ptxas does accept the PTX we end up generating.
> 
Sure, I've attached the output ptx for the convert portion of the test file, 
along with the test: I added store operations in order for the convert 
instructions to not be optimised away. 

{F21499889}

{F21499888}

Also please be aware that there appears to be a bug in the existing 
builtins-nvptx.c test file that is apparent when trying to compile down to ptx 
using the shfl part of the test file: I get:
```
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.shfl.idx.f32
``` 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116673/new/

https://reviews.llvm.org/D116673

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D116673: [Clang][NVPTX]Add NVPTX intrinsics and builtins for CUDA PTX cvt sm80 instructions

Reply via email to