================
@@ -596,6 +605,28 @@ def __nvvm_e4m3x2_to_f16x2_rn_relu : 
NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(sh
 def __nvvm_e5m2x2_to_f16x2_rn : NVPTXBuiltinSMAndPTX<"_Vector<2, 
__fp16>(short)", SM_89, PTX81>;
 def __nvvm_e5m2x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2, 
__fp16>(short)", SM_89, PTX81>;
 
+def __nvvm_ff_to_e2m3x2_rn : NVPTXBuiltinSMAndPTX<"short(float, float)", 
SM<"100a", [SM_101a, SM_120a]>, PTX86>;
----------------
Artem-B wrote:

The reason I'm asking is that the underlying storage 'container' is not always 
the best representation of the type for the front-end, where we may need more 
nuanced type information.
The use of opaque integers historically stems from NVCC and the CUDA SDK using 
them because NVCC itself had no idea of non-standard types like fp16, so 
everything was done via inline asm on opaque types. That effectively prevents 
clang/LLVM from optimizing those types.

Correctly conveying the type information to the compiler is useful, even if 
opaque types sort of work, too.

I would argue that a 2-element vector of opaque uint8_t would be a better 
representation. While we do not have support for the native f8 types, it would 
at least let the compiler know that it's a 2-element vector.
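
To make the contrast concrete, here is a minimal host-side sketch of the two 
representations. The aliases `PackedOpaque` and `PackedVec2` and both helper 
functions are hypothetical names chosen for illustration; they are not the 
builtin's actual signature, and `std::array` only stands in for a real vector 
type.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical aliases for illustration only; not the builtin's real signature.
using PackedOpaque = short;                      // opaque 16-bit container
using PackedVec2 = std::array<std::uint8_t, 2>;  // two explicit 8-bit lanes

static_assert(sizeof(PackedOpaque) == sizeof(PackedVec2),
              "both forms occupy the same 16 bits");

// Reinterpret the opaque container as two lanes (byte order follows the host).
inline PackedVec2 to_lanes(PackedOpaque packed) {
    PackedVec2 lanes{};
    std::memcpy(lanes.data(), &packed, sizeof(packed));
    return lanes;
}

// Pack two lanes back into the opaque container.
inline PackedOpaque to_opaque(const PackedVec2 &lanes) {
    PackedOpaque packed = 0;
    std::memcpy(&packed, lanes.data(), sizeof(packed));
    return packed;
}
```

Both forms carry the same bits, but only the second one states in the type 
that there are two independent 8-bit elements.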

That said, the benefits here and now are marginal, so in practical terms 
'short' is OK, but I'm worried about painting ourselves into a corner 
long-term: once these builtins land in a public release, changing them to 
something else would be hard.

Long term we should probably address all the cases where we're using opaque 
types for things that the compiler does know about (fp16/bf16, maybe eventually 
f8, too, as its use becomes more widespread).

For now, 'short' is fine.

https://github.com/llvm/llvm-project/pull/134345
