================
@@ -596,6 +605,28 @@ def __nvvm_e4m3x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(sh
 def __nvvm_e5m2x2_to_f16x2_rn : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(short)", SM_89, PTX81>;
 def __nvvm_e5m2x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(short)", SM_89, PTX81>;
+def __nvvm_ff_to_e2m3x2_rn : NVPTXBuiltinSMAndPTX<"short(float, float)", SM<"100a", [SM_101a, SM_120a]>, PTX86>;
----------------
Artem-B wrote:

The reason I'm asking is that the underlying storage 'container' is not always the best representation of the type for the front-end, where we may need more nuanced type information.

The use of opaque integers historically stems from NVCC and the CUDA SDK: NVCC itself had no idea of non-standard types like fp16, so everything was done via inline asm on opaque types. That effectively prevents clang/LLVM from optimizing those types. Correctly conveying the type information to the compiler is useful, even if opaque types sort of work, too.

I would argue that a 2-element vector of opaque uint8_t would be a better representation. While we do not have support for native f8 types, it would at least let the compiler know that it's a 2-element vector.

That said, the benefits here and now are marginal, so in practical terms 'short' is OK. What I'm worried about is painting ourselves into a corner long-term: once these builtins land in a public release, changing them to something else would be hard.

Long term we should probably address all the cases where we're using opaque types for things that the compiler does know about (fp16/bf16, and maybe eventually f8, too, as its use becomes more widespread).

For now, 'short' is fine.

https://github.com/llvm/llvm-project/pull/134345
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits