================
@@ -460,6 +478,52 @@ def __nvvm_add_rz_d : NVPTXBuiltin<"double(double, 
double)">;
 def __nvvm_add_rm_d : NVPTXBuiltin<"double(double, double)">;
 def __nvvm_add_rp_d : NVPTXBuiltin<"double(double, double)">;
 
+def __nvvm_add_mixed_f16_f32 : NVPTXBuiltinSMAndPTX<"float(__fp16, float)", 
SM_100, PTX86>;
+def __nvvm_add_mixed_rn_f16_f32 : NVPTXBuiltinSMAndPTX<"float(__fp16, float)", 
SM_100, PTX86>;
----------------
Artem-B wrote:

This set of intrinsics appears to be regular enough to consider using tablegen 
loops to generate them.

Not sure if it's going to end up being an improvement, but if it would reduce 
the boilerplate, it may be worth giving it a try.

https://github.com/llvm/llvm-project/pull/168359
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to