[clang] [llvm] AMDGPU: Support v_wmma_f32_16x16x128_f8f6f4 on gfx1250 (PR #149684)

Changpeng Fang via cfe-commits Mon, 21 Jul 2025 09:52:24 -0700

================
@@ -6627,6 +6627,54 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID, 
CallBase &Call) {
           "invalid vector type for format", &Call, Src1, 
Call.getArgOperand(5));
     break;
   }
+  case Intrinsic::amdgcn_wmma_f32_16x16x128_f8f6f4: {
+    Value *Src0 = Call.getArgOperand(1);
+    Value *Src1 = Call.getArgOperand(3);
+
+    unsigned FmtA = cast<ConstantInt>(Call.getArgOperand(0))->getZExtValue();
+    unsigned FmtB = cast<ConstantInt>(Call.getArgOperand(2))->getZExtValue();
+    Check(FmtA <= 4, "invalid value for matrix format", Call,
+          Call.getArgOperand(0));
+    Check(FmtB <= 4, "invalid value for matrix format", Call,
+          Call.getArgOperand(2));
+
+    // AMDGPU::MatrixFMT values
----------------
changpeng wrote:


> Most of this looks identical to the gfx950 mfma one, is it possible to merge 
> them

The operand layout is different from mfma, and merging them needs operand 
selection in multiple places.
Maybe keeping the current (separate) implementation is better for reading and 
understanding.

https://github.com/llvm/llvm-project/pull/149684
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] AMDGPU: Support v_wmma_f32_16x16x128_f8f6f4 on gfx1250 (PR #149684)

Reply via email to