[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

LuoYuanke via Phabricator via cfe-commits Thu, 05 Aug 2021 06:45:18 -0700

LuoYuanke added inline comments.


================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:801
   //  0b00010: implied 0F 38 leading opcode bytes
   //  0b00011: implied 0F 3A leading opcode bytes
   //  0b00100-0b11111: Reserved for future use
----------------
Add comments for map5 and map6?


================
Comment at: llvm/lib/Target/X86/X86.td:189
+// guarded under condition hasVLX. So we imply it in FeatureFP16 currently.
+// FIXME: FP16 conversion between f16 and i64 customise type v8i64, which is
+// supposed to be guarded under condition hasDQI. So we imply it in FeatureFP16
----------------
customize?


================
Comment at: llvm/lib/Target/X86/X86FastISel.cpp:2291
+  case MVT::i16: Opc = X86::CMOV_GR16;  break;
+  case MVT::f16: Opc = X86::CMOV_FR16X; break;
+  case MVT::i32: Opc = X86::CMOV_GR32;  break;
----------------
Also add it in isCMOVPseudo()?


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1946
+        setGroup(VT);
+      }
+      setOperationAction(ISD::SCALAR_TO_VECTOR,   MVT::v8f16,  Legal);
----------------
Drop the brace.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:10549
 
-      if (EltVT == MVT::i32 || EltVT == MVT::f32 || EltVT == MVT::f64 ||
-          (EltVT == MVT::i64 && Subtarget.is64Bit())) {
+      if (EltVT == MVT::i32 || EltVT == MVT::f16 || EltVT == MVT::f32 ||
+          EltVT == MVT::f64 || (EltVT == MVT::i64 && Subtarget.is64Bit()) ||
----------------
Need check Subtarget.hasFP16()?


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:10551
+          EltVT == MVT::f64 || (EltVT == MVT::i64 && Subtarget.is64Bit()) ||
+          (EltVT == MVT::i16 && Subtarget.hasFP16())) {
         assert((VT.is128BitVector() || VT.is256BitVector() ||
----------------
Why handle i16? Isn't it handled by movw?


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:10744
   // For SSE 4.1, use insertps to put the high elements into the low element.
-  if (Subtarget.hasSSE41()) {
+  if (Subtarget.hasSSE41() && EltVT != MVT::f16) {
     SDValue Result;
----------------
Why exclude f16? Is there better choice for fp16?


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:19023
 
-    // SHUFPS the element to the lowest double word, then movss.
-    int Mask[4] = { static_cast<int>(IdxVal), -1, -1, -1 };
+    // SHUFPS the element to the lowest double word, then movsh.
+    SmallVector<int, 8> Mask(VecVT.getVectorNumElements(), -1);
----------------
movss/movsh


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105263/new/

https://reviews.llvm.org/D105263

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

Reply via email to