[PATCH] D91927: [X86] Add x86_amx type for intel AMX.

LuoYuanke via Phabricator via cfe-commits Tue, 22 Dec 2020 15:21:05 -0800

LuoYuanke added inline comments.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:264
+  SmallVector<Instruction *, 8> DeadInsts;
+  SmallVector<Instruction *, 2> DeadBitcasts;
+
----------------
pengfei wrote:
> Maybe better to use BitCastInst?
There may be dead load or store instructions.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:280
+      Type *Ty = Bitcast->getType();
+      auto CanonicalizeBitcast = [&]() {
+        bool Change = false;
----------------
pengfei wrote:
> Can we leave the canonicalize bitcast cases a single patch. It's a bit 
> complex here and I don't think it's a common case.
Ok, I'll create another patch for it.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:193
+    // %2 = load <256 x i32>, <256 x i32>* %addr, align 1024
+    auto *II = cast<IntrinsicInst>(Src);
+    Value *Row = II->getOperand(0);
----------------
pengfei wrote:
> Is it possible the x86_amx operand isn't from AMX intrinsic, e.g.
> ```
> %src = bitcast <256 x i32> %xxx to x86_amx
> %2 = bitcast x86_amx %src to <256 x i32>
> ```
Good catch. I'll add support for this pattern.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:238
     for (Instruction &Inst : BB) {
-      LoadInst *LD = dyn_cast<LoadInst>(&Inst);
-      // Check load instruction.
-      // %3 = load <256 x i32>, <256 x i32>* %1, align 64
-      if (LD) {
-        FixedVectorType *VTy = dyn_cast<FixedVectorType>(Inst.getType());
-        if (!IsAMXType(VTy))
-          continue;
-        LDSet.insert(&Inst);
+      if (!dyn_cast<BitCastInst>(&Inst))
         continue;
----------------
pengfei wrote:
> Better to reuse the cast result, e.g.
> ```
> BitCastInst *BInst = dyn_cast<BitCastInst>(&Inst);
> if (!BInst )
> ```
> You can save several `cast<BitCastInst>(&Inst)` below.
That's good. Thanks.


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-across-func.ll:89-91
 attributes #2 = { nounwind uwtable 
"correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" 
"frame-pointer"="none" "less-precise-fpmad"="false" 
"min-legal-vector-width"="8192" "no-infs-fp-math"="false" 
"no-jump-tables"="false" "no-nans-fp-math"="false" 
"no-signed-zeros-fp-math"="false" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-cpu"="x86-64" 
"target-features"="+amx-int8,+amx-tile,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" 
"tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
 attributes #3 = { "correctly-rounded-divide-sqrt-fp-math"="false" 
"disable-tail-calls"="false" "frame-pointer"="none" 
"less-precise-fpmad"="false" "no-infs-fp-math"="false" 
"no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-cpu"="x86-64" 
"target-features"="+amx-int8,+amx-tile,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" 
"tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
 attributes #4 = { nounwind }
----------------
pengfei wrote:
> Better to remove these unused attributes. The same to other tests.
I'll create a separate patch to clean the attributes.


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-type.ll:67
+
+define dso_local void @test_src_add(<256 x i32> %x, <256 x i32> %y, i16 %r, 
i16 %c, i8* %buf, i64 %s) #2 {
+; CHECK-LABEL: @test_src_add(
----------------
pengfei wrote:
> For this and the next test, we have chances to optimize to memcpy if we can 
> make sure %s is constant 64.
If the stride is 64 we can transform the code to memcpy. How about do it in 
another patch?


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-type.ll:103
+
 define dso_local void @test_load(i8* %in, i8* %out) local_unnamed_addr #2 {
 ; CHECK-LABEL: @test_load(
----------------
pengfei wrote:
> We don't need to check this case now, right?
It can check the load and store instruction is not transformed if they are not 
participate in amx operation. I prefer to keep the case. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91927/new/

https://reviews.llvm.org/D91927

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D91927: [X86] Add x86_amx type for intel AMX.

Reply via email to