https://github.com/fzou1 approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/123270
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/116737
>From c1716f030d8503b5a4742447ef8883d900521c34 Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Tue, 19 Nov 2024 11:19:17 +0800
Subject: [PATCH 1/2] [X86][MC,LLD][NFC] Rename R_X86_64_REX2_GOTPCRELX to
R_X86_64_CODE
https://github.com/fzou1 approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/115660
@@ -0,0 +1,301 @@
+/*===- amxcomplextransposeintrin.h - AMX-COMPLEX and AMX-TRANSPOSE --===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Ap
@@ -0,0 +1,94 @@
+/*===- amxbf16transposeintrin.h - AMX-BF16 and AMX-TRANSPOSE ===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apa
@@ -0,0 +1,94 @@
+/*===- amxfp16transposeintrin.h - AMX-FP16 and AMX-TRANSPOSE ===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apa
@@ -275,6 +276,27 @@ std::pair
ShapeCalculator::getShape(IntrinsicInst *II,
Col = II->getArgOperand(1);
break;
}
+ case Intrinsic::x86_ttdpbf16ps_internal:
+ case Intrinsic::x86_ttdpfp16ps_internal:
+ case Intrinsic::x86_ttcmmimfp16ps_internal:
+ case Intrinsic::
fzou1 wrote:
> Missing IR test?
Sorry. Added. Thanks.
https://github.com/llvm/llvm-project/pull/115829
@@ -15,81 +15,214 @@
#define __AMXFP8INTRIN_H
#ifdef __x86_64__
-/// Peform the dot product of a BF8 value \a a by a BF8 value \a b accumulating
-/// into a Single Precision (FP32) source/dest \a dst.
+#define __DEFAULT_FN_ATTRS_FP8
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/115829
>From 9fd6e9e598423b6cc58a25fe70cc12a846483be5 Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 7 Nov 2024 11:56:17 +0800
Subject: [PATCH 1/2] [X86][AMX] Add AMX FP8 new APIs
This is a follow-up to #113850.
Re
https://github.com/fzou1 created
https://github.com/llvm/llvm-project/pull/115829
This is a follow-up to #113850.
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
>From 9fd6e9e598423b6cc58a25fe70cc12a846483be5 Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 7 Nov 2024 11:56:17 +0800
@@ -6101,6 +6101,25 @@ let TargetPrefix = "x86" in {
Intrinsic<[llvm_v16i32_ty],
[llvm_i16_ty, llvm_i16_ty, llvm_x86amx_ty,
llvm_i32_ty],
[]>;
+
+ def int_x86_tmmultf32ps : ClangBuiltin<"__builtin_ia32_tmmultf32ps"
@@ -660,6 +660,10 @@ _storebe_i64(void * __P, long long __D) {
#include
#endif
+#if !defined(__SCE__) || __has_feature(modules) || defined(__AMX_TF32__)
+#include
+#endif
+
fzou1 wrote:
Added.
https://github.com/llvm/llvm-project/pull/115625
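The include guard in the hunk above can be tried in isolation. The following is a minimal sketch, assuming only the condition quoted in the diff; the macro DEMO_AMX_TF32_HEADER_VISIBLE and the function amxTf32HeaderVisible are hypothetical stand-ins for "the header would be included" and are not part of the patch.

```cpp
// Clang provides __has_feature; define a fallback so other compilers accept
// the expression (a common portability idiom, assumed here for illustration).
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Same condition as the quoted hunk. Instead of including a header, record
// whether the guarded block would be entered. DEMO_* is a hypothetical name.
#if !defined(__SCE__) || __has_feature(modules) || defined(__AMX_TF32__)
#define DEMO_AMX_TF32_HEADER_VISIBLE 1
#else
#define DEMO_AMX_TF32_HEADER_VISIBLE 0
#endif

// Hypothetical helper exposing the result of the preprocessor check.
constexpr bool amxTf32HeaderVisible() {
  return DEMO_AMX_TF32_HEADER_VISIBLE != 0;
}
```

On a target that does not define __SCE__, the first clause is already true, so the guarded include is always taken; the feature macro only matters when __SCE__ is defined and modules are off.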
@@ -151,6 +151,7 @@ set(x86_files
amxfp16intrin.h
amxfp8intrin.h
amxintrin.h
+ amxtf32intrin.h
fzou1 wrote:
Sorry. Forgot to add it. Done. Thanks.
https://github.com/llvm/llvm-project/pull/115625
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/115625
>From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Sat, 26 Oct 2024 18:44:32 +0800
Subject: [PATCH 1/4] [X86][AMX] Support AMX-TF32
Ref.: https://cdrdv2.intel.com/v1/dl/g
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/115625
>From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Sat, 26 Oct 2024 18:44:32 +0800
Subject: [PATCH 1/3] [X86][AMX] Support AMX-TF32
Ref.: https://cdrdv2.intel.com/v1/dl/g
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/115625
>From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Sat, 26 Oct 2024 18:44:32 +0800
Subject: [PATCH 1/2] [X86][AMX] Support AMX-TF32
Ref.: https://cdrdv2.intel.com/v1/dl/g
https://github.com/fzou1 created
https://github.com/llvm/llvm-project/pull/115625
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
>From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Sat, 26 Oct 2024 18:44:32 +0800
Subject: [PATCH] [X86][AMX] Supp
https://github.com/fzou1 approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/114070
@@ -369,3 +369,150 @@ let Predicates = [HasAMXTRANSPOSE, In64BitMode] in {
}
}
} // HasAMXTILE, HasAMXTRANSPOSE
+
+multiclass m_tcvtrowd2ps {
+ let Predicates = [HasAMXAVX512, In64BitMode] in {
fzou1 wrote:
Should add HasAVX10_2_512 in line 374, 390 and
https://github.com/fzou1 edited https://github.com/llvm/llvm-project/pull/114070
https://github.com/fzou1 commented:
LGTM, except that the last place is probably missing the avx10.2-512 dependency.
https://github.com/llvm/llvm-project/pull/114070
@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 ===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Ap
@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal,
"vUsUsUsV256i*V256i*vC*z",
TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z",
"n", "amx-transpose")
TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal,
"vUsUsUsV256i*V256i
@@ -559,12 +559,68 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
return true;
}
case X86::PTILELOADDV:
- case X86::PTILELOADDT1V: {
+ case X86::PTILELOADDT1V:
+ case X86::PTCVTROWD2PSrreV:
+ case X86::PTCVTROWD2PSrriV:
+ case X86::PTCVTROWPS2PBF16HrreV:
https://github.com/fzou1 approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/113532
https://github.com/fzou1 deleted
https://github.com/llvm/llvm-project/pull/113532
@@ -0,0 +1,248 @@
+/* ===--- amxtransposeintrin.h - AMX_TRANSPOSE intrinsics -*- C++ -*-===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Ap
@@ -34,9 +34,31 @@ class ShapeT {
if (MRI)
deduceImm(MRI);
}
+ // When ShapeT has mult shapes, we only use Shapes (never use Row and Col)
+ // and ImmShapes. Due to the most case is only one shape (just simply use
+ // Shape.Row or Shape.Col), so here we don't me
@@ -623,6 +623,37 @@ struct X86Operand final : public MCParsedAsmOperand {
Inst.addOperand(MCOperand::createReg(Reg));
}
+ bool isTILEPair() const {
+return Kind == Register &&
+ X86MCRegisterClasses[X86::TILERegClassID].contains(getReg());
---
@@ -919,23 +1017,66 @@ bool X86LowerAMXCast::optimizeAMXCastFromPhi(
return true;
}
+static Value *getShapeFromAMXIntrinsic(Value *Inst, unsigned ShapeIdx,
+ bool IsRow) {
+ if (!isAMXIntrinsic(Inst))
+return nullptr;
+
+ auto *II
@@ -34,9 +34,31 @@ class ShapeT {
if (MRI)
deduceImm(MRI);
}
+ // When ShapeT has mult shapes, we only use Shapes (never use Row and Col)
fzou1 wrote:
mult -> multiple
https://github.com/llvm/llvm-project/pull/113532
@@ -121,12 +137,96 @@ static Instruction
*getFirstNonAllocaInTheEntryBlock(Function &F) {
llvm_unreachable("No terminator in the entry block!");
}
-static std::pair getShape(IntrinsicInst *II, unsigned OpNo) {
+class ShapeCalculator {
+private:
+ TargetMachine *TM = nullpt
@@ -16920,6 +16920,58 @@ Value *CodeGenFunction::EmitX86BuiltinExpr(unsigned
BuiltinID,
// instruction, but it will create a memset that won't be optimized away.
return Builder.CreateMemSet(Ops[0], Ops[1], Ops[2], Align(1), true);
}
+ // Corresponding to intrisics w
@@ -80,28 +80,41 @@ INITIALIZE_PASS_BEGIN(X86FastTileConfig, DEBUG_TYPE,
INITIALIZE_PASS_END(X86FastTileConfig, DEBUG_TYPE,
"Fast Tile Register Configure", false, false)
-static bool isTileDef(MachineRegisterInfo *MRI, MachineInstr &MI) {
+static unsigned g
@@ -0,0 +1,83 @@
+/*===- amxfp8intrin.h - AMX intrinsics -*- C++ -*===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apa
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/113850
>From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 24 Oct 2024 21:56:48 +0800
Subject: [PATCH 1/5] Support AMX-FP8
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/113850
>From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 24 Oct 2024 21:56:48 +0800
Subject: [PATCH 1/3] Support AMX-FP8
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67
@@ -568,6 +568,131 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
MI.setDesc(TII->get(Opc));
return true;
}
+ // TILEPAIRLOAD is just for TILEPair spill, we don't have corresponding
+ // AMX instruction to support it. So, split it to 2 load instructions:
https://github.com/fzou1 updated
https://github.com/llvm/llvm-project/pull/113850
>From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 24 Oct 2024 21:56:48 +0800
Subject: [PATCH 1/2] Support AMX-FP8
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67
https://github.com/fzou1 created
https://github.com/llvm/llvm-project/pull/113850
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
>From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 24 Oct 2024 21:56:48 +0800
Subject: [PATCH] Support AMX-FP8
@@ -1815,12 +1822,12 @@ def : ProcModel<"pantherlake", AlderlakePModel,
def : ProcModel<"clearwaterforest", AlderlakePModel,
ProcessorFeatures.CWFFeatures, ProcessorFeatures.ADLTuning>;
def : ProcModel<"graniterapids", SapphireRapidsModel,
-Proce
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/97721
>From 3c75e22504416afae288723aff34120d88b100db Mon Sep 17 00:00:00 2001
From: Feng Zou
Date: Thu, 4 Jul 2024 15:43:12 +0800
Subject: [PATCH 1/2] Support branch hint
For more details about this feature, please refer
@@ -749,6 +749,11 @@ def TuningUseGLMDivSqrtCosts
: SubtargetFeature<"use-glm-div-sqrt-costs", "UseGLMDivSqrtCosts", "true",
"Use Goldmont specific floating point div/sqrt costs">;
+// Starting with Redwood Cove architecture, the branch has branch taken hint
+// (i
fzou1 wrote:
Thanks. It's simpler. What's the metadata for "!prof !0" and "!prof !1"?
https://github.com/llvm/llvm-project/pull/97721
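As background for the question above: in LLVM IR, `!prof` metadata attached to a conditional branch commonly carries `branch_weights`, which tell later passes how likely each successor is. The contents of `!0` and `!1` in this PR's test are not shown in the snippet and are not assumed here. The sketch below only illustrates one way such a hint can originate in source code, using `__builtin_expect` (supported by Clang and GCC); the function name `classify` and its return codes are hypothetical.

```cpp
// Hypothetical helper: returns 1 for negative input (marked as the unlikely
// path) and 0 otherwise. The name and return codes are illustrative only.
int classify(int value) {
  // __builtin_expect(cond, 0) states that cond is expected to be false.
  // With optimization enabled, a compiler may turn this expectation into
  // branch-probability information on the generated conditional branch.
  if (__builtin_expect(value < 0, 0))
    return 1; // expected cold path
  return 0;   // expected hot path
}
```

The hint changes only the compiler's view of branch probability, never the result: `classify(-1)` still returns 1 and `classify(5)` returns 0.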
https://github.com/fzou1 created https://github.com/llvm/llvm-project/pull/97721
For more details about this feature, please refer to the latest Intel 64 and IA-32
Architectures Optimization Reference Manual Volume 1:
https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32