[clang] [llvm] [X86][AMX-AVX512][NFC] Remove P from intrinsic and instruction name (PR #123270)

2025-01-17 Thread Feng Zou via cfe-commits
https://github.com/fzou1 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/123270 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [lld] [llvm] [X86][MC,LLD][NFC] Rename R_X86_64_REX2_GOTPCRELX (PR #116737)

2024-11-18 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/116737 From c1716f030d8503b5a4742447ef8883d900521c34 Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Tue, 19 Nov 2024 11:19:17 +0800 Subject: [PATCH 1/2] [X86][MC,LLD][NFC] Rename R_X86_64_REX2_GOTPCRELX to R_X86_64_CODE

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
https://github.com/fzou1 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/115660

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,301 @@ +/*===- amxcomplextransposeintrin.h - AMX-COMPLEX and AMX-TRANSPOSE --=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,94 @@ +/*===- amxbf16transposeintrin.h - AMX-BF16 and AMX-TRANSPOSE === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Apa

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,301 @@ +/*===- amxcomplextransposeintrin.h - AMX-COMPLEX and AMX-TRANSPOSE --=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,94 @@ +/*===- amxfp16transposeintrin.h - AMX-FP16 and AMX-TRANSPOSE === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Apa

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,301 @@ +/*===- amxcomplextransposeintrin.h - AMX-COMPLEX and AMX-TRANSPOSE --=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-13 Thread Feng Zou via cfe-commits
@@ -0,0 +1,301 @@ +/*===- amxcomplextransposeintrin.h - AMX-COMPLEX and AMX-TRANSPOSE --=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE, part 2 (PR #115660)

2024-11-12 Thread Feng Zou via cfe-commits
@@ -275,6 +276,27 @@ std::pair ShapeCalculator::getShape(IntrinsicInst *II, Col = II->getArgOperand(1); break; } + case Intrinsic::x86_ttdpbf16ps_internal: + case Intrinsic::x86_ttdpfp16ps_internal: + case Intrinsic::x86_ttcmmimfp16ps_internal: + case Intrinsic::

[clang] [llvm] [X86][AMX] Add AMX FP8 new APIs (PR #115829)

2024-11-12 Thread Feng Zou via cfe-commits
fzou1 wrote: > Missing IR test? Sorry. Added. Thanks. https://github.com/llvm/llvm-project/pull/115829

[clang] [llvm] [X86][AMX] Add AMX FP8 new APIs (PR #115829)

2024-11-12 Thread Feng Zou via cfe-commits
@@ -15,81 +15,214 @@ #define __AMXFP8INTRIN_H #ifdef __x86_64__ -/// Peform the dot product of a BF8 value \a a by a BF8 value \a b accumulating -/// into a Single Precision (FP32) source/dest \a dst. +#define __DEFAULT_FN_ATTRS_FP8

[clang] [llvm] [X86][AMX] Add AMX FP8 new APIs (PR #115829)

2024-11-12 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/115829 From 9fd6e9e598423b6cc58a25fe70cc12a846483be5 Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 7 Nov 2024 11:56:17 +0800 Subject: [PATCH 1/2] [X86][AMX] Add AMX FP8 new APIs This is a follow-up to #113850. Re

[clang] [llvm] [X86][AMX] Add AMX FP8 new APIs (PR #115829)

2024-11-11 Thread Feng Zou via cfe-commits
https://github.com/fzou1 created https://github.com/llvm/llvm-project/pull/115829 This is a follow-up to #113850. Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368 From 9fd6e9e598423b6cc58a25fe70cc12a846483be5 Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 7 Nov 2024 11:56:17 +0800

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-10 Thread Feng Zou via cfe-commits
@@ -6101,6 +6101,25 @@ let TargetPrefix = "x86" in { Intrinsic<[llvm_v16i32_ty], [llvm_i16_ty, llvm_i16_ty, llvm_x86amx_ty, llvm_i32_ty], []>; + + def int_x86_tmmultf32ps : ClangBuiltin<"__builtin_ia32_tmmultf32ps"

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-10 Thread Feng Zou via cfe-commits
@@ -660,6 +660,10 @@ _storebe_i64(void * __P, long long __D) { #include #endif +#if !defined(__SCE__) || __has_feature(modules) || defined(__AMX_TF32__) +#include +#endif + fzou1 wrote: Added. https://github.com/llvm/llvm-project/pull/115625

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-10 Thread Feng Zou via cfe-commits
@@ -151,6 +151,7 @@ set(x86_files amxfp16intrin.h amxfp8intrin.h amxintrin.h + amxtf32intrin.h fzou1 wrote: Sorry. Forgot to add it. Done. Thanks. https://github.com/llvm/llvm-project/pull/115625

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-10 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/115625 From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Sat, 26 Oct 2024 18:44:32 +0800 Subject: [PATCH 1/4] [X86][AMX] Support AMX-TF32 Ref.: https://cdrdv2.intel.com/v1/dl/g

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-09 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/115625 From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Sat, 26 Oct 2024 18:44:32 +0800 Subject: [PATCH 1/3] [X86][AMX] Support AMX-TF32 Ref.: https://cdrdv2.intel.com/v1/dl/g

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-09 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/115625 From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Sat, 26 Oct 2024 18:44:32 +0800 Subject: [PATCH 1/2] [X86][AMX] Support AMX-TF32 Ref.: https://cdrdv2.intel.com/v1/dl/g

[clang] [llvm] [X86][AMX] Support AMX-TF32 (PR #115625)

2024-11-09 Thread Feng Zou via cfe-commits
https://github.com/fzou1 created https://github.com/llvm/llvm-project/pull/115625 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368 From b1d9799b99b45b5af2b63868c4c3b139dbf9378c Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Sat, 26 Oct 2024 18:44:32 +0800 Subject: [PATCH] [X86][AMX] Supp

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Feng Zou via cfe-commits
https://github.com/fzou1 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/114070

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits
@@ -369,3 +369,150 @@ let Predicates = [HasAMXTRANSPOSE, In64BitMode] in { } } } // HasAMXTILE, HasAMXTRANSPOSE + +multiclass m_tcvtrowd2ps { + let Predicates = [HasAMXAVX512, In64BitMode] in { fzou1 wrote: Should add HasAVX10_2_512 in line 374, 390 and

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits
https://github.com/fzou1 edited https://github.com/llvm/llvm-project/pull/114070

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits
https://github.com/fzou1 commented: LGTM except the last place probably missing avx10.2-512 dependency. https://github.com/llvm/llvm-project/pull/114070

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits
@@ -0,0 +1,381 @@ +/*===- amxavx512intrin.h - AMXAVX512 === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits
@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, "vUsUsUsV256i*V256i*vC*z", TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose") TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, "vUsUsUsV256i*V256i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits
@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, "vUsUsUsV256i*V256i*vC*z", TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose") TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, "vUsUsUsV256i*V256i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits
@@ -0,0 +1,381 @@ +/*===- amxavx512intrin.h - AMXAVX512 === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits
@@ -0,0 +1,381 @@ +/*===- amxavx512intrin.h - AMXAVX512 === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits
@@ -0,0 +1,381 @@ +/*===- amxavx512intrin.h - AMXAVX512 === + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits
@@ -559,12 +559,68 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB, return true; } case X86::PTILELOADDV: - case X86::PTILELOADDT1V: { + case X86::PTILELOADDT1V: + case X86::PTCVTROWD2PSrreV: + case X86::PTCVTROWD2PSrriV: + case X86::PTCVTROWPS2PBF16HrreV:

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-11-01 Thread Feng Zou via cfe-commits
https://github.com/fzou1 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/113532

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-11-01 Thread Feng Zou via cfe-commits
https://github.com/fzou1 deleted https://github.com/llvm/llvm-project/pull/113532

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -0,0 +1,248 @@ +/* ===--- amxtransposeintrin.h - AMX_TRANSPOSE intrinsics -*- C++ -*-=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -34,9 +34,31 @@ class ShapeT { if (MRI) deduceImm(MRI); } + // When ShapeT has mult shapes, we only use Shapes (never use Row and Col) + // and ImmShapes. Due to the most case is only one shape (just simply use + // Shape.Row or Shape.Col), so here we don't me

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -623,6 +623,37 @@ struct X86Operand final : public MCParsedAsmOperand { Inst.addOperand(MCOperand::createReg(Reg)); } + bool isTILEPair() const { +return Kind == Register && + X86MCRegisterClasses[X86::TILERegClassID].contains(getReg()); ---

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -0,0 +1,248 @@ +/* ===--- amxtransposeintrin.h - AMX_TRANSPOSE intrinsics -*- C++ -*-=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -919,23 +1017,66 @@ bool X86LowerAMXCast::optimizeAMXCastFromPhi( return true; } +static Value *getShapeFromAMXIntrinsic(Value *Inst, unsigned ShapeIdx, + bool IsRow) { + if (!isAMXIntrinsic(Inst)) +return nullptr; + + auto *II

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -34,9 +34,31 @@ class ShapeT { if (MRI) deduceImm(MRI); } + // When ShapeT has mult shapes, we only use Shapes (never use Row and Col) fzou1 wrote: mult -> multiple https://github.com/llvm/llvm-project/pull/113532

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -121,12 +137,96 @@ static Instruction *getFirstNonAllocaInTheEntryBlock(Function &F) { llvm_unreachable("No terminator in the entry block!"); } -static std::pair getShape(IntrinsicInst *II, unsigned OpNo) { +class ShapeCalculator { +private: + TargetMachine *TM = nullpt

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -16920,6 +16920,58 @@ Value *CodeGenFunction::EmitX86BuiltinExpr(unsigned BuiltinID, // instruction, but it will create a memset that won't be optimized away. return Builder.CreateMemSet(Ops[0], Ops[1], Ops[2], Align(1), true); } + // Corresponding to intrisics w

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -0,0 +1,248 @@ +/* ===--- amxtransposeintrin.h - AMX_TRANSPOSE intrinsics -*- C++ -*-=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Ap

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-31 Thread Feng Zou via cfe-commits
@@ -80,28 +80,41 @@ INITIALIZE_PASS_BEGIN(X86FastTileConfig, DEBUG_TYPE, INITIALIZE_PASS_END(X86FastTileConfig, DEBUG_TYPE, "Fast Tile Register Configure", false, false) -static bool isTileDef(MachineRegisterInfo *MRI, MachineInstr &MI) { +static unsigned g

[clang] [llvm] [X86][AMX] Support AMX-FP8 (PR #113850)

2024-10-29 Thread Feng Zou via cfe-commits
@@ -0,0 +1,83 @@ +/*===- amxfp8intrin.h - AMX intrinsics -*- C++ -*=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Apa

[clang] [llvm] [X86][AMX] Support AMX-FP8 (PR #113850)

2024-10-29 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/113850 From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 24 Oct 2024 21:56:48 +0800 Subject: [PATCH 1/5] Support AMX-FP8 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67

[clang] [llvm] [X86][AMX] Support AMX-FP8 (PR #113850)

2024-10-29 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/113850 From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 24 Oct 2024 21:56:48 +0800 Subject: [PATCH 1/3] Support AMX-FP8 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-28 Thread Feng Zou via cfe-commits
@@ -568,6 +568,131 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB, MI.setDesc(TII->get(Opc)); return true; } + // TILEPAIRLOAD is just for TILEPair spill, we don't have corresponding + // AMX instruction to support it. So, split it to 2 load instructions:

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-28 Thread Feng Zou via cfe-commits
@@ -80,28 +80,41 @@ INITIALIZE_PASS_BEGIN(X86FastTileConfig, DEBUG_TYPE, INITIALIZE_PASS_END(X86FastTileConfig, DEBUG_TYPE, "Fast Tile Register Configure", false, false) -static bool isTileDef(MachineRegisterInfo *MRI, MachineInstr &MI) { +static unsigned g

[clang] [llvm] [X86][AMX] Support AMX-TRANSPOSE (PR #113532)

2024-10-28 Thread Feng Zou via cfe-commits
@@ -34,9 +34,31 @@ class ShapeT { if (MRI) deduceImm(MRI); } + // When ShapeT has mult shapes, we only use Shapes (never use Row and Col) + // and ImmShapes. Due to the most case is only one shape (just simply use + // Shape.Row or Shape.Col), so here we don't me

[clang] [llvm] [X86][AMX] Support AMX-FP8 (PR #113850)

2024-10-27 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/113850 From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 24 Oct 2024 21:56:48 +0800 Subject: [PATCH 1/2] Support AMX-FP8 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/67

[clang] [llvm] [X86][AMX] Support AMX-FP8 (PR #113850)

2024-10-27 Thread Feng Zou via cfe-commits
https://github.com/fzou1 created https://github.com/llvm/llvm-project/pull/113850 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368 From fd570cb8d41f5f94b61d515985245fc81aab633e Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 24 Oct 2024 21:56:48 +0800 Subject: [PATCH] Support AMX-FP8

[clang] [llvm] [X86] Support branch hint (PR #97721)

2024-07-09 Thread Feng Zou via cfe-commits
@@ -1815,12 +1822,12 @@ def : ProcModel<"pantherlake", AlderlakePModel, def : ProcModel<"clearwaterforest", AlderlakePModel, ProcessorFeatures.CWFFeatures, ProcessorFeatures.ADLTuning>; def : ProcModel<"graniterapids", SapphireRapidsModel, -Proce

[clang] [llvm] [X86] Support branch hint (PR #97721)

2024-07-04 Thread Feng Zou via cfe-commits
https://github.com/fzou1 updated https://github.com/llvm/llvm-project/pull/97721 From 3c75e22504416afae288723aff34120d88b100db Mon Sep 17 00:00:00 2001 From: Feng Zou Date: Thu, 4 Jul 2024 15:43:12 +0800 Subject: [PATCH 1/2] Support branch hint For more details about this feature, please refer

[clang] [llvm] [X86] Support branch hint (PR #97721)

2024-07-04 Thread Feng Zou via cfe-commits
@@ -749,6 +749,11 @@ def TuningUseGLMDivSqrtCosts : SubtargetFeature<"use-glm-div-sqrt-costs", "UseGLMDivSqrtCosts", "true", "Use Goldmont specific floating point div/sqrt costs">; +// Starting with Redwood Cove architecture, the branch has branch taken hint +// (i

[clang] [llvm] [X86] Support branch hint (PR #97721)

2024-07-04 Thread Feng Zou via cfe-commits
fzou1 wrote: Thanks. It's simpler. What's the metadata for "!prof !0" and "!prof !1"? https://github.com/llvm/llvm-project/pull/97721

[clang] [llvm] Support branch hint (PR #97721)

2024-07-04 Thread Feng Zou via cfe-commits
https://github.com/fzou1 created https://github.com/llvm/llvm-project/pull/97721 For more details about this feature, please refer to latest Intel 64 and IA-32 Architectures Optimization Reference Manual Volume 1: https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32