https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/124929
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/124929
None
From 03ea8ad4f2a7b6589539dbc137c356455225fc15 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 29 Jan 2025 14:49:05 +
Subject: [PATCH] Fix typo "tranpose"
---
clang/lib/Headers/amxtf32transposeint
https://github.com/jayfoad approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/124616
@@ -699,7 +699,8 @@ def DS_PERMUTE_B32 : DS_1A1D_PERMUTE <"ds_permute_b32",
int_amdgcn_ds_permute>;
def DS_BPERMUTE_B32 : DS_1A1D_PERMUTE <"ds_bpermute_b32",
int_amdgcn_ds_bpermute>;
-def DS_BPERMUTE
@@ -1630,9 +1630,9 @@ class Record {
SmallVector Assertions;
SmallVector Dumps;
- // All superclasses in the inheritance forest in post-order (yes, it
+ // Direct superclasses, which are roots of the inheritance forest (yes, it
// must be a forest; diamond-shaped inhe
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/123072
From f12511a0fd30c47ea08e6c126cae558215758183 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 15 Jan 2025 13:34:41 +
Subject: [PATCH 01/14] Change API of getSuperClasses
---
llvm/include/llvm/TableGen/
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/123072
From f12511a0fd30c47ea08e6c126cae558215758183 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 15 Jan 2025 13:34:41 +
Subject: [PATCH 01/13] Change API of getSuperClasses
---
llvm/include/llvm/TableGen/
https://github.com/jayfoad edited
https://github.com/llvm/llvm-project/pull/123072
@@ -1718,15 +1719,30 @@ class Record {
ArrayRef getAssertions() const { return Assertions; }
ArrayRef getDumps() const { return Dumps; }
- ArrayRef> getSuperClasses() const {
-return SuperClasses;
+ /// Append all superclasses in post-order to \p Classes.
+ void get
jayfoad wrote:
> Waiting for clang/mlir changes
All done now.
https://github.com/llvm/llvm-project/pull/123072
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/123072
From f12511a0fd30c47ea08e6c126cae558215758183 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 15 Jan 2025 13:34:41 +
Subject: [PATCH 01/11] Change API of getSuperClasses
---
llvm/include/llvm/TableGen/
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/122880
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/122880
None
From d9a92edae5d021eed39acbdb22fa195dff78315d Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 14 Jan 2025 10:00:41 +
Subject: [PATCH] [llvm-project] Fix typos mutli and mutliple. NFC.
---
.../cla
https://github.com/jayfoad approved this pull request.
https://github.com/llvm/llvm-project/pull/122169
https://github.com/jayfoad approved this pull request.
https://github.com/llvm/llvm-project/pull/121736
@@ -78,7 +78,7 @@ static void BuildParentMap(MapTy& M, Stmt* S,
// The right thing to do is to give the OpaqueValueExpr its syntactic
// parent, then not reassign that when traversing the semantic expressions.
OpaqueValueExpr *OVE = cast(S);
-if (OVMode == OV_Tr
@@ -34,13 +34,13 @@ static void BuildParentMap(MapTy& M, Stmt* S,
case Stmt::PseudoObjectExprClass: {
PseudoObjectExpr *POE = cast(S);
-if (OVMode == OV_Opaque && M[POE->getSyntacticForm()])
+if (OVMode == OV_Opaque && M.contains(POE->getSyntacticForm()))
---
https://github.com/jayfoad commented:
No objections, just a couple of ideas for improvements. I have no idea if
`ParentMap` lookups are on any kind of hot path.
https://github.com/llvm/llvm-project/pull/121736
https://github.com/jayfoad edited
https://github.com/llvm/llvm-project/pull/121736
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/99016
@@ -838,12 +838,14 @@ static TargetTypeInfo getTargetTypeInfo(const
TargetExtType *Ty) {
return TargetTypeInfo(PointerType::get(C, 0), TargetExtType::CanBeGlobal);
jayfoad wrote:
I'm now using the newly added `amdgcn.named.barrier` for testing.
https://git
@@ -4285,6 +4291,12 @@ void Verifier::visitAllocaInst(AllocaInst &AI) {
SmallPtrSet Visited;
Check(AI.getAllocatedType()->isSized(&Visited),
"Cannot allocate unsized type", &AI);
+ // Check if it's a target extension type that disallows being used on the
+ // stac
@@ -228,6 +228,8 @@ class StructType : public Type {
SCDB_NotContainsScalableVector = 32,
SCDB_ContainsNonGlobalTargetExtType = 64,
SCDB_NotContainsNonGlobalTargetExtType = 128,
+SCDB_ContainsNonLocalTargetExtType = 64,
+SCDB_NotContainsNonLocalTargetExtType
jayfoad wrote:
Ping. This is ready again for review.
https://github.com/llvm/llvm-project/pull/99016
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/99016
From c2eda0aaeeaf9c8711cf64830df9bb3fee842f80 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 16 Jul 2024 11:29:05 +0100
Subject: [PATCH 1/5] [IR] Add TargetExtType::CanBeAlloca property
Add a property to al
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/99016
From c2eda0aaeeaf9c8711cf64830df9bb3fee842f80 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 16 Jul 2024 11:29:05 +0100
Subject: [PATCH 1/4] [IR] Add TargetExtType::CanBeAlloca property
Add a property to al
@@ -1617,6 +1617,7 @@ const EnumEntry ElfHeaderMipsFlags[] = {
ENUM_ENT(EF_AMDGPU_MACH_AMDGCN_GFX90A, "gfx90a"),
\
ENUM_ENT(EF_AMDGPU_MACH_AMDGCN_GFX90C, "gfx90c"),
\
ENUM_ENT(EF_AMDGPU_MACH_AMDGCN_GFX940, "gfx940"),
@@ -146,6 +146,11 @@ define amdgpu_kernel void @test_kernel() {
; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=6
-mcpu=gfx9-generic -filetype=obj -O0 -o %t.o %s
; RUN: llvm-objdump -D --arch-name=amdgcn -mllvm
--amdhsa-code-object-version=6 --mcpu=gfx9-gene
jayfoad wrote:
> Use TargetInfo when deciding is an address space is compatible
Typo? "Use TargetInfo when deciding *if* an address space is compatible"
https://github.com/llvm/llvm-project/pull/115777
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/114795
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/114795
None
From bcb149170d1eaf0a177deee63a9dc289dd55892b Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Mon, 4 Nov 2024 13:46:28 +
Subject: [PATCH] [llvm-project] Fix typo "propogate"
---
clang/test/Analysis/ma
jayfoad wrote:
> [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific
> info), start with `llvm.amdgcn.wavefrontsize`
Mismatched parentheses. (Also it's a bit longer than git likes.)
https://github.com/llvm/llvm-project/pull/114481
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/113691
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/108970
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/113559
jayfoad wrote:
> The community doesn't add new builtin types particularly often, so having
> four leftover bits isn't actually that close to the limit for us.
I guess "close" is subjective.
> Is there a problem with keeping this change downstream until we get to the
> limit in community?
No,
@@ -839,6 +839,14 @@ Expected
TargetExtType::checkParams(TargetExtType *TTy) {
"target extension type riscv.vector.tuple should have one "
"type parameter and one integer parameter");
+ // Opaque types in the AMDGPU name space.
+ if (TTy->Name == "amdgcn.nam
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/113691
It is simple to create the struct body up front, now that we have
transitioned to opaque pointers.
From 0fea81c2996a5476fec5681856191d55841e9f0f Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Fri, 25 Oct 2024
https://github.com/jayfoad approved this pull request.
Looks reasonable with the newline fix, but please wait a day in case other
reviewers have comments.
https://github.com/llvm/llvm-project/pull/113614
@@ -15,7 +15,15 @@
AMDGPU_TYPE(Name, Id, SingletonId, Width, Align)
#endif
+#ifndef AMDGPU_NAMED_BARRIER_TYPE
+#define AMDGPU_NAMED_BARRIER_TYPE(Name, Id, SingletonId, Width, Align, Scope) \
+ AMDGPU_TYPE(Name, Id, SingletonId, Width, Align)
+#endif
+
AMDGPU_OPAQUE_PTR_TYP
https://github.com/jayfoad edited
https://github.com/llvm/llvm-project/pull/113614
jayfoad wrote:
This is mostly so that we don't need to carry this change downstream, while we
have downstream patches that add a few builtin types. It seems inevitable that
it will need to be changed upstream pretty soon anyway.
https://github.com/llvm/llvm-project/pull/113559
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/109399
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/113559
BuiltinType::LastKind is currently 507 which is close to the current
limit of 511.
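The headroom mentioned in this description can be checked with a little arithmetic. The sketch below assumes the limit of 511 comes from storing the kind in a 9-bit field, which is an inference from 511 = 2^9 - 1 and is not stated in the description; the helper names are hypothetical.

```cpp
// Number of bits needed to represent maxValue (0 needs 0 bits here).
constexpr unsigned bitsNeeded(unsigned maxValue) {
  unsigned bits = 0;
  while (maxValue != 0) {
    ++bits;
    maxValue >>= 1;
  }
  return bits;
}

// Largest value representable in an unsigned field of the given width
// (valid for widths below the width of unsigned).
constexpr unsigned maxValueForBits(unsigned bits) {
  return (1u << bits) - 1u;
}

// How many more kinds fit before the limit is reached.
constexpr unsigned remainingKinds(unsigned lastKind, unsigned limit) {
  return limit - lastKind;
}

// Values taken from the description: LastKind is 507, the limit is 511.
static_assert(bitsNeeded(511) == 9);
static_assert(maxValueForBits(9) == 511);
static_assert(remainingKinds(507, 511) == 4);
```

Under that assumption, one extra bit of width would raise the limit from 511 to `maxValueForBits(10)`, which is 1023.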
From fe852c49f160d9b76e61f151bc857eb8493a47db Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Thu, 24 Oct 2024 13:41:31 +0100
S
@@ -603,26 +610,30 @@ Generic processor code objects are versioned. See
:ref:`amdgpu-generic-processor
- ``gfx1103``
work-item within this family.
- ``gfx1150``
https://github.com/jayfoad approved this pull request.
LGTM. The compiler currently treats it as identical to gfx1152, right?
https://github.com/llvm/llvm-project/pull/113138
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/112899
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/112899
None
From 3a3b67f30cde766adaede4cc53bec340fbe5d99f Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Fri, 18 Oct 2024 13:53:51 +0100
Subject: [PATCH] Fix typo "instrinsic"
---
clang/utils/TableGen/RISCVVEmitter.
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/109656
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/109656
This will be used in ASTContext::getTypeInfo which needs this
information for all builtin types, not just pointers.
From 0ef4ea17a711a1ee95080bc1635ae9aa824df596 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date:
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/109400
jayfoad wrote:
> Although, revisiting this now, I still don't understand why they decided to
> include ALL spill opcodes in the prologue, but not only the SGPR spills?
> Clearly, none of the VGPR reloads really belong to the prologue.
>
> At a first glance, changing the isSpill(opcode) to isSG
https://github.com/jayfoad updated
https://github.com/llvm/llvm-project/pull/109400
From ebffad800626acbdb06c74633c0950e24df755c8 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Fri, 20 Sep 2024 11:16:23 +0100
Subject: [PATCH 1/2] [clang-tools-extra] Use {} instead of std::nullopt to
initialize
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/109400
Follow up to #109133.
From ebffad800626acbdb06c74633c0950e24df755c8 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Fri, 20 Sep 2024 11:16:23 +0100
Subject: [PATCH] [clang-tools-extra] Use {} instead of std::nu
jayfoad wrote:
+1 to @efriedma-quic and @jdoerfert's comments. DataLayout should remain as
generic as possible. Trying to encode a concept of "_the_ flat address space"
in it seems way too specific to one optimization for one or two targets.
https://github.com/llvm/llvm-project/pull/108786
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/109004
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/109004
Tweak encodeTypeForFunctionPointerAuth to handle all AMDGPU builtin
types generically instead of just __amdgpu_buffer_rsrc_t which happens
to be the only one defined so far.
From c79c60d4488e7ddd1d8a2f0e141847
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/108968
@@ -4778,6 +4782,8 @@ bool Type::canHaveNullability(bool ResultIfUnknown) const
{
#include "clang/Basic/RISCVVTypes.def"
#define WASM_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
#include "clang/Basic/WebAssemblyReferenceTypes.def"
+#define AMDGPU_TYPE(Name, Id, Singleton
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/108970
None
From 9c2e1d8470b7b1a0647baece8fa41cf8ce7b4f5d Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 17 Sep 2024 13:28:11 +0100
Subject: [PATCH] [Clang] Add and use mangleVendorType helper. NFC.
---
clang/l
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/108968
Remove the MangledName field since these types just use the normal Name
for mangling purposes.
From 97fc6c239af3cadb61fe70cd07bb3b31c5da7a52 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 17 Sep 2024 12:5
@@ -196,8 +208,10 @@ define amdgpu_kernel void @add_i32_constant(ptr
addrspace(1) %out, ptr addrspace
; GFX11W32-NEXT:v_mbcnt_lo_u32_b32 v0, s1, 0
; GFX11W32-NEXT:; implicit-def: $vgpr1
; GFX11W32-NEXT:s_delay_alu instid0(VALU_DEP_1)
-; GFX11W32-NEXT:v_cmpx_eq_
jayfoad wrote:
> > Did you know that LLVM intentionally does not follow IWYU and favors
> > forward declarations:
> > https://llvm.org/docs/CodingStandards.html#include-as-little-as-possible
>
> Yes, but I actually do not see what part of the mentioned standard' section
> conflicts with the c
jayfoad wrote:
> Compiler messages on HIP SDK for Windows
Please rewrite this to say what the patch does or what problem it fixes.
https://github.com/llvm/llvm-project/pull/97668
@@ -14,13 +14,14 @@
#define LLVM_CODEGEN_MACHINEBRANCHPROBABILITYINFO_H
#include "llvm/CodeGen/MachineBasicBlock.h"
-#include "llvm/CodeGen/MachinePassManager.h"
#include "llvm/Pass.h"
#include "llvm/Support/BranchProbability.h"
namespace llvm {
-class MachineBranchProb
@@ -402,34 +413,30 @@ Value
*AMDGPUAtomicOptimizerImpl::buildReduction(IRBuilder<> &B,
// Reduce within each pair of rows (i.e. 32 lanes).
assert(ST->hasPermLaneX16());
- V = B.CreateBitCast(V, IntNTy);
jayfoad wrote:
Please submit an NFC cleanup patch
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/95637
jayfoad wrote:
I'll merge to fix the build.
https://github.com/llvm/llvm-project/pull/95637
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/95373
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/95373
None
From 6d326a96d2651f8836b29ff1e3edef022f41549e Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Thu, 13 Jun 2024 09:46:48 +0100
Subject: [PATCH] [llvm-project] Fix typo "seperate"
---
clang-tools-extra/clang
https://github.com/jayfoad approved this pull request.
Works for me.
https://github.com/llvm/llvm-project/pull/94639
@@ -785,6 +785,7 @@ enum : unsigned {
EF_AMDGPU_MACH_AMDGCN_GFX1200 = 0x048,
EF_AMDGPU_MACH_AMDGCN_RESERVED_0X49 = 0x049,
EF_AMDGPU_MACH_AMDGCN_GFX1151 = 0x04a,
+ EF_AMDGPU_MACH_AMDGCN_GFX1152 = 0x055,
jayfoad wrote:
This table
@@ -6,21 +6,21 @@
__attribute__((global))
void kernel(int *out) {
int i = 0;
- out[i++] = threadIdx.x; // CHECK: call noundef i32
@llvm.nvvm.read.ptx.sreg.tid.x()
- out[i++] = threadIdx.y; // CHECK: call noundef i32
@llvm.nvvm.read.ptx.sreg.tid.y()
- out[i++] = threadIdx
jayfoad wrote:
Is there really a good use case for this? Can you use regular stores to
addrspace(7) instead? @krzysz00
Also, do you really need a separate builtin for every legal type, or is there
some way they can be type-overloaded?
https://github.com/llvm/llvm-project/pull/94576
https://github.com/jayfoad approved this pull request.
LGTM.
Could also update `flang/cmake/modules/AddFlangOffloadRuntime.cmake` but I
don't really know if it's our responsibility to update Flang.
https://github.com/llvm/llvm-project/pull/94534
@@ -1534,6 +1534,12 @@ def FeatureISAVersion11_5_1 : FeatureSet<
FeatureVGPRSingleUseHintInsts,
Feature1_5xVGPRs])>;
+def FeatureISAVersion11_5_2 : FeatureSet<
jayfoad wrote:
I don't have a good answer to this except "it's what we normally do". Othe
jayfoad wrote:
There is a latent problem to do with convergence. If you add a new test case
like this:
```diff
diff --git a/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
b/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
index 238f6ab39e83..22995083293d 100644
--- a/llvm/test/CodeGen/AMDGPU/conv
https://github.com/jayfoad commented:
Does this need IR autoupgrade?
https://github.com/llvm/llvm-project/pull/89217
@@ -5496,6 +5496,9 @@ const char*
AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const {
NODE_NAME_CASE(LDS)
NODE_NAME_CASE(FPTRUNC_ROUND_UPWARD)
NODE_NAME_CASE(FPTRUNC_ROUND_DOWNWARD)
+ NODE_NAME_CASE(READLANE)
+ NODE_NAME_CASE(READFIRSTLANE)
+ NODE_NAME_CA
@@ -5496,6 +5496,9 @@ const char*
AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const {
NODE_NAME_CASE(LDS)
NODE_NAME_CASE(FPTRUNC_ROUND_UPWARD)
NODE_NAME_CASE(FPTRUNC_ROUND_DOWNWARD)
+ NODE_NAME_CASE(READLANE)
+ NODE_NAME_CASE(READFIRSTLANE)
---
https://github.com/jayfoad edited
https://github.com/llvm/llvm-project/pull/89217
jayfoad wrote:
> 1. What's the proper way to legalize f16 and bf16 for SDAG case without
> bitcasts ? (I would think "fp_extend -> LaneOp -> Fptrunc" is wrong)
Bitcast to i16, anyext to i32, laneop, trunc to i16, bitcast to original type.
Why wouldn't you use bitcasts?
https://github.com/llv
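The bitcast / anyext / laneop / trunc / bitcast sequence described in this reply can be sketched on a host CPU as follows. This is only an illustration of the data flow: `Half16`, `laneOp32`, and `laneOpHalf` are hypothetical names, the lane operation is replaced by an identity function, and the extended high bits are zeroed here even though an any-extend leaves them unspecified.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 16-bit float container; only the raw bit pattern matters.
struct Half16 {
  std::uint16_t bits;
};

// Hypothetical stand-in for a 32-bit lane operation. It returns its input
// unchanged so the sketch can run without a GPU.
std::uint32_t laneOp32(std::uint32_t value) { return value; }

// Route a 16-bit float through a 32-bit-only operation using bit
// reinterpretation rather than floating-point conversion.
Half16 laneOpHalf(Half16 input) {
  std::uint16_t asInt;
  std::memcpy(&asInt, &input, sizeof asInt);  // bitcast to i16
  std::uint32_t wide = asInt;                 // extend to i32 (high bits zeroed here)
  std::uint32_t moved = laneOp32(wide);       // lane op on the 32-bit value
  std::uint16_t narrow = static_cast<std::uint16_t>(moved);  // trunc to i16
  Half16 output;
  std::memcpy(&output, &narrow, sizeof narrow);  // bitcast back to the float type
  return output;
}
```

Because every step only moves bits, the payload comes back unchanged; a round trip through floating-point extension and truncation is not guaranteed to preserve every bit pattern (NaN payloads, for instance).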
https://github.com/jayfoad closed
https://github.com/llvm/llvm-project/pull/92232
https://github.com/jayfoad created
https://github.com/llvm/llvm-project/pull/92232
None
From a02c63497b0d60f55e1846f5a050820082fb5c86 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 15 May 2024 10:04:57 +0100
Subject: [PATCH] Fix typo "indicies"
---
clang/include/clang/AST/VTTBuilder.h
@@ -493,8 +493,8 @@ Value *AMDGPUAtomicOptimizerImpl::buildScan(IRBuilder<> &B,
if (!ST->isWave32()) {
// Combine lane 31 into lanes 32..63.
V = B.CreateBitCast(V, IntNTy);
- Value *const Lane31 = B.CreateIntrinsic(Intrinsic::amdgcn_readlane, {},
-
https://github.com/jayfoad edited
https://github.com/llvm/llvm-project/pull/89217
@@ -5386,6 +5386,153 @@ bool
AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
return true;
}
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+ MachineInstr &MI,
+
@@ -5386,6 +5386,130 @@ bool
AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
return true;
}
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+ MachineInstr &MI,
+
https://github.com/jayfoad commented:
LGTM overall.
> add f32 pattern to select read/writelane operations
Why would you need this? Don't you legalize f32 to i32?
https://github.com/llvm/llvm-project/pull/89217
jayfoad wrote:
AMDGPU changes are fine.
https://github.com/llvm/llvm-project/pull/90391
jayfoad wrote:
Previous attempts:
* https://reviews.llvm.org/D84639
* https://reviews.llvm.org/D86154
* https://reviews.llvm.org/D147732
* #87334
https://github.com/llvm/llvm-project/pull/89217
jayfoad wrote:
No further comments.
https://github.com/llvm/llvm-project/pull/79236
jayfoad wrote:
Can you add at least one test for a VMEM (flat or scratch or global or buffer
or image) atomic without return? That should use vscnt on GFX10.
Apart from that the SIInsertWaitcnts.cpp and tests look good to me. I have not
reviewed the clang parts but it looks like @Pierre-vh app
@@ -0,0 +1,1406 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
UTC_ARGS: --version 4
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci
@@ -2594,12 +2594,10 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const
SIMemOpInfo &MOI,
MOI.getOrdering() == AtomicOrdering::SequentiallyConsistent ||
MOI.getFailureOrdering() == AtomicOrdering::Acquire ||
MOI.getFailureOrdering() == AtomicOrde
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
}
#endif
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+ AMDGPU::Waitcnt Wait;
+ if (ST->hasExtendedWaitCounts())
+Wait = AMDGPU::Waitcnt(0, 0, 0,