ssahasra wrote:
Note that the best way to see the effect of this PR is to view only the second
diff of the two in this PR. It shows how the missing vmcnt(0) shows up in the
new test introduced by the first commit.
https://github.com/llvm/llvm-project/pull/147258
___
https://github.com/ssahasra updated
https://github.com/llvm/llvm-project/pull/147258
From 95ffad8e0c22f261999f8a87abde8592c0596395 Mon Sep 17 00:00:00 2001
From: Sameer Sahasrabuddhe
Date: Tue, 17 Jun 2025 13:11:55 +0530
Subject: [PATCH 1/2] [AMDGCN] pre-checkin test for LDS DMA and release
o
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/147257
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
 ; GFX12-WGP-NEXT:    s_wait_kmcnt 0x0
 ; GFX12-WGP-NEXT:    s_wait_storecnt 0x0
 ; GFX12-WGP-NEXT:    global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT:    s_wait_loadcnt 0x3f
ssahasra wrote:
This is part of a stack:
- #147258
- #147257
- #147256
https://github.com/llvm/llvm-project/pull/147258
ssahasra wrote:
This is part of a stack:
- #147258
- #147257
- #147256
https://github.com/llvm/llvm-project/pull/147257
https://github.com/ssahasra created
https://github.com/llvm/llvm-project/pull/147258
Currently, the memory legalizer does not generate any wait on vmcnt at workgroup
scope. This is incorrect because direct loads to LDS are tracked using vmcnt and
they need to be released properly at workgroup sc
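The ordering concern in this description can be illustrated with a toy model. The sketch below is not LLVM or hardware code: ToyVmCounter, waitUntilAtMost and releaseIsSafe are hypothetical names, and the model assumes operations complete strictly in issue order. It only shows why a release that skips the wait can run while earlier operations are still in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model, not LLVM or hardware code: a counter of in-flight memory
// operations that complete strictly in issue order (an assumption).
struct ToyVmCounter {
  std::deque<int> InFlight;  // ids of issued but unfinished operations
  std::vector<int> Visible;  // ids whose results other threads may observe

  void issue(int Id) { InFlight.push_back(Id); }

  // Analogue of waiting on the counter: retire the oldest operations
  // until at most N remain in flight.
  void waitUntilAtMost(std::size_t N) {
    while (InFlight.size() > N) {
      Visible.push_back(InFlight.front());
      InFlight.pop_front();
    }
  }
};

// In this model a release is safe only once nothing is left in flight,
// which corresponds to waiting for the counter to reach zero.
bool releaseIsSafe(const ToyVmCounter &C) { return C.InFlight.empty(); }
```

In this model, issuing two operations and waiting only until one remains still leaves the release unsafe; waiting until zero remain makes it safe.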
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/136282
https://github.com/ssahasra created
https://github.com/llvm/llvm-project/pull/136282
This introduces the `-fconvergence-control` flag that emits convergence control
intrinsics which are then used as the `convergencectrl` operand bundle on
convergent calls.
This also redefines the `noconvergen
https://github.com/ssahasra commented:
The changes to UA look good to me. I can't comment much about the actual patch
itself.
https://github.com/llvm/llvm-project/pull/124298
@@ -342,6 +342,10 @@ template <typename ContextT> class GenericUniformityAnalysisImpl {
typename SyncDependenceAnalysisT::DivergenceDescriptor;
using BlockLabelMapT = typename SyncDependenceAnalysisT::BlockLabelMap;
+ // Use outside cycle with divergent exit
+ using UOCWDE =
-
@@ -188,6 +190,37 @@ void
DivergenceLoweringHelper::constrainAsLaneMask(Incoming &In) {
In.Reg = Copy.getReg(0);
}
+void replaceUsesOfRegInInstWith(Register Reg, MachineInstr *Inst,
+Register NewReg) {
+ for (MachineOperand &Op : Inst->opera
@@ -1210,6 +1240,13 @@ void
GenericUniformityAnalysisImpl::print(raw_ostream &OS) const {
}
}
+template
+iterator_range::UOCWDE *>
ssahasra wrote:
Just say ``auto`` as the return type here? Or if this needs to be exposed in an
outer header file, then nam
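As a rough illustration of the `auto` suggestion in this comment, the sketch below uses a deduced return type so that callers never have to spell a long nested type. Recorder, Entry and record are hypothetical names chosen for the example and are not taken from the patch; it assumes C++14 or later and that the member functions are defined where callers can see them.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical container, not part of the patch under review.
template <typename T> class Recorder {
  // A nested type that would be tedious to spell in a return type.
  using Entry = std::pair<const T *, int>;
  std::vector<Entry> Entries;

public:
  void record(const T *Item, int Tag) { Entries.emplace_back(Item, Tag); }

  // Deduced return types: the iterator type is never written out.
  auto begin() const { return Entries.begin(); }
  auto end() const { return Entries.end(); }
};
```

The trade-off is that a deduced return type needs the definition to be visible at the point of use, which is why naming the type can still be preferable when the declaration lives in a separate header.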
@@ -40,6 +40,10 @@ template <typename ContextT> class GenericUniformityInfo {
using CycleInfoT = GenericCycleInfo<ContextT>;
using CycleT = typename CycleInfoT::CycleT;
+ // Use outside cycle with divergent exit
+ using UOCWDE =
ssahasra wrote:
This declaration got repeated. One of
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/124298
@@ -395,6 +399,14 @@ template <typename ContextT> class GenericUniformityAnalysisImpl {
}
void print(raw_ostream &out) const;
+ SmallVector UsesOutsideCycleWithDivergentExit;
+ void recordUseOutsideCycleWithDivergentExit(const InstructionT *,
ssahasra wrote:
You're right
@@ -395,6 +399,14 @@ template <typename ContextT> class GenericUniformityAnalysisImpl {
}
void print(raw_ostream &out) const;
+ SmallVector UsesOutsideCycleWithDivergentExit;
+ void recordUseOutsideCycleWithDivergentExit(const InstructionT *,
ssahasra wrote:
Everywhere i
@@ -622,9 +622,9 @@ bool ItaniumParamParser::parseItaniumParam(StringRef& param,
if (isDigit(TC)) {
res.ArgType =
StringSwitch(eatLengthPrefixedName(param))
-.Case("ocl_image1darray", AMDGPULibFunc::IMG1DA)
-.Case("ocl_image1dbuffer", AMDGP
https://github.com/ssahasra approved this pull request.
https://github.com/llvm/llvm-project/pull/119832
https://github.com/ssahasra reopened
https://github.com/llvm/llvm-project/pull/101386
https://github.com/ssahasra closed
https://github.com/llvm/llvm-project/pull/101386
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/101386
ssahasra wrote:
> Note that I have not yet finished verifying all the lit tests. I might also
> have to add a few more tests, especially involving a mix of irreducible and
> reducible cycles that are siblings and/or nested inside each other in various
> combinations. Especially with some overl
ssahasra wrote:
> This needs a finer method that redirects only specific edges. Either that, or
> we let the pass destroy some cycles. But updating `CycleInfo` for these
> missing subcycles may be a fair amount of work too, so I would rather do it
> the right way.
This now depends on the newl
@@ -107,6 +107,12 @@ template <typename ContextT> class GenericCycle {
return is_contained(Entries, Block);
}
+ /// \brief Replace all entries with \p Block as single entry.
+ void setSingleEntry(BlockT *Block) {
+Entries.clear();
+Entries.push_back(Block);
ssaha
@@ -189,6 +195,21 @@ template <typename ContextT> class GenericCycle {
//@{
using const_entry_iterator =
typename SmallVectorImpl<BlockT *>::const_iterator;
+ const_entry_iterator entry_begin() const {
+return const_entry_iterator{Entries.begin()};
ssahasra wrote:
Fixed.
h
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/101386
https://github.com/ssahasra closed
https://github.com/llvm/llvm-project/pull/103014
https://github.com/ssahasra created
https://github.com/llvm/llvm-project/pull/103014
1. CycleInfo efficiently locates all cycles in a single pass, while the SCC is
repeated inside every natural loop.
2. CycleInfo provides a hierarchy of irreducible cycles, and the new
implementation transform
https://github.com/ssahasra edited
https://github.com/llvm/llvm-project/pull/103013
https://github.com/ssahasra created
https://github.com/llvm/llvm-project/pull/103013
CreateControlFlowHub is a method that redirects control flow edges from a set
of incoming blocks to a set of outgoing blocks through a new set of "guard"
blocks. This is now refactored into a separate file wit
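The edge redirection described here can be sketched on a toy graph. This is a minimal model under stated assumptions, not the LLVM implementation: blocks are plain integers, redirectThroughGuard is a hypothetical helper, and a single guard block stands in for the set of guard blocks that the description mentions.

```cpp
#include <cassert>
#include <set>
#include <utility>

using Block = int;
using Edge = std::pair<Block, Block>;
using EdgeSet = std::set<Edge>;

// Hypothetical helper, not LLVM API: every edge that goes from a block in
// Incoming to a block in Outgoing is rerouted through Guard. All other
// edges are kept unchanged.
EdgeSet redirectThroughGuard(const EdgeSet &Edges,
                             const std::set<Block> &Incoming,
                             const std::set<Block> &Outgoing, Block Guard) {
  EdgeSet Result;
  for (const Edge &E : Edges) {
    bool Redirect = Incoming.count(E.first) && Outgoing.count(E.second);
    if (!Redirect) {
      Result.insert(E);
      continue;
    }
    Result.insert({E.first, Guard});   // the source now enters the guard
    Result.insert({Guard, E.second});  // the guard forwards to the old target
  }
  return Result;
}
```

The sketch leaves out how the guard decides which outgoing block to branch to; a real transform has to carry that choice along each redirected edge.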
ssahasra wrote:
The apparent change here is to simply reverse the effect of #100952 on the lit
test. Would be good to have a test which shows what the improvement is.
Also, I think #100952 merely enables AAIndirectCallInfo, and feels like an
integral part of this change itself. I would lean to
Author: Sameer Sahasrabuddhe
Date: 2021-01-20T22:02:09+05:30
New Revision: c540ce9900ff99566b4951186e2f070b3b36cdbe
URL:
https://github.com/llvm/llvm-project/commit/c540ce9900ff99566b4951186e2f070b3b36cdbe
DIFF:
https://github.com/llvm/llvm-project/commit/c540ce9900ff99566b4951186e2f070b3b36cdb