[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/125585

Backport d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87

Requested by: @tstellar

>From adf607aa5622a6e3a83a4016bc87f2c8321c47c7 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Mon, 3 Feb 2025 13:13:11 -0800
Subject: [PATCH] workflows/release-tasks: Re-use release-binaries-all workflow
 (#125378)

This way we don't need to duplicate the list of supported targets in the
release-tasks workflow.

(cherry picked from commit d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87)
---
 .github/workflows/release-tasks.yml | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 780dd0ff6325c9..52076ea1821b0b 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -89,20 +89,10 @@ jobs:
     needs:
       - validate-tag
       - release-create
-    strategy:
-      fail-fast: false
-      matrix:
-        runs-on:
-          - ubuntu-22.04
-          - windows-2022
-          - macos-13
-          - macos-14
-
-    uses: ./.github/workflows/release-binaries.yml
+    uses: ./.github/workflows/release-binaries-all.yml
     with:
       release-version: ${{ needs.validate-tag.outputs.release-version }}
       upload: true
-      runs-on: ${{ matrix.runs-on }}
     # Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use.
     secrets:
       RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-github-workflow

Author: None (llvmbot)


Changes

Backport d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87

Requested by: @tstellar

---
Full diff: https://github.com/llvm/llvm-project/pull/125585.diff


1 Files Affected:

- (modified) .github/workflows/release-tasks.yml (+1-11) 


``diff
diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 780dd0ff6325c9..52076ea1821b0b 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -89,20 +89,10 @@ jobs:
     needs:
       - validate-tag
       - release-create
-    strategy:
-      fail-fast: false
-      matrix:
-        runs-on:
-          - ubuntu-22.04
-          - windows-2022
-          - macos-13
-          - macos-14
-
-    uses: ./.github/workflows/release-binaries.yml
+    uses: ./.github/workflows/release-binaries-all.yml
     with:
       release-version: ${{ needs.validate-tag.outputs.release-version }}
       upload: true
-      runs-on: ${{ matrix.runs-on }}
     # Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use.
     secrets:
       RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }}

``




https://github.com/llvm/llvm-project/pull/125585


[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/125585


[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread Quentin Colombet via llvm-branch-commits

https://github.com/qcolombet approved this pull request.


https://github.com/llvm/llvm-project/pull/125535


[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)

2025-02-03 Thread Quentin Colombet via llvm-branch-commits


@@ -1002,17 +1003,15 @@ bool PeepholeOptimizer::optimizeCondBranch(MachineInstr 
&MI) {
 /// share the same register file as \p Reg and \p SubReg. The client should
 /// then be capable to rewrite all intermediate PHIs to get the next source.
 /// \return False if no alternative sources are available. True otherwise.
-bool PeepholeOptimizer::findNextSource(RegSubRegPair RegSubReg,

qcolombet wrote:

Could you update the comment with the documentation for the additional parameters?

https://github.com/llvm/llvm-project/pull/125533


[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/125637).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125637** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125637)
* **#125636** (https://app.graphite.dev/github/pr/llvm/llvm-project/125636)
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev).
Learn more about stacking: https://stacking.dev/


https://github.com/llvm/llvm-project/pull/125637


[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-selectiondag

Author: Matt Arsenault (arsenm)


Changes

Fix introducing stack usage if a bitcast source operand is an illegal
integer type cast to a legal vector type. This should cover more
situations, but this is the first one I noticed.
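
The arithmetic behind this fix can be sketched outside of LLVM. The block below is only a model under stated assumptions, not the patch itself: `lanesWithPadding` and `bitcastPromotedI48ToV3I16` are hypothetical names, the lane order is assumed to be little-endian, and the example uses a 48-bit value promoted to 64 bits. It shows the idea of reinterpreting the promoted integer as a wider vector (with padding lanes) and keeping only the leading lanes, instead of going through a stack store and load.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the "known scalar factor" check: how many
// EltBits-wide lanes fit exactly into a PromotedBits-wide integer?
std::optional<unsigned> lanesWithPadding(unsigned PromotedBits,
                                         unsigned EltBits) {
  if (EltBits == 0 || PromotedBits % EltBits != 0)
    return std::nullopt;
  return PromotedBits / EltBits;
}

// Model of "bitcast i48 to <3 x i16>" where the i48 operand has been
// promoted to a 64-bit integer. Assumes little-endian lane order.
std::array<uint16_t, 3> bitcastPromotedI48ToV3I16(uint64_t Promoted) {
  // Step 1: view the promoted integer as a wider 4 x i16 vector.
  std::array<uint16_t, 4> Wide{};
  for (unsigned I = 0; I < 4; ++I)
    Wide[I] = static_cast<uint16_t>(Promoted >> (16 * I));
  // Step 2: keep the subvector starting at index 0; the last lane is padding.
  return {Wide[0], Wide[1], Wide[2]};
}
```

If an i160 operand is promoted to i256 (an assumption about the target, not something stated above), the same calculation gives 256 / 32 = 8 lanes, of which a `<5 x i32>` result keeps the first five.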

---

Patch is 156.41 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/125637.diff


12 Files Affected:

- (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (+34-1) 
- (modified) llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll (-160) 
- (modified) 
llvm/test/CodeGen/AMDGPU/buffer-fat-pointers-contents-legalization.ll (-9) 
- (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+54-274) 
- (modified) llvm/test/CodeGen/AMDGPU/kernel-args.ll (+122-611) 
- (modified) llvm/test/CodeGen/AMDGPU/load-constant-i16.ll (+17-23) 
- (modified) llvm/test/CodeGen/AMDGPU/load-constant-i8.ll (+195-1105) 
- (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+34-45) 
- (modified) llvm/test/CodeGen/AMDGPU/load-global-i8.ll (+16-32) 
- (modified) llvm/test/CodeGen/AMDGPU/min.ll (+75-231) 
- (modified) llvm/test/CodeGen/AMDGPU/shl.ll (+13-46) 
- (modified) llvm/test/CodeGen/AMDGPU/sra.ll (+14-53) 


``diff
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index 95fb8b406e51bf..eb0c5faa7fe1eb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -2202,9 +2202,42 @@ SDValue DAGTypeLegalizer::PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N) {
 }
 
 SDValue DAGTypeLegalizer::PromoteIntOp_BITCAST(SDNode *N) {
+  EVT OutVT = N->getValueType(0);
+  SDValue InOp = N->getOperand(0);
+  EVT InVT = InOp.getValueType();
+  EVT NInVT = TLI.getTypeToTransformTo(*DAG.getContext(), InVT);
+  SDLoc dl(N);
+
+  switch (getTypeAction(InVT)) {
+  case TargetLowering::TypePromoteInteger: {
+    if (OutVT.isVector()) {
+      EVT EltVT = OutVT.getVectorElementType();
+      TypeSize EltSize = EltVT.getSizeInBits();
+      TypeSize NInSize = NInVT.getSizeInBits();
+
+      if (NInSize.hasKnownScalarFactor(EltSize)) {
+        unsigned NumEltsWithPadding = NInSize.getKnownScalarFactor(EltSize);
+        EVT WideVecVT =
+            EVT::getVectorVT(*DAG.getContext(), EltVT, NumEltsWithPadding);
+
+        if (isTypeLegal(WideVecVT)) {
+          SDValue Promoted = GetPromotedInteger(InOp);
+          SDValue Cast = DAG.getNode(ISD::BITCAST, dl, WideVecVT, Promoted);
+          return DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, OutVT, Cast,
+                             DAG.getVectorIdxConstant(0, dl));
+        }
+      }
+    }
+
+    break;
+  }
+  default:
+    break;
+  }
+
   // This should only occur in unusual situations like bitcasting to an
   // x86_fp80, so just turn it into a store+load
-  return CreateStackStoreLoad(N->getOperand(0), N->getValueType(0));
+  return CreateStackStoreLoad(InOp, OutVT);
 }
 
 SDValue DAGTypeLegalizer::PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo) {
diff --git a/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll 
b/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll
index ab89bb293f6e6e..2c6aabec763306 100644
--- a/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll
+++ b/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll
@@ -80,15 +80,6 @@ define <5 x i32> @bitcast_i160_to_v5i32(i160 %int) {
 ; GFX9-LABEL: bitcast_i160_to_v5i32:
 ; GFX9:   ; %bb.0:
 ; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX9-NEXT:s_mov_b32 s4, s33
-; GFX9-NEXT:s_add_i32 s33, s32, 0x7c0
-; GFX9-NEXT:s_and_b32 s33, s33, 0xf800
-; GFX9-NEXT:s_mov_b32 s5, s34
-; GFX9-NEXT:s_mov_b32 s34, s32
-; GFX9-NEXT:s_addk_i32 s32, 0x1000
-; GFX9-NEXT:s_mov_b32 s32, s34
-; GFX9-NEXT:s_mov_b32 s34, s5
-; GFX9-NEXT:s_mov_b32 s33, s4
 ; GFX9-NEXT:s_setpc_b64 s[30:31]
 ;
 ; GFX12-LABEL: bitcast_i160_to_v5i32:
@@ -98,23 +89,6 @@ define <5 x i32> @bitcast_i160_to_v5i32(i160 %int) {
 ; GFX12-NEXT:s_wait_samplecnt 0x0
 ; GFX12-NEXT:s_wait_bvhcnt 0x0
 ; GFX12-NEXT:s_wait_kmcnt 0x0
-; GFX12-NEXT:s_mov_b32 s0, s33
-; GFX12-NEXT:s_add_co_i32 s33, s32, 31
-; GFX12-NEXT:s_mov_b32 s1, s34
-; GFX12-NEXT:s_wait_alu 0xfffe
-; GFX12-NEXT:s_and_not1_b32 s33, s33, 31
-; GFX12-NEXT:s_clause 0x1
-; GFX12-NEXT:scratch_store_b64 off, v[2:3], s33 offset:8
-; GFX12-NEXT:scratch_store_b64 off, v[0:1], s33
-; GFX12-NEXT:scratch_load_b128 v[0:3], off, s33
-; GFX12-NEXT:s_mov_b32 s34, s32
-; GFX12-NEXT:s_add_co_i32 s32, s32, 64
-; GFX12-NEXT:s_wait_alu 0xfffe
-; GFX12-NEXT:s_mov_b32 s32, s34
-; GFX12-NEXT:s_mov_b32 s34, s1
-; GFX12-NEXT:s_mov_b32 s33, s0
-; GFX12-NEXT:s_wait_loadcnt 0x0
-; GFX12-NEXT:s_wait_alu 0xfffe
 ; GFX12-NEXT:s_setpc_b64 s[30:31]
   %bitcast = bitcast i160 %int to <5 x i32>
   ret <5 x i32> %bitcast
@@ -124,15 +98,6 @@ define <6 x i3

[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/125637


[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)

2025-02-03 Thread Alexandros Lamprineas via llvm-branch-commits

https://github.com/labrinea created 
https://github.com/llvm/llvm-project/pull/125525

None

>From 1e9a503b62b690e4615979e1363d17dd3adffca4 Mon Sep 17 00:00:00 2001
From: Alexandros Lamprineas 
Date: Mon, 3 Feb 2025 15:57:41 +
Subject: [PATCH] [FMV][AArch64] Release notes for LLVM20.

---
 clang/docs/ReleaseNotes.rst |  7 +++
 llvm/docs/ReleaseNotes.md   | 14 ++
 2 files changed, 21 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 53534d821b2c9a..b23963c8e611a1 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -654,6 +654,10 @@ Attribute Changes in Clang
 
 - The ``target_version`` attribute is now only supported for AArch64 and 
RISC-V architectures.
 
+- When targeting AArch64, a function declaration annotated with 
``target_version("default")``
+  now generates a mangled default version of the function, whereas before at 
least one more
+  version other than the default was required to trigger Function Multi 
Versioning.
+
 - Clang now permits the usage of the placement new operator in 
``[[msvc::constexpr]]``
   context outside of the std namespace. (#GH74924)
 
@@ -1188,6 +1192,9 @@ Arm and AArch64 Support
 
   * FUJITSU-MONAKA (fujitsu-monaka)
 
+- Runtime detection of depended-on Function Multi Versioning features has been 
added
+  in accordance with the Arm C Language Extensions (ACLE).
+
 Android Support
 ^^^
 
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index e0acb8f48c5b94..db9a681ebe2bc5 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -130,6 +130,10 @@ Changes to building LLVM
 Changes to TableGen
 ---
 
+* The ARMTargetDefEmitter now binds Funtion Multi Versioning features to the
+  corresponding AArch64 Architecture Extensions such that their dependencies
+  can be autogenerated using TableGen.
+
 Changes to Interprocedural Optimizations
 
 
@@ -431,9 +435,19 @@ Changes to the C API
 Changes to the CodeGen infrastructure
 -
 
+* GlobalOpt can now statically resolve calls to multi-versioned functions when 
targeting AArch64.
+  These calls would otherwise be routed through an IFunc resolver function. 
This optimization
+  can be applied when the caller is either a multi-versioned function itself, 
or it is compiled
+  with a sufficiently high set of architecture features (including the 
`target` attribute, and
+  command line options).
+
 Changes to the Metadata Info
 -
 
+* Multi-versioned functions targeting AArch64 are annotated with new metadata 
named `fmv-features`.
+  The metadata string value consists of a comma-separated list of Function 
Multi Versioning feature
+  names as defined in the Arm C Language Extensions (ACLE).
+
 Changes to the Debug Info
 -
 



[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)

2025-02-03 Thread Alexandros Lamprineas via llvm-branch-commits

https://github.com/labrinea milestoned 
https://github.com/llvm/llvm-project/pull/125525


[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Alexandros Lamprineas (labrinea)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/125525.diff


2 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+7) 
- (modified) llvm/docs/ReleaseNotes.md (+14) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 53534d821b2c9a9..b23963c8e611a1a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -654,6 +654,10 @@ Attribute Changes in Clang
 
 - The ``target_version`` attribute is now only supported for AArch64 and 
RISC-V architectures.
 
+- When targeting AArch64, a function declaration annotated with 
``target_version("default")``
+  now generates a mangled default version of the function, whereas before at 
least one more
+  version other than the default was required to trigger Function Multi 
Versioning.
+
 - Clang now permits the usage of the placement new operator in 
``[[msvc::constexpr]]``
   context outside of the std namespace. (#GH74924)
 
@@ -1188,6 +1192,9 @@ Arm and AArch64 Support
 
   * FUJITSU-MONAKA (fujitsu-monaka)
 
+- Runtime detection of depended-on Function Multi Versioning features has been 
added
+  in accordance with the Arm C Language Extensions (ACLE).
+
 Android Support
 ^^^
 
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index e0acb8f48c5b940..db9a681ebe2bc57 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -130,6 +130,10 @@ Changes to building LLVM
 Changes to TableGen
 ---
 
+* The ARMTargetDefEmitter now binds Funtion Multi Versioning features to the
+  corresponding AArch64 Architecture Extensions such that their dependencies
+  can be autogenerated using TableGen.
+
 Changes to Interprocedural Optimizations
 
 
@@ -431,9 +435,19 @@ Changes to the C API
 Changes to the CodeGen infrastructure
 -
 
+* GlobalOpt can now statically resolve calls to multi-versioned functions when 
targeting AArch64.
+  These calls would otherwise be routed through an IFunc resolver function. 
This optimization
+  can be applied when the caller is either a multi-versioned function itself, 
or it is compiled
+  with a sufficiently high set of architecture features (including the 
`target` attribute, and
+  command line options).
+
 Changes to the Metadata Info
 -
 
+* Multi-versioned functions targeting AArch64 are annotated with new metadata 
named `fmv-features`.
+  The metadata string value consists of a comma-separated list of Function 
Multi Versioning feature
+  names as defined in the Arm C Language Extensions (ACLE).
+
 Changes to the Debug Info
 -
 

``
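
The release note in the diff above describes the `fmv-features` metadata value as a comma-separated list of feature names. A consumer of that string might split it as in the rough sketch below. This is not LLVM code: `splitFmvFeatures` is a hypothetical helper, the input strings are illustrative, and skipping empty entries is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split a comma-separated feature list such as
// "sve2,dotprod" into individual names. Empty entries are skipped.
std::vector<std::string> splitFmvFeatures(std::string_view Value) {
  std::vector<std::string> Features;
  std::size_t Start = 0;
  while (Start <= Value.size()) {
    std::size_t Comma = Value.find(',', Start);
    if (Comma == std::string_view::npos)
      Comma = Value.size();
    if (Comma > Start)
      Features.emplace_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
  return Features;
}
```

An empty string yields an empty list, and a single name with no comma yields a one-element list.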




https://github.com/llvm/llvm-project/pull/125525


[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/125535

>From e7b88d2c349059c01ddf463bf014a0c66d7c3b7e Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Thu, 23 Jan 2025 14:39:10 +0700
Subject: [PATCH] AMDGPU: Use default shouldRewriteCopySrc

This was ultimately working around bugs in subregister handling
in peephole-opt. In the common case, it would give up on folding
anything into a subregister extract copy.
---
 llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp |  24 -
 llvm/lib/Target/AMDGPU/SIRegisterInfo.h   |   5 -
 llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll |  44 +-
 llvm/test/CodeGen/AMDGPU/ctpop64.ll   |  36 +-
 llvm/test/CodeGen/AMDGPU/idot2.ll | 182 +++
 llvm/test/CodeGen/AMDGPU/load-global-i32.ll   |  85 ++--
 .../peephole-opt-fold-reg-sequence-subreg.mir |   8 +-
 .../AMDGPU/peephole-opt-regseq-removal.mir|   4 +-
 .../CodeGen/AMDGPU/spill-scavenge-offset.ll   | 476 +-
 9 files changed, 418 insertions(+), 446 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 6fc57dec6a8264..71c720ed09b5fb 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned 
OpType) const {
  OpType <= AMDGPU::OPERAND_SRC_LAST;
 }
 
-bool SIRegisterInfo::shouldRewriteCopySrc(
-  const TargetRegisterClass *DefRC,
-  unsigned DefSubReg,
-  const TargetRegisterClass *SrcRC,
-  unsigned SrcSubReg) const {
-  // We want to prefer the smallest register class possible, so we don't want 
to
-  // stop and rewrite on anything that looks like a subregister
-  // extract. Operations mostly don't care about the super register class, so 
we
-  // only want to stop on the most basic of copies between the same register
-  // class.
-  //
-  // e.g. if we have something like
-  // %0 = ...
-  // %1 = ...
-  // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2
-  // %3 = COPY %2, sub0
-  //
-  // We want to look through the COPY to find:
-  //  => %3 = COPY %0
-
-  // Plain copy.
-  return getCommonSubClass(DefRC, SrcRC) != nullptr;
-}
-
 bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const {
   // TODO: 64-bit operands have extending behavior from 32-bit literal.
   return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST &&
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h 
b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
index 8e481e3ac23043..a434efb70d0525 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
@@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
const TargetRegisterClass *SubRC,
unsigned SubIdx) const;
 
-  bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
-unsigned DefSubReg,
-const TargetRegisterClass *SrcRC,
-unsigned SrcSubReg) const override;
-
   /// \returns True if operands defined with this operand type can accept
   /// a literal constant (i.e. any 32-bit immediate).
   bool opCanUseLiteralConstant(unsigned OpType) const;
diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll 
b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
index c6c0b9cf8f027f..cc2f775ff22bc5 100644
--- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
+++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
@@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr 
addrspace(1) %out0, ptr addrspa
 define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr 
addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr 
addrspace(1) %in) nounwind {
 ; SI-LABEL: test_copy_v4i8_x4:
 ; SI:   ; %bb.0:
-; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11
-; SI-NEXT:s_mov_b32 s3, 0xf000
-; SI-NEXT:s_mov_b32 s10, 0
-; SI-NEXT:s_mov_b32 s11, s3
+; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11
+; SI-NEXT:s_mov_b32 s11, 0xf000
+; SI-NEXT:s_mov_b32 s2, 0
+; SI-NEXT:s_mov_b32 s3, s11
 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0
 ; SI-NEXT:v_mov_b32_e32 v1, 0
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64
-; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9
-; SI-NEXT:s_mov_b32 s2, -1
-; SI-NEXT:s_mov_b32 s14, s2
-; SI-NEXT:s_mov_b32 s15, s3
-; SI-NEXT:s_mov_b32 s18, s2
+; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64
+; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9
+; SI-NEXT:s_mov_b32 s10, -1
+; SI-NEXT:s_mov_b32 s14, s10
+; SI-NEXT:s_mov_b32 s15, s11
+; SI-NEXT:s_mov_b32 s18, s10
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:s_mov_b32 s0, s4
-; SI-NEXT:s_mov_b32 s1, s5
-; SI-NEXT:s_mov_b32 s19, s3
-; SI-NEXT:s_mov_b32 s22, s2
-; SI-NEXT:s_mov_b32 s23, s3
-; SI-NEXT:s_mov_b32 s12

[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/123147

>From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Wed, 15 Jan 2025 17:30:00 +
Subject: [PATCH 01/16] adding metadata extraction

---
 .../llvm/Analysis/DXILMetadataAnalysis.h  |  3 +
 llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++
 .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 +
 3 files changed, 116 insertions(+)

diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h 
b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
index cb535ac14f1c613..f420244ba111a45 100644
--- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
+++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
@@ -11,9 +11,11 @@
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/MC/DXContainerRootSignature.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/VersionTuple.h"
 #include "llvm/TargetParser/Triple.h"
+#include 
 
 namespace llvm {
 
@@ -37,6 +39,7 @@ struct ModuleMetadataInfo {
   Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment};
   VersionTuple ValidatorVersion{};
   SmallVector EntryPropertyVec{};
+  std::optional RootSignatureDesc;
   void print(raw_ostream &OS) const;
 };
 
diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp 
b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
index a7f666a3f8b48f2..388e3853008eaec 100644
--- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
+++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
@@ -15,12 +15,91 @@
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/InitializePasses.h"
+#include "llvm/MC/DXContainerRootSignature.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
+#include 
 
 #define DEBUG_TYPE "dxil-metadata-analysis"
 
 using namespace llvm;
 using namespace dxil;
+using namespace llvm::mcdxbc;
+
+static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) {
+
+  assert(RootFlagNode->getNumOperands() == 2 &&
+ "Invalid format for RootFlag Element");
+  auto *Flag = mdconst::extract<ConstantInt>(RootFlagNode->getOperand(1));
+  auto Value = (RootSignatureFlags)Flag->getZExtValue();
+
+  if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None)
+    return true;
+
+  Desc->Flags = Value;
+  return false;
+}
+
+static bool parseRootSignatureElement(MDNode *Element,
+                                      RootSignatureDesc *Desc) {
+  MDString *ElementText = cast<MDString>(Element->getOperand(0));
+
+  assert(ElementText != nullptr && "First property of element is not ");
+
+  RootSignatureElementKind ElementKind =
+      StringSwitch<RootSignatureElementKind>(ElementText->getString())
+          .Case("RootFlags", RootSignatureElementKind::RootFlags)
+          .Case("RootConstants", RootSignatureElementKind::RootConstants)
+          .Case("RootCBV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootSRV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootUAV", RootSignatureElementKind::RootDescriptor)
+          .Case("Sampler", RootSignatureElementKind::RootDescriptor)
+          .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable)
+          .Case("StaticSampler", RootSignatureElementKind::StaticSampler)
+          .Default(RootSignatureElementKind::None);
+
+  switch (ElementKind) {
+
+  case RootSignatureElementKind::RootFlags: {
+    return parseRootFlags(Element, Desc);
+    break;
+  }
+
+  case RootSignatureElementKind::RootConstants:
+  case RootSignatureElementKind::RootDescriptor:
+  case RootSignatureElementKind::DescriptorTable:
+  case RootSignatureElementKind::StaticSampler:
+  case RootSignatureElementKind::None:
+    llvm_unreachable("Not Implemented yet");
+    break;
+  }
+
+  return true;
+}
+
+bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version,
+                        NamedMDNode *Root) {
+  Desc->Version = Version;
+  bool HasError = false;
+
+  for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) {
+    // This should be an if, for error handling
+    MDNode *Node = cast<MDNode>(Root->getOperand(Sid));
+
+    // Not sure what use this for...
+    Metadata *Func = Node->getOperand(0).get();
+
+    // This should be an if, for error handling
+    MDNode *Elements = cast<MDNode>(Node->getOperand(1).get());
+
+    for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) {
+      MDNode *Element = cast<MDNode>(Elements->getOperand(Eid));
+
+      HasError = HasError || parseRootSignatureElement(Element, Desc);
+    }
+  }
+  return HasError;
+}
 
 static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   ModuleMetadataInfo MMDAI;
@@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   MMDAI.DXILVersion = TT.getDXILVersion();
   MMDAI.ShaderModelVersion = TT.getOSVersion();
   MMDAI.ShaderProfile = TT.getEnvironment();
+
   NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver");
   if (ValidatorVerNode) {

[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/125588


[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tstellar (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/125588


[llvm-branch-commits] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/125585




[llvm-branch-commits] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot closed 
https://github.com/llvm/llvm-project/pull/125585


[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/123147

>From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Wed, 15 Jan 2025 17:30:00 +
Subject: [PATCH 01/17] adding metadata extraction

---
 .../llvm/Analysis/DXILMetadataAnalysis.h  |  3 +
 llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++
 .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 +
 3 files changed, 116 insertions(+)

diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h 
b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
index cb535ac14f1c61..f420244ba111a4 100644
--- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
+++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
@@ -11,9 +11,11 @@
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/MC/DXContainerRootSignature.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/VersionTuple.h"
 #include "llvm/TargetParser/Triple.h"
+#include <optional>
 
 namespace llvm {
 
@@ -37,6 +39,7 @@ struct ModuleMetadataInfo {
   Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment};
   VersionTuple ValidatorVersion{};
   SmallVector EntryPropertyVec{};
+  std::optional RootSignatureDesc;
   void print(raw_ostream &OS) const;
 };
 
diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp 
b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
index a7f666a3f8b48f..388e3853008eae 100644
--- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
+++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
@@ -15,12 +15,91 @@
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/InitializePasses.h"
+#include "llvm/MC/DXContainerRootSignature.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
+#include 
 
 #define DEBUG_TYPE "dxil-metadata-analysis"
 
 using namespace llvm;
 using namespace dxil;
+using namespace llvm::mcdxbc;
+
+static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) {
+
+  assert(RootFlagNode->getNumOperands() == 2 &&
+         "Invalid format for RootFlag Element");
+  auto *Flag = mdconst::extract<ConstantInt>(RootFlagNode->getOperand(1));
+  auto Value = (RootSignatureFlags)Flag->getZExtValue();
+
+  if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None)
+    return true;
+
+  Desc->Flags = Value;
+  return false;
+}
+
+static bool parseRootSignatureElement(MDNode *Element,
+                                      RootSignatureDesc *Desc) {
+  MDString *ElementText = cast<MDString>(Element->getOperand(0));
+
+  assert(ElementText != nullptr && "First property of element is not ");
+
+  RootSignatureElementKind ElementKind =
+      StringSwitch<RootSignatureElementKind>(ElementText->getString())
+          .Case("RootFlags", RootSignatureElementKind::RootFlags)
+          .Case("RootConstants", RootSignatureElementKind::RootConstants)
+          .Case("RootCBV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootSRV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootUAV", RootSignatureElementKind::RootDescriptor)
+          .Case("Sampler", RootSignatureElementKind::RootDescriptor)
+          .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable)
+          .Case("StaticSampler", RootSignatureElementKind::StaticSampler)
+          .Default(RootSignatureElementKind::None);
+
+  switch (ElementKind) {
+
+  case RootSignatureElementKind::RootFlags: {
+    return parseRootFlags(Element, Desc);
+    break;
+  }
+
+  case RootSignatureElementKind::RootConstants:
+  case RootSignatureElementKind::RootDescriptor:
+  case RootSignatureElementKind::DescriptorTable:
+  case RootSignatureElementKind::StaticSampler:
+  case RootSignatureElementKind::None:
+    llvm_unreachable("Not Implemented yet");
+    break;
+  }
+
+  return true;
+}
+
+bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version,
+                        NamedMDNode *Root) {
+  Desc->Version = Version;
+  bool HasError = false;
+
+  for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) {
+    // This should be an if, for error handling
+    MDNode *Node = cast<MDNode>(Root->getOperand(Sid));
+
+    // Not sure what use this for...
+    Metadata *Func = Node->getOperand(0).get();
+
+    // This should be an if, for error handling
+    MDNode *Elements = cast<MDNode>(Node->getOperand(1).get());
+
+    for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) {
+      MDNode *Element = cast<MDNode>(Elements->getOperand(Eid));
+
+      HasError = HasError || parseRootSignatureElement(Element, Desc);
+    }
+  }
+  return HasError;
+}
 
 static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   ModuleMetadataInfo MMDAI;
@@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   MMDAI.DXILVersion = TT.getDXILVersion();
   MMDAI.ShaderModelVersion = TT.getOSVersion();
   MMDAI.ShaderProfile = TT.getEnvironment();
+
   NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver");
   if (ValidatorVerNode) {
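
The flag check in `parseRootFlags` above rejects any value with a bit set outside the valid set. A minimal standalone sketch of that masking idea follows, assuming a made-up mask: `validateRootFlags` and `ExampleValidMask` are hypothetical names for illustration and are not part of the patch, which uses `RootSignatureFlags::ValidFlags`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of the patch: returns the flags when every set
// bit is covered by ValidMask, and std::nullopt otherwise.
constexpr std::optional<uint32_t> validateRootFlags(uint32_t Value,
                                                    uint32_t ValidMask) {
  // Any bit that survives masking with ~ValidMask lies outside the valid set.
  if ((Value & ~ValidMask) != 0)
    return std::nullopt;
  return Value;
}

// Illustrative mask only (an assumption); the patch takes the real set of
// valid bits from RootSignatureFlags::ValidFlags.
constexpr uint32_t ExampleValidMask = 0x3F;

static_assert(validateRootFlags(0x01, ExampleValidMask).has_value());
static_assert(!validateRootFlags(0x40, ExampleValidMask).has_value());
```

Returning an optional instead of a `bool` error flag is only a choice made for this sketch; the patch itself reports failure by returning `true`.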
   

[llvm-branch-commits] [clang] release/20.x: [AArch64] Enable vscale_range with +sme (#124466) (PR #125386)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/125386

>From d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b Mon Sep 17 00:00:00 2001
From: David Green 
Date: Fri, 31 Jan 2025 07:57:43 +
Subject: [PATCH] [AArch64] Enable vscale_range with +sme (#124466)

If we have +sme but not +sve, we would not set vscale_range on
functions. It should be valid to apply it with the same range with just
+sme, which can help mitigate some performance regressions in cases such
as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d).

(cherry picked from commit 9f1c825fb62319b94ac9604f733afd59e9eb461b)
---
 clang/include/clang/Basic/TargetInfo.h  |  3 ++-
 clang/lib/AST/ASTContext.cpp|  3 ++-
 clang/lib/AST/ItaniumMangle.cpp |  2 +-
 clang/lib/Basic/Targets/AArch64.cpp |  5 +++--
 clang/lib/Basic/Targets/AArch64.h   |  3 ++-
 clang/lib/Basic/Targets/RISCV.cpp   |  5 +++--
 clang/lib/Basic/Targets/RISCV.h |  3 ++-
 clang/lib/CodeGen/CodeGenFunction.cpp   | 17 +
 clang/lib/CodeGen/Targets/RISCV.cpp |  4 ++--
 clang/lib/Sema/SemaType.cpp |  3 ++-
 .../sme-intrinsics/aarch64-sme-attrs.cpp|  4 ++--
 11 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/clang/include/clang/Basic/TargetInfo.h 
b/clang/include/clang/Basic/TargetInfo.h
index 43c09cf1f973e3c..d762144478b489d 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1023,7 +1023,8 @@ class TargetInfo : public TransferrableTargetInfo,
 
   /// Returns target-specific min and max values VScale_Range.
   virtual std::optional>
-  getVScaleRange(const LangOptions &LangOpts) const {
+  getVScaleRange(const LangOptions &LangOpts,
+ bool IsArmStreamingFunction) const {
 return std::nullopt;
   }
   /// The __builtin_clz* and __builtin_ctz* built-in
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index cd1bcb3b9a063d8..e58091ce95f6258 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -10363,7 +10363,8 @@ bool ASTContext::areLaxCompatibleSveTypes(QualType 
FirstType,
 /// getRVVTypeSize - Return RVV vector register size.
 static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) {
   assert(Ty->isRVVVLSBuiltinType() && "Invalid RVV Type");
-  auto VScale = Context.getTargetInfo().getVScaleRange(Context.getLangOpts());
+  auto VScale =
+  Context.getTargetInfo().getVScaleRange(Context.getLangOpts(), false);
   if (!VScale)
 return 0;
 
diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 49089c0ea3c8ac1..f84ccefd34cacbe 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4198,7 +4198,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const 
VectorType *T) {
 
   // Apend the LMUL suffix.
   auto VScale = getASTContext().getTargetInfo().getVScaleRange(
-  getASTContext().getLangOpts());
+  getASTContext().getLangOpts(), false);
   unsigned VLen = VScale->first * llvm::RISCV::RVVBitsPerBlock;
 
   if (T->getVectorKind() == VectorKind::RVVFixedLengthData) {
diff --git a/clang/lib/Basic/Targets/AArch64.cpp 
b/clang/lib/Basic/Targets/AArch64.cpp
index 0b899137bbb5c74..57c9849ef2a7287 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -703,12 +703,13 @@ ArrayRef 
AArch64TargetInfo::getTargetBuiltins() const {
 }
 
 std::optional>
-AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts) const {
+AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts,
+  bool IsArmStreamingFunction) const {
   if (LangOpts.VScaleMin || LangOpts.VScaleMax)
 return std::pair(
 LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax);
 
-  if (hasFeature("sve"))
+  if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme")))
 return std::pair(1, 16);
 
   return std::nullopt;
diff --git a/clang/lib/Basic/Targets/AArch64.h 
b/clang/lib/Basic/Targets/AArch64.h
index 600940f5e4e23c1..b75d2a9dc8ecadc 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -184,7 +184,8 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public 
TargetInfo {
   ArrayRef getTargetBuiltins() const override;
 
   std::optional>
-  getVScaleRange(const LangOptions &LangOpts) const override;
+  getVScaleRange(const LangOptions &LangOpts,
+ bool IsArmStreamingFunction) const override;
   bool doesFeatureAffectCodeGen(StringRef Name) const override;
   bool validateCpuSupports(StringRef FeatureStr) const override;
   bool hasFeature(StringRef Feature) const override;
diff --git a/clang/lib/Basic/Targets/RISCV.cpp 
b/clang/lib/Basic/Targets/RISCV.cpp
index 8167d7603b0e143..61b8ae9d098abc0 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/cl

[llvm-branch-commits] [clang] d185bd9 - [AArch64] Enable vscale_range with +sme (#124466)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

Author: David Green
Date: 2025-02-03T17:32:53-08:00
New Revision: d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b

URL: 
https://github.com/llvm/llvm-project/commit/d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b
DIFF: 
https://github.com/llvm/llvm-project/commit/d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b.diff

LOG: [AArch64] Enable vscale_range with +sme (#124466)

If we have +sme but not +sve, we would not set vscale_range on
functions. It should be valid to apply it with the same range with just
+sme, which can help mitigate some performance regressions in cases such
as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d).

(cherry picked from commit 9f1c825fb62319b94ac9604f733afd59e9eb461b)
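
The decision made by the updated `AArch64TargetInfo::getVScaleRange` can be sketched without any Clang types. This is a rough model only: `VScaleInputs` and `getVScaleRangeSketch` are hypothetical names standing in for `LangOptions` and the target's `hasFeature` queries, and the use of `unsigned` for the pair is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical stand-in for the LangOptions fields and feature queries that
// the real function consults.
struct VScaleInputs {
  unsigned VScaleMin = 0; // 0 means "not set on the command line"
  unsigned VScaleMax = 0;
  bool HasSVE = false;
  bool HasSME = false;
};

using VScaleRange = std::pair<unsigned, unsigned>;

// Mirrors the order of checks in the quoted hunk: explicit bounds win, then
// +sve, then +sme when the function is a streaming function.
inline std::optional<VScaleRange>
getVScaleRangeSketch(const VScaleInputs &In, bool IsArmStreamingFunction) {
  if (In.VScaleMin || In.VScaleMax)
    return VScaleRange(In.VScaleMin ? In.VScaleMin : 1, In.VScaleMax);

  if (In.HasSVE || (IsArmStreamingFunction && In.HasSME))
    return VScaleRange(1, 16);

  return std::nullopt;
}
```

With only `HasSME` set, the sketch yields a range for a streaming function and nothing for a non-streaming one, which matches the new condition in the quoted diff.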

Added: 


Modified: 
clang/include/clang/Basic/TargetInfo.h
clang/lib/AST/ASTContext.cpp
clang/lib/AST/ItaniumMangle.cpp
clang/lib/Basic/Targets/AArch64.cpp
clang/lib/Basic/Targets/AArch64.h
clang/lib/Basic/Targets/RISCV.cpp
clang/lib/Basic/Targets/RISCV.h
clang/lib/CodeGen/CodeGenFunction.cpp
clang/lib/CodeGen/Targets/RISCV.cpp
clang/lib/Sema/SemaType.cpp
clang/test/CodeGen/AArch64/sme-intrinsics/aarch64-sme-attrs.cpp

Removed: 




diff  --git a/clang/include/clang/Basic/TargetInfo.h 
b/clang/include/clang/Basic/TargetInfo.h
index 43c09cf1f973e3..d762144478b489 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1023,7 +1023,8 @@ class TargetInfo : public TransferrableTargetInfo,
 
   /// Returns target-specific min and max values VScale_Range.
   virtual std::optional>
-  getVScaleRange(const LangOptions &LangOpts) const {
+  getVScaleRange(const LangOptions &LangOpts,
+ bool IsArmStreamingFunction) const {
 return std::nullopt;
   }
   /// The __builtin_clz* and __builtin_ctz* built-in

diff  --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index cd1bcb3b9a063d..e58091ce95f625 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -10363,7 +10363,8 @@ bool ASTContext::areLaxCompatibleSveTypes(QualType 
FirstType,
 /// getRVVTypeSize - Return RVV vector register size.
 static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) {
   assert(Ty->isRVVVLSBuiltinType() && "Invalid RVV Type");
-  auto VScale = Context.getTargetInfo().getVScaleRange(Context.getLangOpts());
+  auto VScale =
+  Context.getTargetInfo().getVScaleRange(Context.getLangOpts(), false);
   if (!VScale)
 return 0;
 

diff  --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 49089c0ea3c8ac..f84ccefd34cacb 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4198,7 +4198,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const 
VectorType *T) {
 
   // Apend the LMUL suffix.
   auto VScale = getASTContext().getTargetInfo().getVScaleRange(
-  getASTContext().getLangOpts());
+  getASTContext().getLangOpts(), false);
   unsigned VLen = VScale->first * llvm::RISCV::RVVBitsPerBlock;
 
   if (T->getVectorKind() == VectorKind::RVVFixedLengthData) {

diff  --git a/clang/lib/Basic/Targets/AArch64.cpp 
b/clang/lib/Basic/Targets/AArch64.cpp
index 0b899137bbb5c7..57c9849ef2a728 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -703,12 +703,13 @@ ArrayRef 
AArch64TargetInfo::getTargetBuiltins() const {
 }
 
 std::optional>
-AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts) const {
+AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts,
+  bool IsArmStreamingFunction) const {
   if (LangOpts.VScaleMin || LangOpts.VScaleMax)
 return std::pair(
 LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax);
 
-  if (hasFeature("sve"))
+  if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme")))
 return std::pair(1, 16);
 
   return std::nullopt;

diff  --git a/clang/lib/Basic/Targets/AArch64.h 
b/clang/lib/Basic/Targets/AArch64.h
index 600940f5e4e23c..b75d2a9dc8ecad 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -184,7 +184,8 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public 
TargetInfo {
   ArrayRef getTargetBuiltins() const override;
 
   std::optional>
-  getVScaleRange(const LangOptions &LangOpts) const override;
+  getVScaleRange(const LangOptions &LangOpts,
+ bool IsArmStreamingFunction) const override;
   bool doesFeatureAffectCodeGen(StringRef Name) const override;
   bool validateCpuSupports(StringRef FeatureStr) const override;
   bool hasFeature(StringRef Feature) const override;

diff  --git a/clang/lib/Basic/Targets/RISCV.cpp 
b/clang/lib/Basic/Targets/RISCV.cpp
index 8167d7603b0e14..61b8ae9d098abc 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -222,7 +222,7 @@ void RISCVTargetInfo::g

[llvm-branch-commits] [clang] release/20.x: [AArch64] Enable vscale_range with +sme (#124466) (PR #125386)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/125386


[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 approved this pull request.


https://github.com/llvm/llvm-project/pull/125588


[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 approved this pull request.


https://github.com/llvm/llvm-project/pull/125585


[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:

@boomanaiden154 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/125585


[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/123147

>From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Wed, 15 Jan 2025 17:30:00 +
Subject: [PATCH 01/15] adding metadata extraction

---
 .../llvm/Analysis/DXILMetadataAnalysis.h  |  3 +
 llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++
 .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 +
 3 files changed, 116 insertions(+)

diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h 
b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
index cb535ac14f1c61..f420244ba111a4 100644
--- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
+++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h
@@ -11,9 +11,11 @@
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/MC/DXContainerRootSignature.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/VersionTuple.h"
 #include "llvm/TargetParser/Triple.h"
+#include <optional>
 
 namespace llvm {
 
@@ -37,6 +39,7 @@ struct ModuleMetadataInfo {
   Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment};
   VersionTuple ValidatorVersion{};
   SmallVector EntryPropertyVec{};
+  std::optional RootSignatureDesc;
   void print(raw_ostream &OS) const;
 };
 
diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp 
b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
index a7f666a3f8b48f..388e3853008eae 100644
--- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
+++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp
@@ -15,12 +15,91 @@
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/InitializePasses.h"
+#include "llvm/MC/DXContainerRootSignature.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
+#include 
 
 #define DEBUG_TYPE "dxil-metadata-analysis"
 
 using namespace llvm;
 using namespace dxil;
+using namespace llvm::mcdxbc;
+
+static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) {
+
+  assert(RootFlagNode->getNumOperands() == 2 &&
+         "Invalid format for RootFlag Element");
+  auto *Flag = mdconst::extract<ConstantInt>(RootFlagNode->getOperand(1));
+  auto Value = (RootSignatureFlags)Flag->getZExtValue();
+
+  if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None)
+    return true;
+
+  Desc->Flags = Value;
+  return false;
+}
+
+static bool parseRootSignatureElement(MDNode *Element,
+                                      RootSignatureDesc *Desc) {
+  MDString *ElementText = cast<MDString>(Element->getOperand(0));
+
+  assert(ElementText != nullptr && "First property of element is not ");
+
+  RootSignatureElementKind ElementKind =
+      StringSwitch<RootSignatureElementKind>(ElementText->getString())
+          .Case("RootFlags", RootSignatureElementKind::RootFlags)
+          .Case("RootConstants", RootSignatureElementKind::RootConstants)
+          .Case("RootCBV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootSRV", RootSignatureElementKind::RootDescriptor)
+          .Case("RootUAV", RootSignatureElementKind::RootDescriptor)
+          .Case("Sampler", RootSignatureElementKind::RootDescriptor)
+          .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable)
+          .Case("StaticSampler", RootSignatureElementKind::StaticSampler)
+          .Default(RootSignatureElementKind::None);
+
+  switch (ElementKind) {
+
+  case RootSignatureElementKind::RootFlags: {
+    return parseRootFlags(Element, Desc);
+    break;
+  }
+
+  case RootSignatureElementKind::RootConstants:
+  case RootSignatureElementKind::RootDescriptor:
+  case RootSignatureElementKind::DescriptorTable:
+  case RootSignatureElementKind::StaticSampler:
+  case RootSignatureElementKind::None:
+    llvm_unreachable("Not Implemented yet");
+    break;
+  }
+
+  return true;
+}
+
+bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version,
+                        NamedMDNode *Root) {
+  Desc->Version = Version;
+  bool HasError = false;
+
+  for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) {
+    // This should be an if, for error handling
+    MDNode *Node = cast<MDNode>(Root->getOperand(Sid));
+
+    // Not sure what use this for...
+    Metadata *Func = Node->getOperand(0).get();
+
+    // This should be an if, for error handling
+    MDNode *Elements = cast<MDNode>(Node->getOperand(1).get());
+
+    for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) {
+      MDNode *Element = cast<MDNode>(Elements->getOperand(Eid));
+
+      HasError = HasError || parseRootSignatureElement(Element, Desc);
+    }
+  }
+  return HasError;
+}
 
 static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   ModuleMetadataInfo MMDAI;
@@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) {
   MMDAI.DXILVersion = TT.getDXILVersion();
   MMDAI.ShaderModelVersion = TT.getOSVersion();
   MMDAI.ShaderProfile = TT.getEnvironment();
+
   NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver");
   if (ValidatorVerNode) {
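
The string-to-kind dispatch in `parseRootSignatureElement` above can be illustrated without LLVM's `StringSwitch`. The sketch below uses plain `std::string_view` comparisons in its place; `ElementKind` and `classifyElement` are hypothetical names, and only the string-to-kind mapping is taken from the quoted patch.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical mirror of the element kinds named in the patch.
enum class ElementKind {
  None,
  RootFlags,
  RootConstants,
  RootDescriptor,
  DescriptorTable,
  StaticSampler,
};

// Maps the first operand's text to a kind. Unknown names fall through to
// None, like the .Default(...) arm of the StringSwitch in the patch.
constexpr ElementKind classifyElement(std::string_view Name) {
  if (Name == "RootFlags")
    return ElementKind::RootFlags;
  if (Name == "RootConstants")
    return ElementKind::RootConstants;
  // Four spellings share the RootDescriptor kind in the quoted code.
  if (Name == "RootCBV" || Name == "RootSRV" || Name == "RootUAV" ||
      Name == "Sampler")
    return ElementKind::RootDescriptor;
  if (Name == "DescriptorTable")
    return ElementKind::DescriptorTable;
  if (Name == "StaticSampler")
    return ElementKind::StaticSampler;
  return ElementKind::None;
}

static_assert(classifyElement("RootFlags") == ElementKind::RootFlags);
static_assert(classifyElement("unknown") == ElementKind::None);
```

In this revision of the patch only the `RootFlags` kind is actually parsed; every other kind, including `None`, reaches the `llvm_unreachable` arm.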
   

[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/125588

(cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b)

>From 34bae71660d86455c5a51ad00fec49129847bc1d Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Mon, 3 Feb 2025 13:20:37 -0800
Subject: [PATCH] workflows/premerge: Cancel in progress jobs when a PR is
 merged (#125329)

(cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b)
---
 .github/workflows/premerge.yaml | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/premerge.yaml b/.github/workflows/premerge.yaml
index c22b35e122b9fc..956760feaa3b52 100644
--- a/.github/workflows/premerge.yaml
+++ b/.github/workflows/premerge.yaml
@@ -5,6 +5,17 @@ permissions:
 
 on:
   pull_request:
+    types:
+      - opened
+      - synchronize
+      - reopened
+      # When a PR is closed, we still start this workflow, but then skip
+      # all the jobs, which makes it effectively a no-op.  The reason to
+      # do this is that it allows us to take advantage of concurrency groups
+      # to cancel in progress CI jobs whenever the PR is closed.
+      - closed
+    paths:
+      - .github/workflows/premerge.yaml
   push:
     branches:
       - 'main'
@@ -12,7 +23,9 @@ on:
 
 jobs:
   premerge-checks-linux:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-linux-runners
     concurrency:
       group: ${{ github.workflow }}-linux-${{ github.event.pull_request.number || github.sha }}
@@ -71,7 +84,9 @@ jobs:
           ./.ci/monolithic-linux.sh "$(echo ${linux_projects} | tr ' ' ';')" "$(echo ${linux_check_targets})" "$(echo ${linux_runtimes} | tr ' ' ';')" "$(echo ${linux_runtime_check_targets})"
 
   premerge-checks-windows:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-windows-runners
     concurrency:
       group: ${{ github.workflow }}-windows-${{ github.event.pull_request.number || github.sha }}
@@ -139,7 +154,8 @@ jobs:
     if: >-
       github.repository_owner == 'llvm' &&
       (startswith(github.ref_name, 'release/') ||
-       startswith(github.base_ref, 'release/'))
+       startswith(github.base_ref, 'release/')) &&
+      (github.event_name != 'pull_request' || github.event.action != 'closed')
     steps:
       - name: Checkout LLVM
         uses: actions/checkout@v4



[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar milestoned 
https://github.com/llvm/llvm-project/pull/125588


[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-github-workflow

Author: Tom Stellard (tstellar)


Changes

(cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b)

---
Full diff: https://github.com/llvm/llvm-project/pull/125588.diff


1 Files Affected:

- (modified) .github/workflows/premerge.yaml (+19-3) 


```diff
diff --git a/.github/workflows/premerge.yaml b/.github/workflows/premerge.yaml
index c22b35e122b9fc..956760feaa3b52 100644
--- a/.github/workflows/premerge.yaml
+++ b/.github/workflows/premerge.yaml
@@ -5,6 +5,17 @@ permissions:
 
 on:
   pull_request:
+    types:
+      - opened
+      - synchronize
+      - reopened
+      # When a PR is closed, we still start this workflow, but then skip
+      # all the jobs, which makes it effectively a no-op.  The reason to
+      # do this is that it allows us to take advantage of concurrency groups
+      # to cancel in progress CI jobs whenever the PR is closed.
+      - closed
+    paths:
+      - .github/workflows/premerge.yaml
   push:
     branches:
       - 'main'
@@ -12,7 +23,9 @@ on:
 
 jobs:
   premerge-checks-linux:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-linux-runners
     concurrency:
       group: ${{ github.workflow }}-linux-${{ github.event.pull_request.number || github.sha }}
@@ -71,7 +84,9 @@ jobs:
           ./.ci/monolithic-linux.sh "$(echo ${linux_projects} | tr ' ' ';')" "$(echo ${linux_check_targets})" "$(echo ${linux_runtimes} | tr ' ' ';')" "$(echo ${linux_runtime_check_targets})"
 
   premerge-checks-windows:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-windows-runners
     concurrency:
       group: ${{ github.workflow }}-windows-${{ github.event.pull_request.number || github.sha }}
@@ -139,7 +154,8 @@ jobs:
     if: >-
       github.repository_owner == 'llvm' &&
       (startswith(github.ref_name, 'release/') ||
-       startswith(github.base_ref, 'release/'))
+       startswith(github.base_ref, 'release/')) &&
+      (github.event_name != 'pull_request' || github.event.action != 'closed')
     steps:
       - name: Checkout LLVM
         uses: actions/checkout@v4

```




https://github.com/llvm/llvm-project/pull/125588
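
The clause added to each job's `if:` in this diff reads more easily as a plain boolean function. The sketch below models only that clause, assuming exact string comparison; `shouldRunJob` is a hypothetical name, and the other terms of the real conditions (repository owner, release branch checks, the leading `false &&`) are deliberately left out.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical model of the added clause:
//   github.event_name != 'pull_request' || github.event.action != 'closed'
// The job is skipped only when a pull_request event carries the 'closed'
// action; every other event, including pushes, passes the check.
constexpr bool shouldRunJob(std::string_view EventName,
                            std::string_view Action) {
  return EventName != "pull_request" || Action != "closed";
}

static_assert(shouldRunJob("push", ""));
static_assert(shouldRunJob("pull_request", "synchronize"));
static_assert(!shouldRunJob("pull_request", "closed"));
```

As the comment in the diff explains, the workflow is still started for a closed PR so that it joins the PR's concurrency group; the clause then turns every job in that run into a no-op.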


[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits


@@ -10,13 +10,13 @@ Header:
   PartOffsets: [ 60 ]
 Parts:
   - Name:RTS0
-Size:8
+Size:4

joaosaffran wrote:

Wrong rebase, fixed it

https://github.com/llvm/llvm-project/pull/123147


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread via llvm-branch-commits


@@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm 
API documentation." OF
 option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF)
 option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON)
 option (LLVM_ENABLE_BINDINGS "Build bindings." ON)
+option (LLVM_BUILD_TELEMETRY "Build the telemtry library. This does not enable 
telemetry." ON)

cmtice wrote:

Typo: "telemetry"

https://github.com/llvm/llvm-project/pull/12


[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits


@@ -0,0 +1,158 @@
+//===- DXILRootSignature.cpp - DXIL Root Signature helper objects ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file This file contains helper objects and APIs for working with DXIL
+///   Root Signatures.
+///
+//===--===//
+#include "DXILRootSignature.h"
+#include "DirectX.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Module.h"
+#include 
+
+using namespace llvm;
+using namespace llvm::dxil;
+
+static bool reportError(Twine Message) {
+  report_fatal_error(Message, false);
+  return true;
+}
+
+static bool parseRootFlags(ModuleRootSignature *MRS, MDNode *RootFlagNode) {
+
+  if (RootFlagNode->getNumOperands() != 2)
+return reportError("Invalid format for RootFlag Element");
+
+  auto *Flag = mdconst::extract<ConstantInt>(RootFlagNode->getOperand(1));
+  uint32_t Value = Flag->getZExtValue();
+
+  // Root Element validation, as specified:
+  // 
https://github.com/llvm/wg-hlsl/blob/main/proposals/0002-root-signature-in-clang.md#validations-during-dxil-generation
+  if ((Value & ~0x8fff) != 0)
+return reportError("Invalid flag value for RootFlag");
+
+  MRS->Flags = Value;
+  return false;
+}
+
+static bool parseRootSignatureElement(ModuleRootSignature *MRS,
+  MDNode *Element) {
+  MDString *ElementText = cast<MDString>(Element->getOperand(0));
+  if (ElementText == nullptr)
+return reportError("Invalid format for Root Element");
+
+  RootSignatureElementKind ElementKind =
+  StringSwitch<RootSignatureElementKind>(ElementText->getString())
+  .Case("RootFlags", RootSignatureElementKind::RootFlags)
+  .Case("RootConstants", RootSignatureElementKind::RootConstants)
+  .Case("RootCBV", RootSignatureElementKind::RootDescriptor)
+  .Case("RootSRV", RootSignatureElementKind::RootDescriptor)
+  .Case("RootUAV", RootSignatureElementKind::RootDescriptor)
+  .Case("Sampler", RootSignatureElementKind::RootDescriptor)
+  .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable)
+  .Case("StaticSampler", RootSignatureElementKind::StaticSampler)
+  .Default(RootSignatureElementKind::None);
+
+  switch (ElementKind) {
+
+  case RootSignatureElementKind::RootFlags: {
+return parseRootFlags(MRS, Element);
+break;
+  }
+
+  case RootSignatureElementKind::RootConstants:
+  case RootSignatureElementKind::RootDescriptor:
+  case RootSignatureElementKind::DescriptorTable:
+  case RootSignatureElementKind::StaticSampler:
+  case RootSignatureElementKind::None:
+return reportError("Invalid Root Element: " + ElementText->getString());
+break;
+  }
+
+  return true;
+}
+
+bool ModuleRootSignature::parse(NamedMDNode *Root) {
+  bool HasError = false;
+
+  /** Root Signature are specified as following in the metadata:
+
+  !dx.rootsignatures = !{!2} ; list of function/root signature pairs
+  !2 = !{ ptr @main, !3 } ; function, root signature
+  !3 = !{ !4, !5, !6, !7 } ; list of root signature elements
+
+  So for each MDNode inside dx.rootsignatures NamedMDNode
+  (the Root parameter of this function), the parsing process needs
+  to loop through each of it's operand and process the pairs function
+  signature pair.
+   */
+
+  for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) {
+MDNode *Node = dyn_cast<MDNode>(Root->getOperand(Sid));
+
+if (Node == nullptr || Node->getNumOperands() != 2)
+  return reportError("Invalid format for Root Signature Definition. Pairs "
+ "of function, root signature expected.");
+
+// Get the Root Signature Description from the function signature pair.
+MDNode *RS = dyn_cast<MDNode>(Node->getOperand(1).get());
+
+if (RS == nullptr)
+  return reportError("Missing Root Signature Metadata node.");
+
+// Loop through the Root Elements of the root signature.
+for (unsigned int Eid = 0; Eid < RS->getNumOperands(); Eid++) {
+
+  MDNode *Element = dyn_cast<MDNode>(RS->getOperand(Eid));
+  if (Element == nullptr)
+return reportError("Missing Root Element Metadata Node.");
+
+  HasError = HasError || parseRootSignatureElement(this, Element);
+}
+  }
+  return HasError;
+}
+
+ModuleRootSignature ModuleRootSignature::analyzeModule(Module &M) {
+  ModuleRootSignature MRS;
+
+  NamedMDNode *RootSignatureNode = M.getNamedMetadata("dx.rootsignatures");
+  if (RootSignatureNode) {
+if (MRS.parse(RootSignatureNode))
+  llvm_unreachable("Invalid Root Signature Metadata.");

joaosaffran wrote:

We can do that, but that would ignore the return value from

[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/125624


[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/125624

Backport fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e

Requested by: @brad0

>From 1b6946b60080e057d5848cea36ce801ddf2a43f6 Mon Sep 17 00:00:00 2001
From: Brad Smith 
Date: Mon, 3 Feb 2025 22:43:43 -0500
Subject: [PATCH] [benchmark] Get number of CPUs with sysconf() on Linux
 (#125603)

(cherry picked from commit c24774dc4f4402c3ad150363321cc972ed2669e7)
(cherry picked from commit fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e)
---
 third-party/benchmark/src/sysinfo.cc | 53 ++--
 1 file changed, 3 insertions(+), 50 deletions(-)

diff --git a/third-party/benchmark/src/sysinfo.cc 
b/third-party/benchmark/src/sysinfo.cc
index 2bed1663af2e95..8283a081ee80b4 100644
--- a/third-party/benchmark/src/sysinfo.cc
+++ b/third-party/benchmark/src/sysinfo.cc
@@ -495,14 +495,14 @@ int GetNumCPUsImpl() {
   return sysinfo.dwNumberOfProcessors;  // number of logical
 // processors in the current
 // group
-#elif defined(BENCHMARK_OS_SOLARIS)
+#elif defined(__linux__) || defined(BENCHMARK_OS_SOLARIS)
   // Returns -1 in case of a failure.
-  long num_cpu = sysconf(_SC_NPROCESSORS_ONLN);
+  int num_cpu = static_cast<int>(sysconf(_SC_NPROCESSORS_ONLN));
   if (num_cpu < 0) {
 PrintErrorAndDie("sysconf(_SC_NPROCESSORS_ONLN) failed with error: ",
  strerror(errno));
   }
-  return (int)num_cpu;
+  return num_cpu;
 #elif defined(BENCHMARK_OS_QNX)
   return static_cast<int>(_syspage_ptr->num_cpu);
 #elif defined(BENCHMARK_OS_QURT)
@@ -511,53 +511,6 @@ int GetNumCPUsImpl() {
 hardware_threads.max_hthreads = 1;
   }
   return hardware_threads.max_hthreads;
-#else
-  int num_cpus = 0;
-  int max_id = -1;
-  std::ifstream f("/proc/cpuinfo");
-  if (!f.is_open()) {
-PrintErrorAndDie("Failed to open /proc/cpuinfo");
-  }
-#if defined(__alpha__)
-  const std::string Key = "cpus detected";
-#else
-  const std::string Key = "processor";
-#endif
-  std::string ln;
-  while (std::getline(f, ln)) {
-if (ln.empty()) continue;
-std::size_t split_idx = ln.find(':');
-std::string value;
-#if defined(__s390__)
-// s390 has another format in /proc/cpuinfo
-// it needs to be parsed differently
-if (split_idx != std::string::npos)
-  value = ln.substr(Key.size() + 1, split_idx - Key.size() - 1);
-#else
-if (split_idx != std::string::npos) value = ln.substr(split_idx + 1);
-#endif
-if (ln.size() >= Key.size() && ln.compare(0, Key.size(), Key) == 0) {
-  num_cpus++;
-  if (!value.empty()) {
-const int cur_id = benchmark::stoi(value);
-max_id = std::max(cur_id, max_id);
-  }
-}
-  }
-  if (f.bad()) {
-PrintErrorAndDie("Failure reading /proc/cpuinfo");
-  }
-  if (!f.eof()) {
-PrintErrorAndDie("Failed to read to end of /proc/cpuinfo");
-  }
-  f.close();
-
-  if ((max_id + 1) != num_cpus) {
-fprintf(stderr,
-"CPU ID assignments in /proc/cpuinfo seem messed up."
-" This is usually caused by a bad BIOS.\n");
-  }
-  return num_cpus;
 #endif
   BENCHMARK_UNREACHABLE();
 }
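
For readers following the change above, here is a minimal standalone sketch of the same idea: ask `sysconf(_SC_NPROCESSORS_ONLN)` for the number of online processors instead of parsing `/proc/cpuinfo`. The function name `OnlineCpuCount` and the exception-based error handling are assumptions made for this illustration; the patch itself calls `PrintErrorAndDie`. `_SC_NPROCESSORS_ONLN` is a widely available extension (Linux, Solaris, macOS) rather than a strict POSIX requirement.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical helper, not part of the benchmark library.
// Returns the number of processors currently online.
int OnlineCpuCount() {
  errno = 0;
  // sysconf returns long; -1 signals failure.
  const long n = sysconf(_SC_NPROCESSORS_ONLN);
  if (n < 0) {
    throw std::runtime_error(
        std::string("sysconf(_SC_NPROCESSORS_ONLN) failed with error: ") +
        std::strerror(errno));
  }
  // Narrow explicitly, mirroring the static_cast in the patch.
  return static_cast<int>(n);
}
```

The value counts processors that are online when the call is made, so it can be lower than the number of configured processors.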



[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-third-party-benchmark

Author: None (llvmbot)


Changes

Backport fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e

Requested by: @brad0

---
Full diff: https://github.com/llvm/llvm-project/pull/125624.diff


1 Files Affected:

- (modified) third-party/benchmark/src/sysinfo.cc (+3-50) 


``diff
diff --git a/third-party/benchmark/src/sysinfo.cc 
b/third-party/benchmark/src/sysinfo.cc
index 2bed1663af2e955..8283a081ee80b4a 100644
--- a/third-party/benchmark/src/sysinfo.cc
+++ b/third-party/benchmark/src/sysinfo.cc
@@ -495,14 +495,14 @@ int GetNumCPUsImpl() {
   return sysinfo.dwNumberOfProcessors;  // number of logical
 // processors in the current
 // group
-#elif defined(BENCHMARK_OS_SOLARIS)
+#elif defined(__linux__) || defined(BENCHMARK_OS_SOLARIS)
   // Returns -1 in case of a failure.
-  long num_cpu = sysconf(_SC_NPROCESSORS_ONLN);
+  int num_cpu = static_cast<int>(sysconf(_SC_NPROCESSORS_ONLN));
   if (num_cpu < 0) {
 PrintErrorAndDie("sysconf(_SC_NPROCESSORS_ONLN) failed with error: ",
  strerror(errno));
   }
-  return (int)num_cpu;
+  return num_cpu;
 #elif defined(BENCHMARK_OS_QNX)
   return static_cast<int>(_syspage_ptr->num_cpu);
 #elif defined(BENCHMARK_OS_QURT)
@@ -511,53 +511,6 @@ int GetNumCPUsImpl() {
 hardware_threads.max_hthreads = 1;
   }
   return hardware_threads.max_hthreads;
-#else
-  int num_cpus = 0;
-  int max_id = -1;
-  std::ifstream f("/proc/cpuinfo");
-  if (!f.is_open()) {
-PrintErrorAndDie("Failed to open /proc/cpuinfo");
-  }
-#if defined(__alpha__)
-  const std::string Key = "cpus detected";
-#else
-  const std::string Key = "processor";
-#endif
-  std::string ln;
-  while (std::getline(f, ln)) {
-if (ln.empty()) continue;
-std::size_t split_idx = ln.find(':');
-std::string value;
-#if defined(__s390__)
-// s390 has another format in /proc/cpuinfo
-// it needs to be parsed differently
-if (split_idx != std::string::npos)
-  value = ln.substr(Key.size() + 1, split_idx - Key.size() - 1);
-#else
-if (split_idx != std::string::npos) value = ln.substr(split_idx + 1);
-#endif
-if (ln.size() >= Key.size() && ln.compare(0, Key.size(), Key) == 0) {
-  num_cpus++;
-  if (!value.empty()) {
-const int cur_id = benchmark::stoi(value);
-max_id = std::max(cur_id, max_id);
-  }
-}
-  }
-  if (f.bad()) {
-PrintErrorAndDie("Failure reading /proc/cpuinfo");
-  }
-  if (!f.eof()) {
-PrintErrorAndDie("Failed to read to end of /proc/cpuinfo");
-  }
-  f.close();
-
-  if ((max_id + 1) != num_cpus) {
-fprintf(stderr,
-"CPU ID assignments in /proc/cpuinfo seem messed up."
-" This is usually caused by a bad BIOS.\n");
-  }
-  return num_cpus;
 #endif
   BENCHMARK_UNREACHABLE();
 }

``




https://github.com/llvm/llvm-project/pull/125624


[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread via llvm-branch-commits




joaosaffran wrote:

I am not really sure we can have multiple root signatures in the backend. It is 
possible in HLSL because we can specify the entry function, therefore you can 
have multiple entries in a single file. However, when lowering into 
DXContainer, the binary format only allows a single signature to be present. 

I've reached out to other members of the team to discuss whether this is 
actually the case or if I am missing something.

https://github.com/llvm/llvm-project/pull/123147


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread Jonas Devlieghere via llvm-branch-commits

JDevlieghere wrote:

If I type `/cherrypick 13ded6829bf7ca793795c50d47dd2b95482e5cfa` will it add 
that commit to this PR?

https://github.com/llvm/llvm-project/pull/12


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread Jonas Devlieghere via llvm-branch-commits


@@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm 
API documentation." OF
 option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF)
 option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON)
 option (LLVM_ENABLE_BINDINGS "Build bindings." ON)
+option (LLVM_BUILD_TELEMETRY "Build the telemtry library. This does not enable 
telemetry." ON)

JDevlieghere wrote:

@nikic spotted it too. Fixed in 13ded6829bf7ca793795c50d47dd2b95482e5cfa. 

https://github.com/llvm/llvm-project/pull/12


[llvm-branch-commits] [clang] release/20.x: [Clang][ReleaseNotes] Document -fclang-abi-compat=19 re: #110503 (PR #125368)

2025-02-03 Thread John McCall via llvm-branch-commits

rjmccall wrote:

It's approved anyway. Thanks, Tom.

https://github.com/llvm/llvm-project/pull/125368


[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)

2025-02-03 Thread Alan Phipps via llvm-branch-commits

evodius96 wrote:

This is pertinent to #109930 from Validas as well, and they would like 
something like this to be included by default to make it less confusing for 
users of MC/DC.  On the other hand, as I point out in the issue, I think 
introducing MC/DC for single-conditions is redundant (given branch coverage) 
and introduces unnecessary overhead. So, that's an argument for keeping it as a 
separate option and perhaps finding another way to work branch coverage into 
the overall MC/DC metric.

Also, Branch Coverage gives you counts for switch statement cases, whereas I'm 
not sure that makes sense for single-condition MC/DC, though perhaps that's 
overkill.

https://github.com/llvm/llvm-project/pull/125484


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/125590

Backport 7c5100d36d8027dd205d6ec410a63c3930a1d9c1

Requested by: @topperc

>From bfc522cfd54c79a8ed833dfbb19285df05c3c4e8 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Mon, 3 Feb 2025 13:48:42 -0800
Subject: [PATCH] [RISCV] Check isFixedLengthVector before calling
 getVectorNumElements in getSingleShuffleSrc. (#125455)

I have been unsuccessful at further reducing the test. The
failure requires a shuffle with 2 scalable->fixed extracts with
the same source. 0 is the only valid index for a scalable->fixed
extract so the 2 sources must be the same extract. Shuffles with
the same source are aggressively canonicalized to a unary shuffle.
So it requires the extracts to become identical through other
optimizations without the shuffle being canonicalized before it is
lowered.

Fixes #125306.
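
The guard described above can be illustrated with a small self-contained model. This is only a sketch: `VecType` and `srcHasTwiceTheElements` are hypothetical stand-ins invented for this example, not LLVM's `MVT`/`EVT` API. It shows why the fixed-length check has to come first: asking a scalable vector for an exact element count is the invalid request, and the short-circuit `&&` avoids ever making it.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector value type (not LLVM's MVT/EVT).
struct VecType {
  unsigned MinNumElts; // exact count if fixed, minimum count if scalable
  bool Scalable;

  bool isFixedLengthVector() const { return !Scalable; }

  // Only meaningful for fixed-length vectors.
  unsigned getVectorNumElements() const {
    assert(!Scalable && "Invalid size request on a scalable vector");
    return MinNumElts;
  }
};

// True when Src is a fixed-length vector with exactly twice NumElts elements.
// The fixed-length test short-circuits, so a scalable Src never reaches
// getVectorNumElements().
bool srcHasTwiceTheElements(const VecType &Src, unsigned NumElts) {
  return Src.isFixedLengthVector() &&
         Src.getVectorNumElements() == NumElts * 2;
}
```

With the operands in the opposite order, a scalable source would trip the assertion before the fixed-length test had a chance to reject it.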

(cherry picked from commit 7c5100d36d8027dd205d6ec410a63c3930a1d9c1)
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp |   3 +-
 llvm/test/CodeGen/RISCV/rvv/pr125306.ll | 118 
 2 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/RISCV/rvv/pr125306.ll

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 8d09e534b1858bc..4ff333b1ff2f7a6 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4512,7 +4512,8 @@ static SDValue getSingleShuffleSrc(MVT VT, MVT 
ContainerVT, SDValue V1,
 
   // Src needs to have twice the number of elements.
   unsigned NumElts = VT.getVectorNumElements();
-  if (Src.getValueType().getVectorNumElements() != (NumElts * 2))
+  if (!Src.getValueType().isFixedLengthVector() ||
+  Src.getValueType().getVectorNumElements() != (NumElts * 2))
 return SDValue();
 
   // The extracts must extract the two halves of the source.
diff --git a/llvm/test/CodeGen/RISCV/rvv/pr125306.ll 
b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll
new file mode 100644
index 000..111f87de220dbfa
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll
@@ -0,0 +1,118 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=riscv64 -mattr=+m,+v | FileCheck %s
+
+; Test for an "Invalid size request on a scalable vector". Attempts to reduce
+; the test faurther were not successful. The failure requires a shuffle with 2
+; scalable->fixed extracts from the same vector. 0 is the only valid index for 
a
+; scalable->fixed extract so the 2 extract must be the same. Shuffles with the
+; same source are aggressively canonicalized to a unary shuffle so it requires
+; the extracts to become identical through other optimizations without the
+; shuffle being canonicalized before it is lowered.
+
+define <2 x i32> @main(ptr %0) {
+; CHECK-LABEL: main:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:vsetivli zero, 16, e32, m4, ta, ma
+; CHECK-NEXT:vmv.v.i v8, 0
+; CHECK-NEXT:vse32.v v8, (zero)
+; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:vmv.v.i v8, 0
+; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:vmv.v.i v10, 0
+; CHECK-NEXT:li a2, 64
+; CHECK-NEXT:sw zero, 80(zero)
+; CHECK-NEXT:lui a1, 7
+; CHECK-NEXT:lui a3, 1
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vid.v v11
+; CHECK-NEXT:li a4, 16
+; CHECK-NEXT:lui a5, 2
+; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vmv.v.i v10, 0
+; CHECK-NEXT:li a2, 24
+; CHECK-NEXT:sh zero, -392(a3)
+; CHECK-NEXT:sh zero, 534(a3)
+; CHECK-NEXT:sh zero, 1460(a3)
+; CHECK-NEXT:li a3, 32
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:li a2, 40
+; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:vse32.v v8, (a0)
+; CHECK-NEXT:sh zero, -1710(a5)
+; CHECK-NEXT:sh zero, -784(a5)
+; CHECK-NEXT:sh zero, 142(a5)
+; CHECK-NEXT:lw a5, -304(a1)
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vadd.vi v9, v11, -1
+; CHECK-NEXT:vse32.v v10, (a3)
+; CHECK-NEXT:sh zero, 0(a0)
+; CHECK-NEXT:lw a0, -188(a1)
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:lw a2, -188(a1)
+; CHECK-NEXT:lw a3, 1244(a1)
+; CHECK-NEXT:vmv.v.x v8, a0
+; CHECK-NEXT:lw a0, 1244(a1)
+; CHECK-NEXT:lw a1, -304(a1)
+; CHECK-NEXT:vmv.v.x v10, a3
+; CHECK-NEXT:vmv.v.x v11, a5
+; CHECK-NEXT:vslide1down.vx v8, v8, zero
+; CHECK-NEXT:vslide1down.vx v10, v10, zero
+; CHECK-NEXT:vmin.vv v8, v10, v8
+; CHECK-NEXT:vmv.v.x v10, a0
+; CHECK-NEXT:vslide1down.vx v11, v11, zero
+; CHECK-NEXT:vmin.vx v10, v10, a2
+; CHECK-NEXT:vmin.vx v10, v10, a1
+; CHECK-NEXT:vmin.vv v11, v8, v11
+; CHECK-NEXT:vmv1r.v v8, v10
+; CHECK-NEXT:vand.vv v9, v11, v9
+; CH

[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/125590


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)


Changes

Backport 7c5100d36d8027dd205d6ec410a63c3930a1d9c1

Requested by: @topperc

---
Full diff: https://github.com/llvm/llvm-project/pull/125590.diff


2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+2-1) 
- (added) llvm/test/CodeGen/RISCV/rvv/pr125306.ll (+118) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 8d09e534b1858b..4ff333b1ff2f7a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4512,7 +4512,8 @@ static SDValue getSingleShuffleSrc(MVT VT, MVT 
ContainerVT, SDValue V1,
 
   // Src needs to have twice the number of elements.
   unsigned NumElts = VT.getVectorNumElements();
-  if (Src.getValueType().getVectorNumElements() != (NumElts * 2))
+  if (!Src.getValueType().isFixedLengthVector() ||
+  Src.getValueType().getVectorNumElements() != (NumElts * 2))
 return SDValue();
 
   // The extracts must extract the two halves of the source.
diff --git a/llvm/test/CodeGen/RISCV/rvv/pr125306.ll 
b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll
new file mode 100644
index 00..111f87de220dbf
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll
@@ -0,0 +1,118 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=riscv64 -mattr=+m,+v | FileCheck %s
+
+; Test for an "Invalid size request on a scalable vector". Attempts to reduce
+; the test faurther were not successful. The failure requires a shuffle with 2
+; scalable->fixed extracts from the same vector. 0 is the only valid index for 
a
+; scalable->fixed extract so the 2 extract must be the same. Shuffles with the
+; same source are aggressively canonicalized to a unary shuffle so it requires
+; the extracts to become identical through other optimizations without the
+; shuffle being canonicalized before it is lowered.
+
+define <2 x i32> @main(ptr %0) {
+; CHECK-LABEL: main:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:vsetivli zero, 16, e32, m4, ta, ma
+; CHECK-NEXT:vmv.v.i v8, 0
+; CHECK-NEXT:vse32.v v8, (zero)
+; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:vmv.v.i v8, 0
+; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:vmv.v.i v10, 0
+; CHECK-NEXT:li a2, 64
+; CHECK-NEXT:sw zero, 80(zero)
+; CHECK-NEXT:lui a1, 7
+; CHECK-NEXT:lui a3, 1
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vid.v v11
+; CHECK-NEXT:li a4, 16
+; CHECK-NEXT:lui a5, 2
+; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vmv.v.i v10, 0
+; CHECK-NEXT:li a2, 24
+; CHECK-NEXT:sh zero, -392(a3)
+; CHECK-NEXT:sh zero, 534(a3)
+; CHECK-NEXT:sh zero, 1460(a3)
+; CHECK-NEXT:li a3, 32
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:li a2, 40
+; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:vse32.v v8, (a0)
+; CHECK-NEXT:sh zero, -1710(a5)
+; CHECK-NEXT:sh zero, -784(a5)
+; CHECK-NEXT:sh zero, 142(a5)
+; CHECK-NEXT:lw a5, -304(a1)
+; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma
+; CHECK-NEXT:vadd.vi v9, v11, -1
+; CHECK-NEXT:vse32.v v10, (a3)
+; CHECK-NEXT:sh zero, 0(a0)
+; CHECK-NEXT:lw a0, -188(a1)
+; CHECK-NEXT:vse32.v v10, (a2)
+; CHECK-NEXT:lw a2, -188(a1)
+; CHECK-NEXT:lw a3, 1244(a1)
+; CHECK-NEXT:vmv.v.x v8, a0
+; CHECK-NEXT:lw a0, 1244(a1)
+; CHECK-NEXT:lw a1, -304(a1)
+; CHECK-NEXT:vmv.v.x v10, a3
+; CHECK-NEXT:vmv.v.x v11, a5
+; CHECK-NEXT:vslide1down.vx v8, v8, zero
+; CHECK-NEXT:vslide1down.vx v10, v10, zero
+; CHECK-NEXT:vmin.vv v8, v10, v8
+; CHECK-NEXT:vmv.v.x v10, a0
+; CHECK-NEXT:vslide1down.vx v11, v11, zero
+; CHECK-NEXT:vmin.vx v10, v10, a2
+; CHECK-NEXT:vmin.vx v10, v10, a1
+; CHECK-NEXT:vmin.vv v11, v8, v11
+; CHECK-NEXT:vmv1r.v v8, v10
+; CHECK-NEXT:vand.vv v9, v11, v9
+; CHECK-NEXT:vslideup.vi v8, v10, 1
+; CHECK-NEXT:vse32.v v9, (a4)
+; CHECK-NEXT:sh zero, 0(zero)
+; CHECK-NEXT:ret
+entry:
+  store <16 x i32> zeroinitializer, ptr null, align 4
+  store <8 x i32> zeroinitializer, ptr %0, align 4
+  store <4 x i32> zeroinitializer, ptr getelementptr inbounds nuw (i8, ptr 
null, i64 64), align 4
+  store i32 0, ptr getelementptr inbounds nuw (i8, ptr null, i64 80), align 4
+  %1 = load i32, ptr getelementptr inbounds nuw (i8, ptr null, i64 29916), 
align 4
+  %broadcast.splatinsert53 = insertelement <4 x i32> zeroinitializer, i32 %1, 
i64 0
+  %2 = load i32, ptr getelementptr inbounds nuw (i8, ptr null, i64 28484), 
align 4
+  %broadcast.splatinsert55 = insertelement <4 x i32> zeroinitializer, i32 %2, 
i64 0
+  %3 = call <4 x i32> @llvm.smin.v4i32(<4 x

[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Tom Eccles via llvm-branch-commits


@@ -2612,7 +2612,54 @@ static void
 genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) {
-  TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct");
+  mlir::Location loc = converter.genLocation(declareMapperConstruct.source);
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  lower::StatementContext stmtCtx;
+  const auto &spec =
+  std::get(declareMapperConstruct.t);
+  const auto &mapperName{std::get>(spec.t)};
+  const auto &varType{std::get(spec.t)};
+  const auto &varName{std::get(spec.t)};
+  assert(varType.declTypeSpec->category() ==
+ semantics::DeclTypeSpec::Category::TypeDerived &&
+ "Expected derived type");
+
+  std::string mapperNameStr;
+  if (mapperName.has_value()) {
+mapperNameStr = mapperName->ToString();
+mapperNameStr =
+converter.mangleName(mapperNameStr, mapperName->symbol->owner());
+  } else {
+mapperNameStr =
+varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default";
+mapperNameStr = converter.mangleName(
+mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope());
+  }
+
+  // Save insert point just after the DeclMapperOp.
+  mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint();
+
+  firOpBuilder.setInsertionPointToStart(converter.getModuleOp().getBody());
+  auto mlirType = converter.genType(varType.declTypeSpec->derivedTypeSpec());
+  auto declMapperOp = firOpBuilder.create(
+  loc, mapperNameStr, mlirType);
+  converter.getMLIRSymbolTable()->insert(declMapperOp);
+  auto &region = declMapperOp.getRegion();
+  firOpBuilder.createBlock(&region);
+  auto varVal = region.addArgument(firOpBuilder.getRefType(mlirType), loc);
+  converter.bindSymbol(*varName.symbol, varVal);
+
+  // Populate the declareMapper region with the map information.
+  mlir::omp::DeclareMapperInfoOperands clauseOps;
+  const auto *clauseList{
+  parser::Unwrap(declareMapperConstruct.t)};
+  List clauses = makeClauses(*clauseList, semaCtx);
+  ClauseProcessor cp(converter, semaCtx, clauses);
+  cp.processMap(loc, stmtCtx, clauseOps);
+  firOpBuilder.create(loc, clauseOps.mapVars);
+
+  // Restore the insert point to just after the DeclareMapperOp.
+  firOpBuilder.restoreInsertionPoint(insPt);

tblah wrote:

With my change to use the insertion point guard above
```suggestion
```

https://github.com/llvm/llvm-project/pull/117046


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/117046


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Tom Eccles via llvm-branch-commits


@@ -2612,7 +2612,54 @@ static void
 genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) {
-  TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct");
+  mlir::Location loc = converter.genLocation(declareMapperConstruct.source);
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  lower::StatementContext stmtCtx;
+  const auto &spec =
+  std::get(declareMapperConstruct.t);
+  const auto &mapperName{std::get>(spec.t)};
+  const auto &varType{std::get(spec.t)};
+  const auto &varName{std::get(spec.t)};
+  assert(varType.declTypeSpec->category() ==
+ semantics::DeclTypeSpec::Category::TypeDerived &&
+ "Expected derived type");
+
+  std::string mapperNameStr;
+  if (mapperName.has_value()) {
+mapperNameStr = mapperName->ToString();
+mapperNameStr =
+converter.mangleName(mapperNameStr, mapperName->symbol->owner());
+  } else {
+mapperNameStr =
+varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default";
+mapperNameStr = converter.mangleName(
+mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope());
+  }
+
+  // Save insert point just after the DeclMapperOp.
+  mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint();

tblah wrote:

```suggestion
  // Save current insertion point before moving to the module scope to create 
the DeclareMapperOp
  mlir::OpBuilder::InsertionGuard guard(builder);
```

I don't think it makes sense to say the insert point is before or after the 
DeclMapperOp because the DeclMapperOp will be at module scope. I've also 
changed to use the insertion point guard because it is less error prone if a 
conditional early return is added later.
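
To illustrate the early-return point, here is a minimal sketch of the RAII pattern behind an insertion guard. `Builder`, `InsertionGuard` and `emitAtModuleScope` below are simplified, hypothetical stand-ins written for this example (the insertion point is modelled as a plain integer); they are not the MLIR classes themselves.

```cpp
// Hypothetical, simplified builder: the insertion point is just an int.
class Builder {
public:
  int insertionPoint() const { return Point; }
  void setInsertionPoint(int P) { Point = P; }

private:
  int Point = 0;
};

// Saves the insertion point on construction and restores it on destruction,
// so every exit path from the enclosing scope restores it.
class InsertionGuard {
public:
  explicit InsertionGuard(Builder &B) : B(B), Saved(B.insertionPoint()) {}
  ~InsertionGuard() { B.setInsertionPoint(Saved); }
  InsertionGuard(const InsertionGuard &) = delete;
  InsertionGuard &operator=(const InsertionGuard &) = delete;

private:
  Builder &B;
  int Saved;
};

// Hypothetical lowering step: move to "module scope" (position 0), do some
// work, and possibly bail out early.
bool emitAtModuleScope(Builder &B, bool BailOutEarly) {
  InsertionGuard Guard(B);
  B.setInsertionPoint(0);
  if (BailOutEarly)
    return false; // Guard's destructor still restores the saved point.
  // ... create the operation at module scope here ...
  return true;
}
```

With a manual save and restore pair, the early `return false` would leave the builder at module scope unless a restore call were added on that path as well; with the guard, no path can forget it.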

https://github.com/llvm/llvm-project/pull/117046


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah commented:

I have some minor suggestions on the code. Please wait for review from somebody 
with more familiarity with omp target things, and this is conditional on the 
design of the MLIR operation being approved.

https://github.com/llvm/llvm-project/pull/117046


[llvm-branch-commits] [lld] release/20.x: [ELF] Refine isExported/isPreemptible condition (PR #125334)

2025-02-03 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

Reverted in 
https://github.com/llvm/llvm-project/commit/b84f7d17f84030092880857544e13d26a2507c62,
 as this has been failing all pre-merge tests for the last two or three days 
already.

https://github.com/llvm/llvm-project/pull/125334


[llvm-branch-commits] [clang] [clang] NFC: rename MatchedPackOnParmToNonPackOnArg to StrictPackMatch (PR #125418)

2025-02-03 Thread via llvm-branch-commits

https://github.com/cor3ntin edited 
https://github.com/llvm/llvm-project/pull/125418


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Akash Banerjee via llvm-branch-commits

TIFitis wrote:

Polite request for review 🙂

https://github.com/llvm/llvm-project/pull/117046


[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)

2025-02-03 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur edited 
https://github.com/llvm/llvm-project/pull/125442




[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-03 Thread Akash Banerjee via llvm-branch-commits


@@ -2612,7 +2612,54 @@ static void
 genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) {
-  TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct");
+  mlir::Location loc = converter.genLocation(declareMapperConstruct.source);
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  lower::StatementContext stmtCtx;
+  const auto &spec =
+  std::get(declareMapperConstruct.t);
+  const auto &mapperName{std::get>(spec.t)};
+  const auto &varType{std::get(spec.t)};
+  const auto &varName{std::get(spec.t)};
+  assert(varType.declTypeSpec->category() ==
+ semantics::DeclTypeSpec::Category::TypeDerived &&
+ "Expected derived type");
+
+  std::string mapperNameStr;
+  if (mapperName.has_value()) {
+mapperNameStr = mapperName->ToString();
+mapperNameStr =
+converter.mangleName(mapperNameStr, mapperName->symbol->owner());
+  } else {
+mapperNameStr =
+varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default";
+mapperNameStr = converter.mangleName(
+mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope());
+  }
+
+  // Save insert point just after the DeclMapperOp.
+  mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint();

TIFitis wrote:

Done :)

https://github.com/llvm/llvm-project/pull/117046
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)

2025-02-03 Thread Justin Bogner via llvm-branch-commits


@@ -80,6 +85,99 @@ class RootSignatureLexer {
   }
 };
 
+class RootSignatureParser {
+public:
+  RootSignatureParser(SmallVector &Elements,
+  const SmallVector &Tokens,
+  DiagnosticsEngine &Diags);
+
+  // Iterates over the provided tokens and constructs the in-memory
+  // representations of the RootElements.
+  //
+  // The return value denotes if there was a failure and the method will
+  // return on the first encountered failure, or, return false if it
+  // can successfully reach the end of the tokens.

bogner wrote:

If you use `///` instead of `//` for the doc comments on the methods of this 
class they'll show up in the [LLVM doxygen](https://llvm.org/doxygen/).
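
As a rough illustration of the suggestion (the `TokenCounter` class below is a hypothetical stand-in, not code from the PR), Doxygen treats `///` comments as documentation for the declaration that follows, while plain `//` comments are skipped:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parser under review; not the PR's class.
class TokenCounter {
public:
  explicit TokenCounter(std::vector<std::string> Tokens)
      : Tokens(std::move(Tokens)) {}

  /// Walks the provided tokens and counts them.
  ///
  /// \returns true on the first failure (an empty token), or false if the
  /// end of the tokens is reached successfully.
  bool parse() {
    for (const std::string &Tok : Tokens) {
      if (Tok.empty())
        return true;
      ++Count;
    }
    return false;
  }

  // A plain comment like this one is not picked up by Doxygen.
  unsigned count() const { return Count; }

private:
  std::vector<std::string> Tokens;
  unsigned Count = 0;
};
```

Only the comment marker changes; the comment text itself can stay as written.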

https://github.com/llvm/llvm-project/pull/122982
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)

2025-02-03 Thread Justin Bogner via llvm-branch-commits


@@ -15,16 +15,21 @@
 
 #include "clang/AST/APValue.h"
 #include "clang/Basic/DiagnosticLex.h"
+#include "clang/Basic/DiagnosticParse.h"
 #include "clang/Lex/LiteralSupport.h"
 #include "clang/Lex/Preprocessor.h"
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 
+#include "llvm/Frontend/HLSL/HLSLRootSignature.h"
+
 namespace clang {
 namespace hlsl {
 
+namespace rs = llvm::hlsl::root_signature;

bogner wrote:

I'm not convinced the brevity this affords later is really worth it, and since 
this is in a public header it introduces `clang::hlsl::rs` into far more scopes 
than just this file, which IMO is just a recipe for confusion.
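
A minimal sketch of the scoping concern, using made-up namespaces (`vendor::deeply::nested` and `app`) rather than the real LLVM and Clang ones: an alias declared at namespace scope in a header is visible to every file that includes it, whereas an alias declared inside a function stays local.

```cpp
#include <cassert>

namespace vendor {
namespace deeply {
namespace nested {
inline int answer() { return 42; }
} // namespace nested
} // namespace deeply
} // namespace vendor

namespace app {

// Namespace-scope alias: if this lived in a header, every includer would
// also see the name app::dn.
namespace dn = vendor::deeply::nested;

inline int viaNamespaceScopeAlias() { return dn::answer(); }

inline int viaLocalAlias() {
  // Block-scope alias: visible only inside this function body.
  namespace local = vendor::deeply::nested;
  return local::answer();
}

} // namespace app
```

Both functions compile to the same call; the difference is only in how far the short name leaks.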

https://github.com/llvm/llvm-project/pull/122982
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)

2025-02-03 Thread Justin Bogner via llvm-branch-commits


@@ -80,6 +85,99 @@ class RootSignatureLexer {
   }
 };
 
+class RootSignatureParser {
+public:
+  RootSignatureParser(SmallVector &Elements,
+  const SmallVector &Tokens,
+  DiagnosticsEngine &Diags);
+
+  // Iterates over the provided tokens and constructs the in-memory
+  // representations of the RootElements.
+  //
+  // The return value denotes if there was a failure and the method will
+  // return on the first encountered failure, or, return false if it
+  // can successfully reach the end of the tokens.
+  bool Parse();
+
+private:
+  // Root Element helpers
+  bool ParseRootElement(bool First);
+  bool ParseDescriptorTable();
+  bool ParseDescriptorTableClause();
+
+  // Helper dispatch method
+  //
+  // These will switch on the Variant kind to dispatch to the respective Parse
+  // method and store the parsed value back into Ref.
+  //
+  // It is helpful to have a generalized dispatch method so that when we need
+  // to parse multiple optional parameters in any order, we can invoke this
+  // method
+  bool ParseParam(rs::ParamType Ref);
+
+  // Parse as many optional parameters as possible in any order
+  bool
+  ParseOptionalParams(llvm::SmallDenseMap RefMap);

bogner wrote:

Do you really want to be passing a `SmallDenseMap` by value here? This and the 
other similar APIs should probably be using references.
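
To make the cost concrete, here is a small sketch that uses `std::map` and a copy-counting value type in place of `SmallDenseMap` (all names below are hypothetical, chosen only for illustration). Passing the map by value copies every element; passing by const reference copies none.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical value type that counts how often it is copied.
struct Tracked {
  static int Copies;
  Tracked() = default;
  Tracked(const Tracked &) { ++Copies; }
  Tracked &operator=(const Tracked &) = default;
};
int Tracked::Copies = 0;

using ParamMap = std::map<int, Tracked>;

// By value: the caller's map is copied element by element.
std::size_t sizeByValue(ParamMap M) { return M.size(); }

// By const reference: no copy is made.
std::size_t sizeByRef(const ParamMap &M) { return M.size(); }
```

Whether the copy matters in the parser depends on how large the map gets and how often the function is called, but a reference avoids the question entirely.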

https://github.com/llvm/llvm-project/pull/122982
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)

2025-02-03 Thread Justin Bogner via llvm-branch-commits


@@ -148,6 +148,347 @@ bool RootSignatureLexer::LexToken(RootSignatureToken &Result) {
   return false;
 }
 
+// Parser Definitions
+
+RootSignatureParser::RootSignatureParser(
+SmallVector &Elements,
+const SmallVector &Tokens)
+: Elements(Elements) {
+  CurTok = Tokens.begin();
+  LastTok = Tokens.end();
+}
+
+bool RootSignatureParser::ReportError() { return true; }
+
+bool RootSignatureParser::Parse() {
+  // Handle edge-case of empty RootSignature()
+  if (CurTok == LastTok)
+return false;
+
+  // Iterate as many RootElements as possible
+  bool HasComma = true;
+  while (HasComma &&
+ IsCurExpectedToken(ArrayRef{TokenKind::kw_DescriptorTable})) {
+if (ParseRootElement())
+  return true;
+HasComma = !TryConsumeExpectedToken(TokenKind::pu_comma);
+if (HasComma)
+  ConsumeNextToken();
+  }
+
+  if (HasComma)
+return ReportError(); // report 'comma' denotes a required extra item
+
+  // Ensure that we are at the end of the tokens
+  CurTok++;
+  if (CurTok != LastTok)
+return ReportError(); // report expected end of input but got more
+  return false;
+}
+
+bool RootSignatureParser::ParseRootElement() {
+  // Dispatch onto the correct parse method
+  switch (CurTok->Kind) {
+  case TokenKind::kw_DescriptorTable:
+return ParseDescriptorTable();
+  default:
+llvm_unreachable("Switch for an expected token was not provided");
+return true;

bogner wrote:

`llvm_unreachable` doesn't return (it is annotated as a "noreturn" function), 
so the `return true` is unreachable code here. This won't generally lead to the 
parser erroring out in practice, because the program will crash or hit UB due 
to the `llvm_unreachable`. Given that we expect to have actual error handling 
for the token kind being appropriate just above, this is probably okay, but it 
does mean that the `return true` is unnecessary.
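
A self-contained sketch of the point, with a hypothetical `unreachable` helper standing in for `llvm_unreachable` (it simply calls `std::abort`): once the default case ends in a call to a `[[noreturn]]` function, no trailing `return` is needed.

```cpp
#include <cassert>
#include <cstdlib>

enum class TokenKind { DescriptorTable, Comma };

// Hypothetical stand-in for llvm_unreachable: marked noreturn, so control
// never comes back to the caller.
[[noreturn]] inline void unreachable(const char *) { std::abort(); }

// Returns false on success, mirroring the convention in the patch.
inline bool parseRootElement(TokenKind Kind) {
  switch (Kind) {
  case TokenKind::DescriptorTable:
    return false;
  default:
    unreachable("Switch for an expected token was not provided");
    // No `return true;` here: it could never execute.
  }
}
```

Calling `parseRootElement(TokenKind::Comma)` in this sketch aborts the program rather than returning an error, which is the behaviour the comment above describes.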

https://github.com/llvm/llvm-project/pull/122982
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/12

Backport bac62ee

Requested by: @JDevlieghere

>From fa13ea757382a003468e2c30d978eba0a1dfcbf5 Mon Sep 17 00:00:00 2001
From: Jonas Devlieghere 
Date: Mon, 3 Feb 2025 10:35:14 -0800
Subject: [PATCH] [llvm] Add CMake flag to compile out the telemetry framework
 (#124850)

Add a CMake flag (LLVM_BUILD_TELEMETRY) to disable building the
telemetry framework. The flag being enabled does *not* mean that
telemetry is being collected, it merely means we're building the generic
telemetry framework. Hence the flag is enabled by default.

Motivated by this Discourse thread:
https://discourse.llvm.org/t/how-to-disable-building-llvm-clang-telemetry/84305

(cherry picked from commit bac62ee5b473e70981a6bd9759ec316315fca07d)
---
 llvm/CMakeLists.txt   | 1 +
 llvm/lib/CMakeLists.txt   | 4 +++-
 llvm/unittests/CMakeLists.txt | 4 +++-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index c9ff3696e22d698..d1b4c2700ce8ef7 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm API documentation." OF
 option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF)
 option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON)
 option (LLVM_ENABLE_BINDINGS "Build bindings." ON)
+option (LLVM_BUILD_TELEMETRY "Build the telemetry library. This does not enable telemetry." ON)
 
 set(LLVM_INSTALL_DOXYGEN_HTML_DIR "${CMAKE_INSTALL_DOCDIR}/llvm/doxygen-html"
 CACHE STRING "Doxygen-generated HTML documentation install directory")
diff --git a/llvm/lib/CMakeLists.txt b/llvm/lib/CMakeLists.txt
index f6465612d30c0b4..d0a2bc929438179 100644
--- a/llvm/lib/CMakeLists.txt
+++ b/llvm/lib/CMakeLists.txt
@@ -41,7 +41,9 @@ add_subdirectory(ProfileData)
 add_subdirectory(Passes)
 add_subdirectory(TargetParser)
 add_subdirectory(TextAPI)
-add_subdirectory(Telemetry)
+if (LLVM_BUILD_TELEMETRY)
+  add_subdirectory(Telemetry)
+endif()
 add_subdirectory(ToolDrivers)
 add_subdirectory(XRay)
 if (LLVM_INCLUDE_TESTS)
diff --git a/llvm/unittests/CMakeLists.txt b/llvm/unittests/CMakeLists.txt
index 81abce51b8939f0..12e229b1c349840 100644
--- a/llvm/unittests/CMakeLists.txt
+++ b/llvm/unittests/CMakeLists.txt
@@ -63,7 +63,9 @@ add_subdirectory(Support)
 add_subdirectory(TableGen)
 add_subdirectory(Target)
 add_subdirectory(TargetParser)
-add_subdirectory(Telemetry)
+if (LLVM_BUILD_TELEMETRY)
+  add_subdirectory(Telemetry)
+endif()
 add_subdirectory(Testing)
 add_subdirectory(TextAPI)
 add_subdirectory(Transforms)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/12
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:

@oontvoo What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/12
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)

2025-02-03 Thread Vy Nguyen via llvm-branch-commits

https://github.com/oontvoo approved this pull request.


https://github.com/llvm/llvm-project/pull/12
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-02-03 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

It had the 'needs review' status and I missed it.  We can get it into -rc2.

https://github.com/llvm/llvm-project/pull/125057
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Parse METADIRECTIVE in specification part (PR #123397)

2025-02-03 Thread Kiran Chandramohan via llvm-branch-commits

https://github.com/kiranchandramohan approved this pull request.

LG.

Declarative directives have to be propagated to module files, but this is not 
required for the purpose of generating TODOs.

https://github.com/llvm/llvm-project/pull/123397
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)

2025-02-03 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur ready_for_review 
https://github.com/llvm/llvm-project/pull/125442
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/125498

Backport 359a9131704277bce0f806de31ac887e68a66902 
689ef5fda0ab07dfc452cb16d3646d53e612cb75

Requested by: @mgorny

>From 94ba87f5c8faafa63ece54849362bab8a168ae00 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= 
Date: Sun, 2 Feb 2025 16:55:22 +0100
Subject: [PATCH 1/2] [offload] `gnu::format` with variadic template functions
 is Clang-only (#124406)

Use `gnu::format` attribute only when compiling with Clang, as using it
against variadic template functions is a Clang extension and is not
supported by GCC.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958

Fixes #119069

(cherry picked from commit 359a9131704277bce0f806de31ac887e68a66902)
---
 .../common/include/ErrorReporting.h| 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/offload/plugins-nextgen/common/include/ErrorReporting.h b/offload/plugins-nextgen/common/include/ErrorReporting.h
index 8478977a8f86af0..2ad0f2b7dd6c651 100644
--- a/offload/plugins-nextgen/common/include/ErrorReporting.h
+++ b/offload/plugins-nextgen/common/include/ErrorReporting.h
@@ -80,8 +80,10 @@ class ErrorReporter {
   /// Print \p Format, instantiated with \p Args to stderr.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 1, 2)]] static void print(const char *Format,
-  ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 1, 2)]]
+#endif
+  static void print(const char *Format, ArgsTy &&...Args) {
 raw_fd_ostream OS(STDERR_FILENO, false);
 OS << llvm::format(Format, Args...);
   }
@@ -89,8 +91,10 @@ class ErrorReporter {
   /// Print \p Format, instantiated with \p Args to stderr, but colored.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 2, 3)]] static void
-  print(ColorTy Color, const char *Format, ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 2, 3)]]
+#endif
+  static void print(ColorTy Color, const char *Format, ArgsTy &&...Args) {
 raw_fd_ostream OS(STDERR_FILENO, false);
 WithColor(OS, HighlightColor(Color)) << llvm::format(Format, Args...);
   }
@@ -99,8 +103,10 @@ class ErrorReporter {
   /// a banner.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 1, 2)]] static void reportError(const char *Format,
-ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 1, 2)]]
+#endif
+  static void reportError(const char *Format, ArgsTy &&...Args) {
 print(BoldRed, "%s", ErrorBanner);
 print(BoldRed, Format, Args...);
 print("\n");

>From 1225c2eaf8c281ad9f83b49645779930c7dc2284 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= 
Date: Sun, 2 Feb 2025 16:55:39 +0100
Subject: [PATCH 2/2] [offload] [test] Use test compiler ID rather than host
 (#124408)

Use the test compiler ID to verify whether tests can be run rather than
the host compiler. This makes it possible to run tests (with Clang)
while the library itself was built with GCC.

(cherry picked from commit 689ef5fda0ab07dfc452cb16d3646d53e612cb75)
---
 offload/test/CMakeLists.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/offload/test/CMakeLists.txt b/offload/test/CMakeLists.txt
index 8a827e0a625eff0..4768d9ccf223bb4 100644
--- a/offload/test/CMakeLists.txt
+++ b/offload/test/CMakeLists.txt
@@ -1,6 +1,6 @@
 # CMakeLists.txt file for unit testing OpenMP offloading runtime library.
-if(NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang" OR
-   CMAKE_CXX_COMPILER_VERSION VERSION_LESS 6.0.0)
+if(NOT OPENMP_TEST_COMPILER_ID STREQUAL "Clang" OR
+   OPENMP_TEST_COMPILER_VERSION VERSION_LESS 6.0.0)
   message(STATUS "Can only test with Clang compiler in version 6.0.0 or later.")
   message(WARNING "The check-offload target will not be available!")
   return()

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/125498
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:

@thesamesam What do you think about merging this PR to the release 
branch?

https://github.com/llvm/llvm-project/pull/125498
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)

2025-02-03 Thread Joseph Huber via llvm-branch-commits

https://github.com/jhuber6 approved this pull request.


https://github.com/llvm/llvm-project/pull/125498
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-offload

Author: None (llvmbot)


Changes

Backport 359a9131704277bce0f806de31ac887e68a66902 
689ef5fda0ab07dfc452cb16d3646d53e612cb75

Requested by: @mgorny

---
Full diff: https://github.com/llvm/llvm-project/pull/125498.diff


2 Files Affected:

- (modified) offload/plugins-nextgen/common/include/ErrorReporting.h (+12-6) 
- (modified) offload/test/CMakeLists.txt (+2-2) 


```diff
diff --git a/offload/plugins-nextgen/common/include/ErrorReporting.h b/offload/plugins-nextgen/common/include/ErrorReporting.h
index 8478977a8f86af0..2ad0f2b7dd6c651 100644
--- a/offload/plugins-nextgen/common/include/ErrorReporting.h
+++ b/offload/plugins-nextgen/common/include/ErrorReporting.h
@@ -80,8 +80,10 @@ class ErrorReporter {
   /// Print \p Format, instantiated with \p Args to stderr.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 1, 2)]] static void print(const char *Format,
-  ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 1, 2)]]
+#endif
+  static void print(const char *Format, ArgsTy &&...Args) {
 raw_fd_ostream OS(STDERR_FILENO, false);
 OS << llvm::format(Format, Args...);
   }
@@ -89,8 +91,10 @@ class ErrorReporter {
   /// Print \p Format, instantiated with \p Args to stderr, but colored.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 2, 3)]] static void
-  print(ColorTy Color, const char *Format, ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 2, 3)]]
+#endif
+  static void print(ColorTy Color, const char *Format, ArgsTy &&...Args) {
 raw_fd_ostream OS(STDERR_FILENO, false);
 WithColor(OS, HighlightColor(Color)) << llvm::format(Format, Args...);
   }
@@ -99,8 +103,10 @@ class ErrorReporter {
   /// a banner.
   /// TODO: Allow redirection into a file stream.
   template <typename... ArgsTy>
-  [[gnu::format(__printf__, 1, 2)]] static void reportError(const char *Format,
-ArgsTy &&...Args) {
+#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
+  [[gnu::format(__printf__, 1, 2)]]
+#endif
+  static void reportError(const char *Format, ArgsTy &&...Args) {
 print(BoldRed, "%s", ErrorBanner);
 print(BoldRed, Format, Args...);
 print("\n");
diff --git a/offload/test/CMakeLists.txt b/offload/test/CMakeLists.txt
index 8a827e0a625eff0..4768d9ccf223bb4 100644
--- a/offload/test/CMakeLists.txt
+++ b/offload/test/CMakeLists.txt
@@ -1,6 +1,6 @@
 # CMakeLists.txt file for unit testing OpenMP offloading runtime library.
-if(NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang" OR
-   CMAKE_CXX_COMPILER_VERSION VERSION_LESS 6.0.0)
+if(NOT OPENMP_TEST_COMPILER_ID STREQUAL "Clang" OR
+   OPENMP_TEST_COMPILER_VERSION VERSION_LESS 6.0.0)
   message(STATUS "Can only test with Clang compiler in version 6.0.0 or later.")
   message(WARNING "The check-offload target will not be available!")
   return()

```
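
The guard used in the first patch can be tried in isolation. The sketch below is not the offload plugin's `ErrorReporter`; `formatMessage` is a hypothetical helper built on `std::snprintf`, and it assumes a Clang recent enough to accept the `format` attribute on a variadic template. With GCC the attribute is simply compiled out.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only. The format attribute is applied
// only under Clang, because GCC does not support it on variadic templates
// (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958).
template <typename... ArgsTy>
#ifdef __clang__
[[gnu::format(__printf__, 1, 2)]]
#endif
std::string formatMessage(const char *Format, ArgsTy... Args) {
  char Buffer[128];
  std::snprintf(Buffer, sizeof(Buffer), Format, Args...);
  return Buffer;
}
```

With the attribute active, Clang can check call sites such as `formatMessage("%d-%s", 7, "x")` against the format string; without it the code still builds, just without that checking.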




https://github.com/llvm/llvm-project/pull/125498
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)

2025-02-03 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.


https://github.com/llvm/llvm-project/pull/125515
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [MC/DC] Enable usage of `!` among `&&` and `||` (PR #125406)

2025-02-03 Thread NAKAMURA Takumi via llvm-branch-commits

https://github.com/chapuni updated 
https://github.com/llvm/llvm-project/pull/125406

>From f2cf50e10b59d7d461967baef4d589c9282d0f6d Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Sun, 2 Feb 2025 22:11:51 +0900
Subject: [PATCH 1/2] [MC/DC] Enable usage of `!` among `&&` and `||`

In the current implementation, `!(a || b) && c` was not treated as one
Decision with three terms.

Fixes #124563
---
 clang/include/clang/AST/IgnoreExpr.h  |  8 ++
 clang/lib/CodeGen/CodeGenFunction.cpp | 12 ++-
 clang/lib/CodeGen/CodeGenPGO.cpp  |  8 +-
 clang/lib/CodeGen/CoverageMappingGen.cpp  | 12 +++
 .../test/CoverageMapping/mcdc-nested-expr.cpp |  6 +-
 clang/test/Profile/c-mcdc-not.c   | 95 +++
 6 files changed, 132 insertions(+), 9 deletions(-)

diff --git a/clang/include/clang/AST/IgnoreExpr.h 
b/clang/include/clang/AST/IgnoreExpr.h
index 917bada61fa6fdd..c48c0c0daf81517 100644
--- a/clang/include/clang/AST/IgnoreExpr.h
+++ b/clang/include/clang/AST/IgnoreExpr.h
@@ -134,6 +134,14 @@ inline Expr *IgnoreElidableImplicitConstructorSingleStep(Expr *E) {
   return E;
 }
 
+inline Expr *IgnoreUOpLNotSingleStep(Expr *E) {
+  if (auto *UO = dyn_cast<UnaryOperator>(E)) {
+if (UO->getOpcode() == UO_LNot)
+  return UO->getSubExpr();
+  }
+  return E;
+}
+
 inline Expr *IgnoreImplicitAsWrittenSingleStep(Expr *E) {
   if (auto *ICE = dyn_cast<ImplicitCastExpr>(E))
 return ICE->getSubExprAsWritten();
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp 
b/clang/lib/CodeGen/CodeGenFunction.cpp
index bbef277a524480b..2c380ac926b1e70 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -27,6 +27,7 @@
 #include "clang/AST/Decl.h"
 #include "clang/AST/DeclCXX.h"
 #include "clang/AST/Expr.h"
+#include "clang/AST/IgnoreExpr.h"
 #include "clang/AST/StmtCXX.h"
 #include "clang/AST/StmtObjC.h"
 #include "clang/Basic/Builtins.h"
@@ -1748,12 +1749,13 @@ bool CodeGenFunction::ConstantFoldsToSimpleInteger(const Expr *Cond,
 
 /// Strip parentheses and simplistic logical-NOT operators.
 const Expr *CodeGenFunction::stripCond(const Expr *C) {
-  while (const UnaryOperator *Op = dyn_cast<UnaryOperator>(C->IgnoreParens())) {
-if (Op->getOpcode() != UO_LNot)
-  break;
-C = Op->getSubExpr();
+  while (true) {
+const Expr *SC =
+IgnoreExprNodes(C, IgnoreParensSingleStep, IgnoreUOpLNotSingleStep);
+if (C == SC)
+  return SC;
+C = SC;
   }
-  return C->IgnoreParens();
 }
 
 /// Determine whether the given condition is an instrumentable condition
diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp
index 792373839107f0a..0fd49b880bba305 100644
--- a/clang/lib/CodeGen/CodeGenPGO.cpp
+++ b/clang/lib/CodeGen/CodeGenPGO.cpp
@@ -247,8 +247,9 @@ struct MapRegionCounters : public RecursiveASTVisitor<MapRegionCounters> {
 }
 
 if (const Expr *E = dyn_cast<Expr>(S)) {
-  const BinaryOperator *BinOp = dyn_cast<BinaryOperator>(E->IgnoreParens());
-  if (BinOp && BinOp->isLogicalOp()) {
+  if (const auto *BinOp =
+  dyn_cast<BinaryOperator>(CodeGenFunction::stripCond(E));
+  BinOp && BinOp->isLogicalOp()) {
 /// Check for "split-nested" logical operators. This happens when a new
 /// boolean expression logical-op nest is encountered within an existing
 /// boolean expression, separated by a non-logical operator.  For
@@ -280,7 +281,8 @@ struct MapRegionCounters : public RecursiveASTVisitor<MapRegionCounters> {
   return true;
 
 if (const Expr *E = dyn_cast<Expr>(S)) {
-  const BinaryOperator *BinOp = dyn_cast<BinaryOperator>(E->IgnoreParens());
+  const BinaryOperator *BinOp =
+  dyn_cast<BinaryOperator>(CodeGenFunction::stripCond(E));
   if (BinOp && BinOp->isLogicalOp()) {
 assert(LogOpStack.back() == BinOp);
 LogOpStack.pop_back();
diff --git a/clang/lib/CodeGen/CoverageMappingGen.cpp 
b/clang/lib/CodeGen/CoverageMappingGen.cpp
index f09157771d2b5c0..9bf73cf27a5fa9a 100644
--- a/clang/lib/CodeGen/CoverageMappingGen.cpp
+++ b/clang/lib/CodeGen/CoverageMappingGen.cpp
@@ -799,6 +799,12 @@ struct MCDCCoverageBuilder {
   /// Return the LHS Decision ([0,0] if not set).
   const mcdc::ConditionIDs &back() const { return DecisionStack.back(); }
 
+  void swapConds() {
+if (DecisionStack.empty())
+  return;
+
+std::swap(DecisionStack.back()[false], DecisionStack.back()[true]);
+  }
   /// Push the binary operator statement to track the nest level and assign IDs
   /// to the operator's LHS and RHS.  The RHS may be a larger subtree that is
   /// broken up on successive levels.
@@ -2241,6 +2247,12 @@ struct CounterCoverageMappingBuilder
 SM.isInSystemHeader(SM.getSpellingLoc(E->getEndLoc())));
   }
 
+  void VisitUnaryLNot(const UnaryOperator *E) {
+MCDCBuilder.swapConds();
+Visit(E->getSubExpr());
+MCDCBuilder.swapConds();
+  }
+
   void VisitBinLAnd(const BinaryOperator *E) {
 if (isExprInSystemHeader(E)) {
   LeafExprSet.insert(E);
diff --git a/clang/test/CoverageMapping/mcdc-nested-expr.cp

[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-powerpc

Author: Matt Arsenault (arsenm)


Changes

This was ultimately working around bugs in subregister handling
in peephole-opt. In the common case, it would give up on folding
anything into a subregister extract copy.

---

Patch is 76.32 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/125535.diff


9 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (-24) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (-5) 
- (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+22-22) 
- (modified) llvm/test/CodeGen/AMDGPU/ctpop64.ll (+18-18) 
- (modified) llvm/test/CodeGen/AMDGPU/idot2.ll (+91-91) 
- (modified) llvm/test/CodeGen/AMDGPU/load-global-i32.ll (+42-43) 
- (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-fold-reg-sequence-subreg.mir 
(+4-4) 
- (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir (+2-2) 
- (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+239-237) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 6fc57dec6a8264..71c720ed09b5fb 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const {
  OpType <= AMDGPU::OPERAND_SRC_LAST;
 }
 
-bool SIRegisterInfo::shouldRewriteCopySrc(
-  const TargetRegisterClass *DefRC,
-  unsigned DefSubReg,
-  const TargetRegisterClass *SrcRC,
-  unsigned SrcSubReg) const {
-  // We want to prefer the smallest register class possible, so we don't want to
-  // stop and rewrite on anything that looks like a subregister
-  // extract. Operations mostly don't care about the super register class, so we
-  // only want to stop on the most basic of copies between the same register
-  // class.
-  //
-  // e.g. if we have something like
-  // %0 = ...
-  // %1 = ...
-  // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2
-  // %3 = COPY %2, sub0
-  //
-  // We want to look through the COPY to find:
-  //  => %3 = COPY %0
-
-  // Plain copy.
-  return getCommonSubClass(DefRC, SrcRC) != nullptr;
-}
-
 bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const {
   // TODO: 64-bit operands have extending behavior from 32-bit literal.
   return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST &&
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
index 8e481e3ac23043..a434efb70d0525 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
@@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
const TargetRegisterClass *SubRC,
unsigned SubIdx) const;
 
-  bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
-unsigned DefSubReg,
-const TargetRegisterClass *SrcRC,
-unsigned SrcSubReg) const override;
-
   /// \returns True if operands defined with this operand type can accept
   /// a literal constant (i.e. any 32-bit immediate).
   bool opCanUseLiteralConstant(unsigned OpType) const;
diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
index c6c0b9cf8f027f..cc2f775ff22bc5 100644
--- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
+++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
@@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa
 define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind {
 ; SI-LABEL: test_copy_v4i8_x4:
 ; SI:   ; %bb.0:
-; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11
-; SI-NEXT:s_mov_b32 s3, 0xf000
-; SI-NEXT:s_mov_b32 s10, 0
-; SI-NEXT:s_mov_b32 s11, s3
+; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11
+; SI-NEXT:s_mov_b32 s11, 0xf000
+; SI-NEXT:s_mov_b32 s2, 0
+; SI-NEXT:s_mov_b32 s3, s11
 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0
 ; SI-NEXT:v_mov_b32_e32 v1, 0
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64
-; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9
-; SI-NEXT:s_mov_b32 s2, -1
-; SI-NEXT:s_mov_b32 s14, s2
-; SI-NEXT:s_mov_b32 s15, s3
-; SI-NEXT:s_mov_b32 s18, s2
+; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64
+; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9
+; SI-NEXT:s_mov_b32 s10, -1
+; SI-NEXT:s_mov_b32 s14, s10
+; SI-NEXT:s_mov_b32 s15, s11
+; SI-NEXT:s_mov_b32 s18, s10
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:s_mov_b32 s0, s4
-; SI-NEXT:s_mov_b32 s1, s5
-; SI-NEXT:s_mov_b32 s19, s3
-; SI-NE

[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/125535

This was ultimately working around bugs in subregister handling
in peephole-opt. In the common case, it would give up on folding
anything into a subregister extract copy.

>From e5479afa758aadd545028780e8a5ab3bd119e028 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Thu, 23 Jan 2025 14:39:10 +0700
Subject: [PATCH] AMDGPU: Use default shouldRewriteCopySrc

This was ultimately working around bugs in subregister handling
in peephole-opt. In the common case, it would give up on folding
anything into a subregister extract copy.
---
 llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp |  24 -
 llvm/lib/Target/AMDGPU/SIRegisterInfo.h   |   5 -
 llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll |  44 +-
 llvm/test/CodeGen/AMDGPU/ctpop64.ll   |  36 +-
 llvm/test/CodeGen/AMDGPU/idot2.ll | 182 +++
 llvm/test/CodeGen/AMDGPU/load-global-i32.ll   |  85 ++--
 .../peephole-opt-fold-reg-sequence-subreg.mir |   8 +-
 .../AMDGPU/peephole-opt-regseq-removal.mir|   4 +-
 .../CodeGen/AMDGPU/spill-scavenge-offset.ll   | 476 +-
 9 files changed, 418 insertions(+), 446 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 6fc57dec6a8264..71c720ed09b5fb 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const {
  OpType <= AMDGPU::OPERAND_SRC_LAST;
 }
 
-bool SIRegisterInfo::shouldRewriteCopySrc(
-  const TargetRegisterClass *DefRC,
-  unsigned DefSubReg,
-  const TargetRegisterClass *SrcRC,
-  unsigned SrcSubReg) const {
-  // We want to prefer the smallest register class possible, so we don't want to
-  // stop and rewrite on anything that looks like a subregister
-  // extract. Operations mostly don't care about the super register class, so we
-  // only want to stop on the most basic of copies between the same register
-  // class.
-  //
-  // e.g. if we have something like
-  // %0 = ...
-  // %1 = ...
-  // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2
-  // %3 = COPY %2, sub0
-  //
-  // We want to look through the COPY to find:
-  //  => %3 = COPY %0
-
-  // Plain copy.
-  return getCommonSubClass(DefRC, SrcRC) != nullptr;
-}
-
 bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const {
   // TODO: 64-bit operands have extending behavior from 32-bit literal.
   return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST &&
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
index 8e481e3ac23043..a434efb70d0525 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
@@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
const TargetRegisterClass *SubRC,
unsigned SubIdx) const;
 
-  bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
-unsigned DefSubReg,
-const TargetRegisterClass *SrcRC,
-unsigned SrcSubReg) const override;
-
   /// \returns True if operands defined with this operand type can accept
   /// a literal constant (i.e. any 32-bit immediate).
   bool opCanUseLiteralConstant(unsigned OpType) const;
diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
index c6c0b9cf8f027f..cc2f775ff22bc5 100644
--- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
+++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
@@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa
 define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind {
 ; SI-LABEL: test_copy_v4i8_x4:
 ; SI:   ; %bb.0:
-; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11
-; SI-NEXT:s_mov_b32 s3, 0xf000
-; SI-NEXT:s_mov_b32 s10, 0
-; SI-NEXT:s_mov_b32 s11, s3
+; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11
+; SI-NEXT:s_mov_b32 s11, 0xf000
+; SI-NEXT:s_mov_b32 s2, 0
+; SI-NEXT:s_mov_b32 s3, s11
 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0
 ; SI-NEXT:v_mov_b32_e32 v1, 0
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64
-; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9
-; SI-NEXT:s_mov_b32 s2, -1
-; SI-NEXT:s_mov_b32 s14, s2
-; SI-NEXT:s_mov_b32 s15, s3
-; SI-NEXT:s_mov_b32 s18, s2
+; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64
+; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9
+; SI-NEXT:s_mov_b32 s10, -1
+; SI-NEXT:s_mov_b32 s14, s10
+; SI-NEXT:s_mov_b32 s15, s11
+; SI-NEXT:s_mov_b32 s18, s10
 ; SI-NEXT:s_waitcnt lgkmcnt(0)
-; SI-NEXT:s_mo

[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/125535?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125535** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125535?utm_source=stack-comment-view-in-graphite)
* **#125533**
* **#125224**
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking:
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/125535
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/125533
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-regalloc

Author: Matt Arsenault (arsenm)


Changes

This fixes the handling of subregister extract copies. This
will allow AMDGPU to remove its implementation of
shouldRewriteCopySrc, which exists as a 10-year-old workaround
to this bug. peephole-opt-fold-reg-sequence-subreg.mir will
show the expected improvement once the custom implementation
is removed.

The copy coalescing processing here is overly abstracted
from what's actually happening. Previously when visiting
coalescable copy-like instructions, we would parse the
sources one at a time and then pass the def of the root
instruction into findNextSource. This means that the
first thing the new ValueTracker constructed would do
is getVRegDef to find the instruction we are currently
processing. This added an unnecessary step, placed
a useless entry in the RewriteMap, and required skipping
the no-op case where getNewSource would return the original
source operand. This was a problem since in the case
of a subregister extract, shouldRewriteCopySrc would always
say that it is useful to rewrite and the use-def chain walk
would abort, returning the original operand. Move the process
to start looking at the source operand to begin with.

This does not fix the confused handling in the uncoalescable
copy case which is proving to be more difficult. Some currently
handled cases have multiple defs from a single source, and other
handled cases have 0 input operands. It would be simpler if
this was implemented with isCopyLikeInstr, rather than guessing
at the operand structure as it does now.

There are some improvements and some regressions. The
regressions appear to be downstream issues for the most part. One
of the uglier regressions is in PPC, where a sequence of insert_subregs
is used to build registers. I opened #125502 to use reg_sequence instead,
which may help.

The worst regression is an absurd SPARC testcase using a <251 x fp128>,
which uses a very long chain of insert_subregs.

We need improved subregister handling locally in PeepholeOptimizer,
and other passes like MachineCSE to fix some of the other regressions.
We should handle subregister composes and folding more indexes
into insert_subreg and reg_sequence.
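
As a rough illustration of the subregister-extract case described above, the
following is a toy sketch only. It does not use LLVM's MachineInstr or
ValueTracker APIs; RegSubReg, RegSequence, makeExampleDefs and
lookThroughRegSequence are hypothetical names invented for this example. It
assumes the simplest situation, where the copy source is defined by a single
REG_SEQUENCE, and starts the walk at the copy's source operand rather than at
the def of the copy itself.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model only; these types are hypothetical and are NOT LLVM's MIR classes.
// A (virtual register, subregister index) pair. An empty SubReg means the
// full register.
struct RegSubReg {
  unsigned Reg = 0;
  std::string SubReg;
};

// Toy REG_SEQUENCE: Def = REG_SEQUENCE Src0, Idx0, Src1, Idx1, ...
struct RegSequence {
  unsigned Def = 0;
  // Each input is (source operand, subregister index it is inserted at).
  std::vector<std::pair<RegSubReg, std::string>> Inputs;
};

// Start from the source operand of a copy and look through the REG_SEQUENCE
// that defines it. Returns the input inserted at the extracted subregister
// index, or std::nullopt if this toy model cannot fold anything.
std::optional<RegSubReg>
lookThroughRegSequence(const std::map<unsigned, RegSequence> &Defs,
                       const RegSubReg &CopySrc) {
  if (CopySrc.SubReg.empty())
    return std::nullopt; // Not a subregister extract.
  auto It = Defs.find(CopySrc.Reg);
  if (It == Defs.end())
    return std::nullopt; // Source is not defined by a known REG_SEQUENCE.
  for (const auto &[Src, Idx] : It->second.Inputs)
    if (Idx == CopySrc.SubReg)
      return Src; // The extracted lane came directly from this input.
  return std::nullopt;
}

// Builds the defs for: %2 = REG_SEQUENCE %0, sub0, %1, sub1
std::map<unsigned, RegSequence> makeExampleDefs() {
  RegSequence Seq;
  Seq.Def = 2;
  Seq.Inputs.emplace_back(RegSubReg{0, ""}, "sub0");
  Seq.Inputs.emplace_back(RegSubReg{1, ""}, "sub1");
  std::map<unsigned, RegSequence> Defs;
  Defs[Seq.Def] = Seq;
  return Defs;
}
```

With these toy definitions, a copy whose source is %2.sub0 resolves to %0,
which mirrors the "%3 = COPY %2, sub0  =>  %3 = COPY %0" rewrite mentioned in
the comment removed from SIRegisterInfo.cpp. Subregister composition and
multi-def sources are deliberately left out.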

---

Patch is 475.60 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/125533.diff


105 Files Affected:

- (modified) llvm/lib/CodeGen/PeepholeOptimizer.cpp (+28-12) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc3.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8a.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-lse2.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc3.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8a.ll 
(+30-30) 
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll (-4) 
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll (+34-40) 
- (modified) llvm/test/CodeGen/AArch64/addsub_ext.ll (+4-22) 
- (modified) llvm/test/CodeGen/AArch64/and-mask-removal.ll (-1) 
- (modified) llvm/test/CodeGen/AArch64/arm64-ldxr-stxr.ll (-4) 
- (modified) llvm/test/CodeGen/AArch64/arm64-vaddv.ll (-1) 
- (modified) llvm/test/CodeGen/AArch64/arm64_32-addrs.ll (-1) 
- (modified) llvm/test/CodeGen/AArch64/atomic-ops-msvc.ll (+4-7) 
- (modified) llvm/test/CodeGen/AArch64/atomic-ops.ll (-4) 
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll (+10-12) 
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll (+10-12) 
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll (+10-12) 
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll (+10-12) 
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll (+5-5) 
- (modified) llvm/test/CodeGen/AArch64/cmp-to-cmn.ll (-8) 
- (modified) llvm/test/CodeGen/AArch64/cmpxchg-idioms.ll (-1) 
- (modified) llvm/test/CodeGen/AArch64/extract-bits.ll (-6) 
- (modified) llvm/test/CodeGen/AArch64/fold-int-pow2-with-fmul-or-fdiv.ll (-2) 
- (modified) llvm/test/CodeGen/AArch64/fsh.ll (-2) 
- (modified) llvm/test/CodeGen/AArch64/funnel-shift.ll (-4) 
- (modified) 
llvm/test/CodeGen/AArch64/hoist-and-by-const-from-lshr-in-eqcmp-zero.ll (-8) 
- (modified) 
llvm/test/CodeGen/AArch64/hoist-and-by-const-from-shl-in-eqcmp-zero.ll (+5-14) 
- (modified) llvm/test/CodeGen/AArch64/logic-shift.ll (-9) 
- (modified) llvm/test/CodeGen/AArch64/neon-insextbitcast.ll (-2) 
- (modified) llvm/test/CodeGen/AArch64/shift-by-signext.ll (-2) 
- (modified) llvm/test/CodeGen/AArch64/shift.ll (-6) 
- (modified) llvm/test/CodeGen/AArch64/sink-and-fold.ll (-1) 
- (modified) llvm/test/CodeGen/AArch64/sve-fixed-lengt

[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/125533?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125533** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125533?utm_source=stack-comment-view-in-graphite)
* **#125224**
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking:
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/125533
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)

2025-02-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/125535
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/125515
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)

2025-02-03 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/125515

Backport cb2598dda1aae5096a77bc8a9f6679ca1b350e5e

Requested by: @brad0

>From acd0c6c1774c2c2ba97c714709041f6561370447 Mon Sep 17 00:00:00 2001
From: Brad Smith 
Date: Mon, 3 Feb 2025 10:03:59 -0500
Subject: [PATCH] [flang][runtime] Make sure to link libexecinfo if it exists
 (#125344)

Fixes building the backtrace support on FreeBSD/NetBSD/OpenBSD/DragonFly and
musl libc with libexecinfo.

(cherry picked from commit cb2598dda1aae5096a77bc8a9f6679ca1b350e5e)
---
 flang/runtime/CMakeLists.txt | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/flang/runtime/CMakeLists.txt b/flang/runtime/CMakeLists.txt
index fbfaae9a880648..bf27a121e4d174 100644
--- a/flang/runtime/CMakeLists.txt
+++ b/flang/runtime/CMakeLists.txt
@@ -59,10 +59,15 @@ if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
 )
 endif()
 
+set(linked_libraries FortranDecimal)
+
 # function checks
 find_package(Backtrace)
 set(HAVE_BACKTRACE ${Backtrace_FOUND})
 set(BACKTRACE_HEADER ${Backtrace_HEADER})
+if(HAVE_BACKTRACE)
+  list(APPEND linked_libraries ${Backtrace_LIBRARY})
+endif()
 
 include(CheckCXXSymbolExists)
 include(CheckCXXSourceCompiles)
@@ -271,7 +276,7 @@ if (NOT DEFINED MSVC)
   add_flang_library(FortranRuntime
 ${sources}
 LINK_LIBS
-FortranDecimal
+${linked_libraries}
 
 INSTALL_WITH_TOOLCHAIN
   )
@@ -279,7 +284,7 @@ else()
   add_flang_library(FortranRuntime
 ${sources}
 LINK_LIBS
-FortranDecimal
+${linked_libraries}
   )
   set(CMAKE_MSVC_RUNTIME_LIBRARY MultiThreaded)
   add_flang_library(FortranRuntime.static ${sources}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-runtime

Author: None (llvmbot)


Changes

Backport cb2598dda1aae5096a77bc8a9f6679ca1b350e5e

Requested by: @brad0

---
Full diff: https://github.com/llvm/llvm-project/pull/125515.diff


1 Files Affected:

- (modified) flang/runtime/CMakeLists.txt (+7-2) 


```diff
diff --git a/flang/runtime/CMakeLists.txt b/flang/runtime/CMakeLists.txt
index fbfaae9a8806486..bf27a121e4d174c 100644
--- a/flang/runtime/CMakeLists.txt
+++ b/flang/runtime/CMakeLists.txt
@@ -59,10 +59,15 @@ if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
 )
 endif()
 
+set(linked_libraries FortranDecimal)
+
 # function checks
 find_package(Backtrace)
 set(HAVE_BACKTRACE ${Backtrace_FOUND})
 set(BACKTRACE_HEADER ${Backtrace_HEADER})
+if(HAVE_BACKTRACE)
+  list(APPEND linked_libraries ${Backtrace_LIBRARY})
+endif()
 
 include(CheckCXXSymbolExists)
 include(CheckCXXSourceCompiles)
@@ -271,7 +276,7 @@ if (NOT DEFINED MSVC)
   add_flang_library(FortranRuntime
 ${sources}
 LINK_LIBS
-FortranDecimal
+${linked_libraries}
 
 INSTALL_WITH_TOOLCHAIN
   )
@@ -279,7 +284,7 @@ else()
   add_flang_library(FortranRuntime
 ${sources}
 LINK_LIBS
-FortranDecimal
+${linked_libraries}
   )
   set(CMAKE_MSVC_RUNTIME_LIBRARY MultiThreaded)
   add_flang_library(FortranRuntime.static ${sources}

```




https://github.com/llvm/llvm-project/pull/125515
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:

@tblah What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/125515


[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)

2025-02-03 Thread Jon Roelofs via llvm-branch-commits

https://github.com/jroelofs approved this pull request.


https://github.com/llvm/llvm-project/pull/125525


[llvm-branch-commits] [clang] [MC/DC] Enable nested expressions (PR #125413)

2025-02-03 Thread NAKAMURA Takumi via llvm-branch-commits

https://github.com/chapuni updated 
https://github.com/llvm/llvm-project/pull/125413

>From c56ecc30e9fd1a674073e362fbfcc6b43f2f52e2 Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Sun, 2 Feb 2025 22:06:32 +0900
Subject: [PATCH 1/2] [MC/DC] Enable nested expressions

A warning "contains an operation with a nested boolean expression." is
no longer emitted. At the moment, split expressions are treated as
individual Decisions.
---
 clang/lib/CodeGen/CodeGenPGO.cpp  | 150 ++
 .../test/CoverageMapping/mcdc-nested-expr.cpp |  30 +++-
 .../Frontend/custom-diag-werror-interaction.c |   4 +-
 3 files changed, 109 insertions(+), 75 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp
index 7d16f673ada419e..0c3973aa4dccfdc 100644
--- a/clang/lib/CodeGen/CodeGenPGO.cpp
+++ b/clang/lib/CodeGen/CodeGenPGO.cpp
@@ -228,10 +228,17 @@ struct MapRegionCounters : public 
RecursiveASTVisitor {
   /// The stacks are also used to find error cases and notify the user.  A
   /// standard logical operator nest for a boolean expression could be in a 
form
   /// similar to this: "x = a && b && c && (d || f)"
-  unsigned NumCond = 0;
-  bool SplitNestedLogicalOp = false;
-  SmallVector NonLogOpStack;
-  SmallVector LogOpStack;
+  struct DecisionState {
+llvm::DenseSet Leaves; // Not BinOp
+const Expr *DecisionExpr;// Root
+bool Split;
+
+DecisionState() = delete;
+DecisionState(const Expr *E, bool Split = false)
+: DecisionExpr(E), Split(Split) {}
+  };
+
+  SmallVector DecisionStack;
 
   // Hook: dataTraverseStmtPre() is invoked prior to visiting an AST Stmt node.
   bool dataTraverseStmtPre(Stmt *S) {
@@ -239,34 +246,28 @@ struct MapRegionCounters : public 
RecursiveASTVisitor {
 if (MCDCMaxCond == 0)
   return true;
 
-/// At the top of the logical operator nest, reset the number of 
conditions,
-/// also forget previously seen split nesting cases.
-if (LogOpStack.empty()) {
-  NumCond = 0;
-  SplitNestedLogicalOp = false;
-}
-
-if (const Expr *E = dyn_cast(S)) {
-  const BinaryOperator *BinOp = 
dyn_cast(E->IgnoreParens());
-  if (BinOp && BinOp->isLogicalOp()) {
-/// Check for "split-nested" logical operators. This happens when a new
-/// boolean expression logical-op nest is encountered within an 
existing
-/// boolean expression, separated by a non-logical operator.  For
-/// example, in "x = (a && b && c && foo(d && f))", the "d && f" case
-/// starts a new boolean expression that is separated from the other
-/// conditions by the operator foo(). Split-nested cases are not
-/// supported by MC/DC.
-SplitNestedLogicalOp = SplitNestedLogicalOp || !NonLogOpStack.empty();
-
-LogOpStack.push_back(BinOp);
+/// Mark "in splitting" when a leaf is met.
+if (!DecisionStack.empty()) {
+  auto &StackTop = DecisionStack.back();
+  if (!StackTop.Split) {
+if (StackTop.Leaves.contains(S)) {
+  assert(!StackTop.Split);
+  StackTop.Split = true;
+}
 return true;
   }
+
+  // Split
+  assert(StackTop.Split);
+  assert(!StackTop.Leaves.contains(S));
 }
 
-/// Keep track of non-logical operators. These are OK as long as we don't
-/// encounter a new logical operator after seeing one.
-if (!LogOpStack.empty())
-  NonLogOpStack.push_back(S);
+if (const auto *E = dyn_cast(S)) {
+  if (const auto *BinOp =
+  dyn_cast(CodeGenFunction::stripCond(E));
+  BinOp && BinOp->isLogicalOp())
+DecisionStack.emplace_back(E);
+}
 
 return true;
   }
@@ -275,49 +276,57 @@ struct MapRegionCounters : public 
RecursiveASTVisitor {
   // an AST Stmt node.  MC/DC will use it to to signal when the top of a
   // logical operation (boolean expression) nest is encountered.
   bool dataTraverseStmtPost(Stmt *S) {
-/// If MC/DC is not enabled, MCDCMaxCond will be set to 0. Do nothing.
-if (MCDCMaxCond == 0)
+if (DecisionStack.empty())
   return true;
 
-if (const Expr *E = dyn_cast(S)) {
-  const BinaryOperator *BinOp = 
dyn_cast(E->IgnoreParens());
-  if (BinOp && BinOp->isLogicalOp()) {
-assert(LogOpStack.back() == BinOp);
-LogOpStack.pop_back();
-
-/// At the top of logical operator nest:
-if (LogOpStack.empty()) {
-  /// Was the "split-nested" logical operator case encountered?
-  if (SplitNestedLogicalOp) {
-unsigned DiagID = Diag.getCustomDiagID(
-DiagnosticsEngine::Warning,
-"unsupported MC/DC boolean expression; "
-"contains an operation with a nested boolean expression. "
-"Expression will not be covered");
-Diag.Report(S->getBeginLoc(), DiagID);
-return true;
-  }
-
-  /// Was the maximum number of conditions en

[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)

2025-02-03 Thread NAKAMURA Takumi via llvm-branch-commits

https://github.com/chapuni created 
https://github.com/llvm/llvm-project/pull/125484

`-fmcdc-single-conditions` is `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on 
`DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds them to 
Decisions.

An example of the report:

```
MC/DC Decision Region (mmm:nn) to (mmm:nn)

  Number of Conditions: 1
 Condition C1 -->(mmm:nn)

  Executed MC/DC Test Vectors:

 C1Result
  1 { F  = F  }
  2 { T  = T  }

  C1-Pair: covered: (1,2)
  MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.

Fixes #95336
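
The coverage rule stated above can be sketched with a toy model. This is not Clang's implementation; `SingleConditionDecision`, `record`, and `covered` are hypothetical names used only to illustrate that a one-condition Decision counts as covered once both outcomes have been observed.

```cpp
#include <set>

// Hypothetical bookkeeping for a Decision with exactly one condition (C1).
// Each executed test vector is just the observed value of C1.
struct SingleConditionDecision {
  std::set<bool> seen; // distinct outcomes of C1 observed so far

  // Record one execution of the condition.
  void record(bool c1) { seen.insert(c1); }

  // Covered only when both `false` and `true` have been observed,
  // which corresponds to the C1-Pair (1,2) in the report above.
  bool covered() const { return seen.size() == 2; }
};
```

For example, a loop such as `while (n)` that is only ever entered with a non-zero `n` would record `true` alone, so under this model it stays uncovered until a run also evaluates the condition to `false`.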

>From af336315f37021ccc6d21059ecfe28a0f30248ff Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Mon, 3 Feb 2025 20:35:06 +0900
Subject: [PATCH] [MC/DC] Introduce `-fmcdc-single-conditions` to include also
 single conditions

`-fmcdc-single-conditions` is `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on
`DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds
them to Decisions.

An example of the report:

```
MC/DC Decision Region (mmm:nn) to (mmm:nn)

  Number of Conditions: 1
 Condition C1 -->(mmm:nn)

  Executed MC/DC Test Vectors:

 C1Result
  1 { F  = F  }
  2 { T  = T  }

  C1-Pair: covered: (1,2)
  MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.

Fixes #95336
---
 clang/docs/ReleaseNotes.rst   |  3 +
 clang/docs/SourceBasedCodeCoverage.rst|  4 +
 clang/include/clang/Basic/CodeGenOptions.def  |  1 +
 clang/include/clang/Driver/Options.td |  4 +
 clang/lib/CodeGen/CGExpr.cpp  | 32 +--
 clang/lib/CodeGen/CodeGenFunction.h   |  4 +-
 clang/lib/CodeGen/CodeGenPGO.cpp  | 38 -
 clang/lib/CodeGen/CoverageMappingGen.cpp  | 46 +++---
 .../test/CoverageMapping/mcdc-single-cond.cpp | 85 ++-
 9 files changed, 190 insertions(+), 27 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 42054fe27c5ee1c..4138fc2f11e0c17 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -118,6 +118,9 @@ Improvements to Coverage Mapping
 
 - [MC/DC] Nested expressions are handled as individual MC/DC expressions.
 
+- [MC/DC] Non-boolean expressions on conditions can be included with
+  `-fmcdc-single-conditions`. (#GH95336)
+
 Bug Fixes in This Version
 -
 
diff --git a/clang/docs/SourceBasedCodeCoverage.rst 
b/clang/docs/SourceBasedCodeCoverage.rst
index d26babe829ab5be..bcd4ae0e9748d15 100644
--- a/clang/docs/SourceBasedCodeCoverage.rst
+++ b/clang/docs/SourceBasedCodeCoverage.rst
@@ -510,6 +510,10 @@ requires 8 test vectors.
 Expressions such as ``((a0 && b0) || (a1 && b1) || ...)`` can cause the
 number of test vectors to increase exponentially.
 
+Clang handles only binary logical operators as MC/DC coverage. Single
+conditions without logcal operators on `do/for/while/if/?!` can be
+included with `-Xclang -fmcdc-single-conditions`.
+
 Switch statements
 -
 
diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 259972bdf8f0013..1a9ebae845619b7 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -236,6 +236,7 @@ CODEGENOPT(DumpCoverageMapping , 1, 0) ///< Dump the 
generated coverage mapping
 CODEGENOPT(MCDCCoverage , 1, 0) ///< Enable MC/DC code coverage criteria.
 VALUE_CODEGENOPT(MCDCMaxConds, 16, 32767) ///< MC/DC Maximum conditions.
 VALUE_CODEGENOPT(MCDCMaxTVs, 32, 0x7FFE) ///< MC/DC Maximum test vectors.
+VALUE_CODEGENOPT(MCDCSingleCond, 1, 0) ///< Enable MC/DC single conditions.
 
   /// If -fpcc-struct-return or -freg-struct-return is specified.
 ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, 
SRCK_Default)
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 6eabd9f76a792db..57b826bce6da821 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1742,6 +1742,10 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], 
"fmcdc-max-test-vectors=">,
   Group, Visibility<[CC1Option]>,
   HelpText<"Maximum number of test vectors in MC/DC coverage">,
   MarshallingInfoInt, "0x7FFE">;
+def fmcdc_single_conditions : Flag<["-"], "fmcdc-single-conditions">,
+  Group, Visibility<[CC1Option]>,
+  HelpText<"Include also single conditions as MC/DC coverage">,
+  MarshallingInfoFlag>;
 def fprofile_generate : Flag<["-"], "fprofile-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 9676e61cf322d92..8

[llvm-branch-commits] [mlir] WIP: [mlir][OpenMP] Pack task private variables into a heap-allocated context struct (PR #125307)

2025-02-03 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah updated 
https://github.com/llvm/llvm-project/pull/125307

>From afa9026eefb6c9cd613ed021a92e159f93c3667c Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Fri, 24 Jan 2025 17:32:41 +
Subject: [PATCH 1/2] [mlir][OpenMP] Pack task private variables into a
 heap-allocated context struct

See RFC:
https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084

The aim here is to ensure that tasks which are not executed for a while
after they are created do not try to reference any data which are now
out of scope. This is done by packing the data referred to by the task
into a heap allocated structure (freed at the end of the task).

I decided to create the task context structure in
OpenMPToLLVMIRTranslation instead of adapting how it is done in
CodeExtractor (via OpenMPIRBuilder) because CodeExtractor is (at least
in theory) generic code which could have other unrelated uses.
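
The lifetime problem can be illustrated in plain C++. This is only an analogy under stated assumptions, not the generated LLVM IR: `TaskContext`, `createDeferredTask`, and `makeTask` are hypothetical names, and the deferred task is modelled as a `std::function` that must be run exactly once.

```cpp
#include <functional>
#include <memory>

// Hypothetical context struct holding the task's (first)private data.
struct TaskContext {
  int firstprivateX;
};

// Copy the private value into a heap-allocated context at task creation.
// The returned task owns that context and frees it when it finishes.
// Assumption: the task is executed exactly once.
std::function<int()> createDeferredTask(int x) {
  TaskContext *ctx = new TaskContext{x};
  return [ctx]() {
    std::unique_ptr<TaskContext> owner(ctx); // freed at the end of the task
    return owner->firstprivateX * 2;
  };
}

// The creating scope ends before the task runs; the task still sees a
// valid copy because it reads from the heap context, not from `local`.
std::function<int()> makeTask() {
  int local = 21;
  return createDeferredTask(local);
}
```

Calling `makeTask()()` reads the value from the heap context even though `local` has already gone out of scope, which is the situation the packing is meant to make safe.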
---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 204 +++---
 mlir/test/Target/LLVMIR/openmp-llvm.mlir  |   5 +-
 .../LLVMIR/openmp-task-privatization.mlir |  82 +++
 3 files changed, 254 insertions(+), 37 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-task-privatization.mlir

diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 8a9a69cefad8ee1..5c4deab492c8390 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -13,6 +13,7 @@
 #include "mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h"
 #include "mlir/Analysis/TopologicalSortUtils.h"
 #include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/LLVMIR/LLVMTypes.h"
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Dialect/OpenMP/OpenMPInterfaces.h"
 #include "mlir/IR/IRMapping.h"
@@ -24,10 +25,12 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SetVector.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/TypeSwitch.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Frontend/OpenMP/OMPIRBuilder.h"
 #include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/ReplaceConstant.h"
 #include "llvm/Support/FileSystem.h"
@@ -1331,19 +1334,16 @@ findAssociatedValue(Value privateVar, 
llvm::IRBuilderBase &builder,
 
 /// Initialize a single (first)private variable. You probably want to use
 /// allocateAndInitPrivateVars instead of this.
-static llvm::Error
-initPrivateVar(llvm::IRBuilderBase &builder,
-   LLVM::ModuleTranslation &moduleTranslation,
-   omp::PrivateClauseOp &privDecl, Value mlirPrivVar,
-   BlockArgument &blockArg, llvm::Value *llvmPrivateVar,
-   llvm::SmallVectorImpl &llvmPrivateVars,
-   llvm::BasicBlock *privInitBlock,
-   llvm::DenseMap *mappedPrivateVars = nullptr) {
+/// This returns the private variable which has been initialized. This
+/// variable should be mapped before constructing the body of the Op.
+static llvm::Expected initPrivateVar(
+llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation,
+omp::PrivateClauseOp &privDecl, Value mlirPrivVar, BlockArgument &blockArg,
+llvm::Value *llvmPrivateVar, llvm::BasicBlock *privInitBlock,
+llvm::DenseMap *mappedPrivateVars = nullptr) {
   Region &initRegion = privDecl.getInitRegion();
   if (initRegion.empty()) {
-moduleTranslation.mapValue(blockArg, llvmPrivateVar);
-llvmPrivateVars.push_back(llvmPrivateVar);
-return llvm::Error::success();
+return llvmPrivateVar;
   }
 
   // map initialization region block arguments
@@ -1363,17 +1363,15 @@ initPrivateVar(llvm::IRBuilderBase &builder,
 
   assert(phis.size() == 1 && "expected one allocation to be yielded");
 
-  // prefer the value yielded from the init region to the allocated private
-  // variable in case the region is operating on arguments by-value (e.g.
-  // Fortran character boxes).
-  moduleTranslation.mapValue(blockArg, phis[0]);
-  llvmPrivateVars.push_back(phis[0]);
-
   // clear init region block argument mapping in case it needs to be
   // re-created with a different source for another use of the same
   // reduction decl
   moduleTranslation.forgetMapping(initRegion);
-  return llvm::Error::success();
+
+  // Prefer the value yielded from the init region to the allocated private
+  // variable in case the region is operating on arguments by-value (e.g.
+  // Fortran character boxes).
+  return phis[0];
 }
 
 /// Allocate and initialize delayed private variables. Returns the basic block
@@ -1415,11 +1413,13 @@ static llvm::Expected 
allocateAndInitPrivateVars(
 llvm::Value *llvmPrivateVar = builder.CreateAlloca(
 llvmAllocType, /*ArraySize=*/nullptr, "omp.private.alloc");
 
-

[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)

2025-02-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: NAKAMURA Takumi (chapuni)


Changes

`-fmcdc-single-conditions` is `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on 
`DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds them to 
Decisions.

An example of the report:

```
MC/DC Decision Region (mmm:nn) to (mmm:nn)

  Number of Conditions: 1
 Condition C1 -->(mmm:nn)

  Executed MC/DC Test Vectors:

 C1Result
  1 { F  = F  }
  2 { T  = T  }

  C1-Pair: covered: (1,2)
  MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.

Fixes #95336

---

Patch is 24.76 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/125484.diff


9 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+3) 
- (modified) clang/docs/SourceBasedCodeCoverage.rst (+4) 
- (modified) clang/include/clang/Basic/CodeGenOptions.def (+1) 
- (modified) clang/include/clang/Driver/Options.td (+4) 
- (modified) clang/lib/CodeGen/CGExpr.cpp (+24-8) 
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+2-2) 
- (modified) clang/lib/CodeGen/CodeGenPGO.cpp (+34-4) 
- (modified) clang/lib/CodeGen/CoverageMappingGen.cpp (+36-10) 
- (modified) clang/test/CoverageMapping/mcdc-single-cond.cpp (+82-3) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 42054fe27c5ee1..4138fc2f11e0c1 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -118,6 +118,9 @@ Improvements to Coverage Mapping
 
 - [MC/DC] Nested expressions are handled as individual MC/DC expressions.
 
+- [MC/DC] Non-boolean expressions on conditions can be included with
+  `-fmcdc-single-conditions`. (#GH95336)
+
 Bug Fixes in This Version
 -
 
diff --git a/clang/docs/SourceBasedCodeCoverage.rst 
b/clang/docs/SourceBasedCodeCoverage.rst
index d26babe829ab5b..bcd4ae0e9748d1 100644
--- a/clang/docs/SourceBasedCodeCoverage.rst
+++ b/clang/docs/SourceBasedCodeCoverage.rst
@@ -510,6 +510,10 @@ requires 8 test vectors.
 Expressions such as ``((a0 && b0) || (a1 && b1) || ...)`` can cause the
 number of test vectors to increase exponentially.
 
+Clang handles only binary logical operators as MC/DC coverage. Single
+conditions without logcal operators on `do/for/while/if/?!` can be
+included with `-Xclang -fmcdc-single-conditions`.
+
 Switch statements
 -
 
diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 259972bdf8f001..1a9ebae845619b 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -236,6 +236,7 @@ CODEGENOPT(DumpCoverageMapping , 1, 0) ///< Dump the 
generated coverage mapping
 CODEGENOPT(MCDCCoverage , 1, 0) ///< Enable MC/DC code coverage criteria.
 VALUE_CODEGENOPT(MCDCMaxConds, 16, 32767) ///< MC/DC Maximum conditions.
 VALUE_CODEGENOPT(MCDCMaxTVs, 32, 0x7FFE) ///< MC/DC Maximum test vectors.
+VALUE_CODEGENOPT(MCDCSingleCond, 1, 0) ///< Enable MC/DC single conditions.
 
   /// If -fpcc-struct-return or -freg-struct-return is specified.
 ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, 
SRCK_Default)
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 6eabd9f76a792d..57b826bce6da82 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1742,6 +1742,10 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], 
"fmcdc-max-test-vectors=">,
   Group, Visibility<[CC1Option]>,
   HelpText<"Maximum number of test vectors in MC/DC coverage">,
   MarshallingInfoInt, "0x7FFE">;
+def fmcdc_single_conditions : Flag<["-"], "fmcdc-single-conditions">,
+  Group, Visibility<[CC1Option]>,
+  HelpText<"Include also single conditions as MC/DC coverage">,
+  MarshallingInfoFlag>;
 def fprofile_generate : Flag<["-"], "fprofile-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 9676e61cf322d9..82a31cb3721473 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -196,20 +196,36 @@ RawAddress 
CodeGenFunction::CreateMemTempWithoutCast(QualType Ty,
 /// EvaluateExprAsBool - Perform the usual unary conversions on the specified
 /// expression and compare the result against zero, returning an Int1Ty value.
 llvm::Value *CodeGenFunction::EvaluateExprAsBool(const Expr *E) {
+  auto DecisionExpr = stripCond(E);
+  if (isMCDCDecisionExpr(DecisionExpr) && 
isInstrumentedCondition(DecisionExpr))
+maybeResetMCDCCondBitmap(DecisionExpr);
+  else
+DecisionExpr = nullptr;
+
   PGO.setCurrentStmt(E);
+  llvm::Value *Result;
   if (const MemberPointerType *MPT = E->getType()->

[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)

2025-02-03 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

For some reason I cannot add @rahulana-quic nor @tobiasgrosser as reviewers.

https://github.com/llvm/llvm-project/pull/125442


[llvm-branch-commits] [llvm] [SPARC][IAS] Add support for `setsw` pseudoinstruction (PR #125150)

2025-02-03 Thread via llvm-branch-commits

https://github.com/koachan updated 
https://github.com/llvm/llvm-project/pull/125150

>From 259439304b31a8557db456d276a84849c7a37067 Mon Sep 17 00:00:00 2001
From: Koakuma 
Date: Mon, 3 Feb 2025 23:12:07 +0700
Subject: [PATCH] Incorporate feedback

Created using spr 1.3.4
---
 llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp 
b/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
index 879f2ed8849618..3e9fc31d7bfc22 100644
--- a/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
+++ b/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
@@ -744,7 +744,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
   assert(MCRegOp.isReg());
   assert(MCValOp.isImm() || MCValOp.isExpr());
 
-  // the imm operand can be either an expression or an immediate.
+  // The imm operand can be either an expression or an immediate.
   bool IsImm = Inst.getOperand(1).isImm();
   int64_t ImmValue = IsImm ? MCValOp.getImm() : 0;
   const MCExpr *ValExpr = IsImm ? MCConstantExpr::create(ImmValue, 
getContext())
@@ -777,7 +777,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
 IsSmallImm ? ValExpr
: adjustPICRelocation(SparcMCExpr::VK_Sparc_LO, ValExpr);
 
-// orrd, %lo(val), rd
+// or rd, %lo(val), rd
 Instructions.push_back(MCInstBuilder(SP::ORri)
.addReg(MCRegOp.getReg())
.addReg(PrevReg.getReg())
@@ -790,7 +790,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
 
   // Large negative or non-immediate expressions would need an sra.
   if (!IsImm || ImmValue < 0) {
-// srard, %g0, rd
+// sra rd, %g0, rd
 Instructions.push_back(MCInstBuilder(SP::SRArr)
.addReg(MCRegOp.getReg())
.addReg(MCRegOp.getReg())
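
The effect of the instruction sequence named in the corrected comments can be modelled in C++. This is a simplified sketch, not the assembler's code: it assumes SPARC V9 64-bit registers, always uses the `sethi`/`or` pair (ignoring the small-immediate shortcut in the patch), and the helper names `sethi`, `orLo`, `sra0`, and `setswModel` are hypothetical.

```cpp
#include <cstdint>

// sethi %hi(val), rd: sets bits 31..10 and clears the rest of the register.
inline uint64_t sethi(uint32_t val) { return val & 0xFFFFFC00u; }

// or rd, %lo(val), rd: merges in the low 10 bits.
inline uint64_t orLo(uint64_t rd, uint32_t val) { return rd | (val & 0x3FFu); }

// sra rd, %g0, rd: a 32-bit arithmetic shift by zero, which sign-extends
// bit 31 into the upper half of the 64-bit register.
inline uint64_t sra0(uint64_t rd) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(rd))));
}

// Model of loading a signed 32-bit immediate into a 64-bit register.
inline int64_t setswModel(int32_t val) {
  uint32_t bits = static_cast<uint32_t>(val);
  uint64_t rd = orLo(sethi(bits), bits);
  if (val < 0) // only negative values need the sign extension
    rd = sra0(rd);
  return static_cast<int64_t>(rd);
}
```

Without the final `sra`, a negative value would be left zero-extended in the register, which is why the patch emits it for negative immediates and for expressions whose value is not known at assembly time.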



[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)

2025-02-03 Thread Justin Bogner via llvm-branch-commits


@@ -0,0 +1,157 @@
+//===- DXILRootSignature.cpp - DXIL Root Signature helper objects ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file This file contains helper objects and APIs for working with DXIL
+///   Root Signatures.
+///
+//===--===//
+#include "DXILRootSignature.h"
+#include "DirectX.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Module.h"
+#include 
+
+using namespace llvm;
+using namespace llvm::dxil;
+
+static bool reportError(Twine Message) {
+  report_fatal_error(Message, false);

bogner wrote:

I haven't had time to review this in detail yet, but one important note. We 
should not be using `report_fatal_error` for error handling here. This is 
essentially crashing the compiler and should be used *very* sparingly. If the 
errors can come from user input or from corrupt binary files, this type of 
error reporting will be a terrible user experience.

I suspect it would be more appropriate to use  `LLVMContext::diagnose` and the 
DiagnosticInfo machinery so that we can report issues back to the frontend.
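
The difference in approach can be sketched with stand-in types. This deliberately does not use the real LLVM classes; `Diagnostic`, `DiagnosticHandler`, and `parseFlags` are hypothetical, and the range check is an invented example of validating user-controlled input.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for a diagnostic object handed back to the caller.
struct Diagnostic {
  std::string message;
};

// Stand-in for the callback through which the frontend receives diagnostics.
using DiagnosticHandler = std::function<void(const Diagnostic &)>;

// Validate an untrusted value. On bad input, report through the handler and
// return true (error) instead of aborting the whole process.
bool parseFlags(int64_t rawValue, uint32_t &flagsOut,
                const DiagnosticHandler &report) {
  if (rawValue < 0 || rawValue > 0xFFFFFFFFLL) {
    report(Diagnostic{"invalid flags value: " + std::to_string(rawValue)});
    return true;
  }
  flagsOut = static_cast<uint32_t>(rawValue);
  return false;
}
```

The caller decides what to do with the message (print it, collect it, or stop compilation cleanly), so malformed input produces a normal error rather than a crash.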

https://github.com/llvm/llvm-project/pull/123147

