[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/125585

Backport d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87

Requested by: @tstellar

From adf607aa5622a6e3a83a4016bc87f2c8321c47c7 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Mon, 3 Feb 2025 13:13:11 -0800
Subject: [PATCH] workflows/release-tasks: Re-use release-binaries-all workflow (#125378)

This way we don't need to duplicate the list of supported targets in the release-tasks workflow.

(cherry picked from commit d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87)
---
 .github/workflows/release-tasks.yml | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 780dd0ff6325c9..52076ea1821b0b 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -89,20 +89,10 @@ jobs:
     needs:
       - validate-tag
       - release-create
-    strategy:
-      fail-fast: false
-      matrix:
-        runs-on:
-          - ubuntu-22.04
-          - windows-2022
-          - macos-13
-          - macos-14
-
-    uses: ./.github/workflows/release-binaries.yml
+    uses: ./.github/workflows/release-binaries-all.yml
     with:
       release-version: ${{ needs.validate-tag.outputs.release-version }}
       upload: true
-      runs-on: ${{ matrix.runs-on }}
     # Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use.
     secrets:
       RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
llvmbot wrote:

@llvm/pr-subscribers-github-workflow

Author: None (llvmbot)

Changes

Backport d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87

Requested by: @tstellar

---

Full diff: https://github.com/llvm/llvm-project/pull/125585.diff

1 Files Affected:

- (modified) .github/workflows/release-tasks.yml (+1-11)

```diff
diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 780dd0ff6325c9..52076ea1821b0b 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -89,20 +89,10 @@ jobs:
     needs:
       - validate-tag
       - release-create
-    strategy:
-      fail-fast: false
-      matrix:
-        runs-on:
-          - ubuntu-22.04
-          - windows-2022
-          - macos-13
-          - macos-14
-
-    uses: ./.github/workflows/release-binaries.yml
+    uses: ./.github/workflows/release-binaries-all.yml
     with:
       release-version: ${{ needs.validate-tag.outputs.release-version }}
       upload: true
-      runs-on: ${{ matrix.runs-on }}
     # Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use.
     secrets:
       RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }}
```

https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
https://github.com/qcolombet approved this pull request. https://github.com/llvm/llvm-project/pull/125535
[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)
@@ -1002,17 +1003,15 @@ bool PeepholeOptimizer::optimizeCondBranch(MachineInstr &MI) {
 /// share the same register file as \p Reg and \p SubReg. The client should
 /// then be capable to rewrite all intermediate PHIs to get the next source.
 /// \return False if no alternative sources are available. True otherwise.
-bool PeepholeOptimizer::findNextSource(RegSubRegPair RegSubReg,

qcolombet wrote:

Could you update the comment with the documentation for the additional parameters?

https://github.com/llvm/llvm-project/pull/125533
[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/125637?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125637** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125637?utm_source=stack-comment-view-in-graphite)
* **#125636** (https://app.graphite.dev/github/pr/llvm/llvm-project/125636?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/125637
[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes Fix introducing stack usage if a bitcast source operand is an illegal integer type cast to a legal vector type. This should cover more situations, but this is the first one I noticed. --- Patch is 156.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125637.diff 12 Files Affected: - (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (+34-1) - (modified) llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll (-160) - (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointers-contents-legalization.ll (-9) - (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+54-274) - (modified) llvm/test/CodeGen/AMDGPU/kernel-args.ll (+122-611) - (modified) llvm/test/CodeGen/AMDGPU/load-constant-i16.ll (+17-23) - (modified) llvm/test/CodeGen/AMDGPU/load-constant-i8.ll (+195-1105) - (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+34-45) - (modified) llvm/test/CodeGen/AMDGPU/load-global-i8.ll (+16-32) - (modified) llvm/test/CodeGen/AMDGPU/min.ll (+75-231) - (modified) llvm/test/CodeGen/AMDGPU/shl.ll (+13-46) - (modified) llvm/test/CodeGen/AMDGPU/sra.ll (+14-53) ``diff diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp index 95fb8b406e51bf..eb0c5faa7fe1eb 100644 --- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp @@ -2202,9 +2202,42 @@ SDValue DAGTypeLegalizer::PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N) { } SDValue DAGTypeLegalizer::PromoteIntOp_BITCAST(SDNode *N) { + EVT OutVT = N->getValueType(0); + SDValue InOp = N->getOperand(0); + EVT InVT = InOp.getValueType(); + EVT NInVT = TLI.getTypeToTransformTo(*DAG.getContext(), InVT); + SDLoc dl(N); + + switch (getTypeAction(InVT)) { + case TargetLowering::TypePromoteInteger: { +if (OutVT.isVector()) { + EVT EltVT = OutVT.getVectorElementType(); + 
TypeSize EltSize = EltVT.getSizeInBits(); + TypeSize NInSize = NInVT.getSizeInBits(); + + if (NInSize.hasKnownScalarFactor(EltSize)) { +unsigned NumEltsWithPadding = NInSize.getKnownScalarFactor(EltSize); +EVT WideVecVT = +EVT::getVectorVT(*DAG.getContext(), EltVT, NumEltsWithPadding); + +if (isTypeLegal(WideVecVT)) { + SDValue Promoted = GetPromotedInteger(InOp); + SDValue Cast = DAG.getNode(ISD::BITCAST, dl, WideVecVT, Promoted); + return DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, OutVT, Cast, + DAG.getVectorIdxConstant(0, dl)); +} + } +} + +break; + } + default: +break; + } + // This should only occur in unusual situations like bitcasting to an // x86_fp80, so just turn it into a store+load - return CreateStackStoreLoad(N->getOperand(0), N->getValueType(0)); + return CreateStackStoreLoad(InOp, OutVT); } SDValue DAGTypeLegalizer::PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo) { diff --git a/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll b/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll index ab89bb293f6e6e..2c6aabec763306 100644 --- a/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll +++ b/llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll @@ -80,15 +80,6 @@ define <5 x i32> @bitcast_i160_to_v5i32(i160 %int) { ; GFX9-LABEL: bitcast_i160_to_v5i32: ; GFX9: ; %bb.0: ; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX9-NEXT:s_mov_b32 s4, s33 -; GFX9-NEXT:s_add_i32 s33, s32, 0x7c0 -; GFX9-NEXT:s_and_b32 s33, s33, 0xf800 -; GFX9-NEXT:s_mov_b32 s5, s34 -; GFX9-NEXT:s_mov_b32 s34, s32 -; GFX9-NEXT:s_addk_i32 s32, 0x1000 -; GFX9-NEXT:s_mov_b32 s32, s34 -; GFX9-NEXT:s_mov_b32 s34, s5 -; GFX9-NEXT:s_mov_b32 s33, s4 ; GFX9-NEXT:s_setpc_b64 s[30:31] ; ; GFX12-LABEL: bitcast_i160_to_v5i32: @@ -98,23 +89,6 @@ define <5 x i32> @bitcast_i160_to_v5i32(i160 %int) { ; GFX12-NEXT:s_wait_samplecnt 0x0 ; GFX12-NEXT:s_wait_bvhcnt 0x0 ; GFX12-NEXT:s_wait_kmcnt 0x0 -; GFX12-NEXT:s_mov_b32 s0, s33 -; GFX12-NEXT:s_add_co_i32 s33, s32, 31 -; GFX12-NEXT:s_mov_b32 s1, s34 -; 
GFX12-NEXT:s_wait_alu 0xfffe -; GFX12-NEXT:s_and_not1_b32 s33, s33, 31 -; GFX12-NEXT:s_clause 0x1 -; GFX12-NEXT:scratch_store_b64 off, v[2:3], s33 offset:8 -; GFX12-NEXT:scratch_store_b64 off, v[0:1], s33 -; GFX12-NEXT:scratch_load_b128 v[0:3], off, s33 -; GFX12-NEXT:s_mov_b32 s34, s32 -; GFX12-NEXT:s_add_co_i32 s32, s32, 64 -; GFX12-NEXT:s_wait_alu 0xfffe -; GFX12-NEXT:s_mov_b32 s32, s34 -; GFX12-NEXT:s_mov_b32 s34, s1 -; GFX12-NEXT:s_mov_b32 s33, s0 -; GFX12-NEXT:s_wait_loadcnt 0x0 -; GFX12-NEXT:s_wait_alu 0xfffe ; GFX12-NEXT:s_setpc_b64 s[30:31] %bitcast = bitcast i160 %int to <5 x i32> ret <5 x i32> %bitcast @@ -124,15 +98,6 @@ define <6 x i3
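The size check that the `PromoteIntOp_BITCAST` change relies on can be modelled with plain integers. The following is only a sketch, not LLVM code: `paddedElementCount` is a hypothetical helper that stands in for the `hasKnownScalarFactor`/`getKnownScalarFactor` pair on fixed sizes, and it ignores scalable vectors and the `isTypeLegal` check entirely.

```cpp
#include <optional>

// Toy model of the size check in the patch. Plain integers stand in for
// LLVM's TypeSize/EVT, and every name here is hypothetical.
// Returns how many elements of width eltBits fit exactly in promotedBits,
// or std::nullopt when the promoted width is not a whole multiple.
std::optional<unsigned> paddedElementCount(unsigned promotedBits,
                                           unsigned eltBits) {
  if (eltBits == 0 || promotedBits % eltBits != 0)
    return std::nullopt;
  return promotedBits / eltBits;
}
```

Under this model, a promoted 64-bit integer viewed as 16-bit elements yields a four-element wide vector; the patch then keeps only the leading elements it needs with `EXTRACT_SUBVECTOR` at index 0, and falls back to the stack store and load when no whole factor exists or the wide vector type is not legal.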
[llvm-branch-commits] [llvm] DAG: Avoid stack usage in bitcast operand promotion to legal vector (PR #125637)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/125637
[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)
https://github.com/labrinea created https://github.com/llvm/llvm-project/pull/125525 None >From 1e9a503b62b690e4615979e1363d17dd3adffca4 Mon Sep 17 00:00:00 2001 From: Alexandros Lamprineas Date: Mon, 3 Feb 2025 15:57:41 + Subject: [PATCH] [FMV][AArch64] Release notes for LLVM20. --- clang/docs/ReleaseNotes.rst | 7 +++ llvm/docs/ReleaseNotes.md | 14 ++ 2 files changed, 21 insertions(+) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 53534d821b2c9a..b23963c8e611a1 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -654,6 +654,10 @@ Attribute Changes in Clang - The ``target_version`` attribute is now only supported for AArch64 and RISC-V architectures. +- When targeting AArch64, a function declaration annotated with ``target_version("default")`` + now generates a mangled default version of the function, whereas before at least one more + version other than the default was required to trigger Function Multi Versioning. + - Clang now permits the usage of the placement new operator in ``[[msvc::constexpr]]`` context outside of the std namespace. (#GH74924) @@ -1188,6 +1192,9 @@ Arm and AArch64 Support * FUJITSU-MONAKA (fujitsu-monaka) +- Runtime detection of depended-on Function Multi Versioning features has been added + in accordance with the Arm C Language Extensions (ACLE). + Android Support ^^^ diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index e0acb8f48c5b94..db9a681ebe2bc5 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -130,6 +130,10 @@ Changes to building LLVM Changes to TableGen --- +* The ARMTargetDefEmitter now binds Funtion Multi Versioning features to the + corresponding AArch64 Architecture Extensions such that their dependencies + can be autogenerated using TableGen. 
+ Changes to Interprocedural Optimizations @@ -431,9 +435,19 @@ Changes to the C API Changes to the CodeGen infrastructure - +* GlobalOpt can now statically resolve calls to multi-versioned functions when targeting AArch64. + These calls would otherwise be routed through an IFunc resolver function. This optimization + can be applied when the caller is either a multi-versioned function itself, or it is compiled + with a sufficiently high set of architecture features (including the `target` attribute, and + command line options). + Changes to the Metadata Info - +* Multi-versioned functions targeting AArch64 are annotated with new metadata named `fmv-features`. + The metadata string value consists of a comma-separated list of Function Multi Versioning feature + names as defined in the Arm C Language Extensions (ACLE). + Changes to the Debug Info - ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
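The release note above describes the `fmv-features` metadata value as a comma-separated list of feature names. As a rough illustration of consuming such a string, here is a minimal sketch; `splitFmvFeatures` is a hypothetical helper written for this note, not an LLVM API, and the feature names in the usage remark are only sample inputs.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits an `fmv-features`-style string such as
// "dotprod,sve2" into individual feature names. Empty pieces are skipped.
std::vector<std::string> splitFmvFeatures(const std::string &value) {
  std::vector<std::string> features;
  std::string::size_type start = 0;
  while (start <= value.size()) {
    std::string::size_type comma = value.find(',', start);
    if (comma == std::string::npos)
      comma = value.size();
    if (comma > start)
      features.push_back(value.substr(start, comma - start));
    start = comma + 1;
  }
  return features;
}
```

Calling `splitFmvFeatures("dotprod,sve2")` gives a two-element vector; an empty string gives an empty vector.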
[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)
https://github.com/labrinea milestoned https://github.com/llvm/llvm-project/pull/125525
[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Alexandros Lamprineas (labrinea) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/125525.diff 2 Files Affected: - (modified) clang/docs/ReleaseNotes.rst (+7) - (modified) llvm/docs/ReleaseNotes.md (+14) ``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 53534d821b2c9a9..b23963c8e611a1a 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -654,6 +654,10 @@ Attribute Changes in Clang - The ``target_version`` attribute is now only supported for AArch64 and RISC-V architectures. +- When targeting AArch64, a function declaration annotated with ``target_version("default")`` + now generates a mangled default version of the function, whereas before at least one more + version other than the default was required to trigger Function Multi Versioning. + - Clang now permits the usage of the placement new operator in ``[[msvc::constexpr]]`` context outside of the std namespace. (#GH74924) @@ -1188,6 +1192,9 @@ Arm and AArch64 Support * FUJITSU-MONAKA (fujitsu-monaka) +- Runtime detection of depended-on Function Multi Versioning features has been added + in accordance with the Arm C Language Extensions (ACLE). + Android Support ^^^ diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index e0acb8f48c5b940..db9a681ebe2bc57 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -130,6 +130,10 @@ Changes to building LLVM Changes to TableGen --- +* The ARMTargetDefEmitter now binds Funtion Multi Versioning features to the + corresponding AArch64 Architecture Extensions such that their dependencies + can be autogenerated using TableGen. + Changes to Interprocedural Optimizations @@ -431,9 +435,19 @@ Changes to the C API Changes to the CodeGen infrastructure - +* GlobalOpt can now statically resolve calls to multi-versioned functions when targeting AArch64. 
+ These calls would otherwise be routed through an IFunc resolver function. This optimization + can be applied when the caller is either a multi-versioned function itself, or it is compiled + with a sufficiently high set of architecture features (including the `target` attribute, and + command line options). + Changes to the Metadata Info - +* Multi-versioned functions targeting AArch64 are annotated with new metadata named `fmv-features`. + The metadata string value consists of a comma-separated list of Function Multi Versioning feature + names as defined in the Arm C Language Extensions (ACLE). + Changes to the Debug Info - `` https://github.com/llvm/llvm-project/pull/125525 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/125535 >From e7b88d2c349059c01ddf463bf014a0c66d7c3b7e Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Thu, 23 Jan 2025 14:39:10 +0700 Subject: [PATCH] AMDGPU: Use default shouldRewriteCopySrc This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. --- llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | 24 - llvm/lib/Target/AMDGPU/SIRegisterInfo.h | 5 - llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll | 44 +- llvm/test/CodeGen/AMDGPU/ctpop64.ll | 36 +- llvm/test/CodeGen/AMDGPU/idot2.ll | 182 +++ llvm/test/CodeGen/AMDGPU/load-global-i32.ll | 85 ++-- .../peephole-opt-fold-reg-sequence-subreg.mir | 8 +- .../AMDGPU/peephole-opt-regseq-removal.mir| 4 +- .../CodeGen/AMDGPU/spill-scavenge-offset.ll | 476 +- 9 files changed, 418 insertions(+), 446 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp index 6fc57dec6a8264..71c720ed09b5fb 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp @@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const { OpType <= AMDGPU::OPERAND_SRC_LAST; } -bool SIRegisterInfo::shouldRewriteCopySrc( - const TargetRegisterClass *DefRC, - unsigned DefSubReg, - const TargetRegisterClass *SrcRC, - unsigned SrcSubReg) const { - // We want to prefer the smallest register class possible, so we don't want to - // stop and rewrite on anything that looks like a subregister - // extract. Operations mostly don't care about the super register class, so we - // only want to stop on the most basic of copies between the same register - // class. - // - // e.g. if we have something like - // %0 = ... - // %1 = ... 
- // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2 - // %3 = COPY %2, sub0 - // - // We want to look through the COPY to find: - // => %3 = COPY %0 - - // Plain copy. - return getCommonSubClass(DefRC, SrcRC) != nullptr; -} - bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const { // TODO: 64-bit operands have extending behavior from 32-bit literal. return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST && diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h index 8e481e3ac23043..a434efb70d0525 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h @@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo { const TargetRegisterClass *SubRC, unsigned SubIdx) const; - bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC, -unsigned DefSubReg, -const TargetRegisterClass *SrcRC, -unsigned SrcSubReg) const override; - /// \returns True if operands defined with this operand type can accept /// a literal constant (i.e. any 32-bit immediate). 
bool opCanUseLiteralConstant(unsigned OpType) const; diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll index c6c0b9cf8f027f..cc2f775ff22bc5 100644 --- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll +++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll @@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind { ; SI-LABEL: test_copy_v4i8_x4: ; SI: ; %bb.0: -; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11 -; SI-NEXT:s_mov_b32 s3, 0xf000 -; SI-NEXT:s_mov_b32 s10, 0 -; SI-NEXT:s_mov_b32 s11, s3 +; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11 +; SI-NEXT:s_mov_b32 s11, 0xf000 +; SI-NEXT:s_mov_b32 s2, 0 +; SI-NEXT:s_mov_b32 s3, s11 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; SI-NEXT:v_mov_b32_e32 v1, 0 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64 -; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9 -; SI-NEXT:s_mov_b32 s2, -1 -; SI-NEXT:s_mov_b32 s14, s2 -; SI-NEXT:s_mov_b32 s15, s3 -; SI-NEXT:s_mov_b32 s18, s2 +; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64 +; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9 +; SI-NEXT:s_mov_b32 s10, -1 +; SI-NEXT:s_mov_b32 s14, s10 +; SI-NEXT:s_mov_b32 s15, s11 +; SI-NEXT:s_mov_b32 s18, s10 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:s_mov_b32 s0, s4 -; SI-NEXT:s_mov_b32 s1, s5 -; SI-NEXT:s_mov_b32 s19, s3 -; SI-NEXT:s_mov_b32 s22, s2 -; SI-NEXT:s_mov_b32 s23, s3 -; SI-NEXT:s_mov_b32 s12
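The comment removed from `SIRegisterInfo::shouldRewriteCopySrc` describes looking through a subregister `COPY` of a `REG_SEQUENCE` to the register that feeds the extracted lane. The toy model below sketches only that idea; `RegSequence` and `lookThroughCopy` are hypothetical names, and nothing here reflects how peephole-opt actually represents machine instructions.

```cpp
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a REG_SEQUENCE: maps each subregister index name to the
// virtual register that supplies it. This is not LLVM's MachineInstr.
using RegSequence = std::map<std::string, std::string>;

// Resolve `COPY %seq, subIdx` to the register feeding that lane, if any.
std::optional<std::string> lookThroughCopy(const RegSequence &seq,
                                           const std::string &subIdx) {
  auto it = seq.find(subIdx);
  if (it == seq.end())
    return std::nullopt;
  return it->second;
}
```

With a sequence built from `%0` at `sub0` and `%1` at `sub1`, a copy of `sub0` resolves to `%0`, which matches the `%3 = COPY %0` rewrite named in the removed comment.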
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/123147 >From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Wed, 15 Jan 2025 17:30:00 + Subject: [PATCH 01/16] adding metadata extraction --- .../llvm/Analysis/DXILMetadataAnalysis.h | 3 + llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++ .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 + 3 files changed, 116 insertions(+) diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h index cb535ac14f1c613..f420244ba111a45 100644 --- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h +++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h @@ -11,9 +11,11 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/IR/PassManager.h" +#include "llvm/MC/DXContainerRootSignature.h" #include "llvm/Pass.h" #include "llvm/Support/VersionTuple.h" #include "llvm/TargetParser/Triple.h" +#include namespace llvm { @@ -37,6 +39,7 @@ struct ModuleMetadataInfo { Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment}; VersionTuple ValidatorVersion{}; SmallVector EntryPropertyVec{}; + std::optional RootSignatureDesc; void print(raw_ostream &OS) const; }; diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp index a7f666a3f8b48f2..388e3853008eaec 100644 --- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp +++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp @@ -15,12 +15,91 @@ #include "llvm/IR/Metadata.h" #include "llvm/IR/Module.h" #include "llvm/InitializePasses.h" +#include "llvm/MC/DXContainerRootSignature.h" +#include "llvm/Support/Casting.h" #include "llvm/Support/ErrorHandling.h" +#include #define DEBUG_TYPE "dxil-metadata-analysis" using namespace llvm; using namespace dxil; +using namespace llvm::mcdxbc; + +static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) { + + assert(RootFlagNode->getNumOperands() == 2 && + "Invalid format for 
RootFlag Element"); + auto *Flag = mdconst::extract(RootFlagNode->getOperand(1)); + auto Value = (RootSignatureFlags)Flag->getZExtValue(); + + if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None) +return true; + + Desc->Flags = Value; + return false; +} + +static bool parseRootSignatureElement(MDNode *Element, + RootSignatureDesc *Desc) { + MDString *ElementText = cast(Element->getOperand(0)); + + assert(ElementText != nullptr && "First preoperty of element is not "); + + RootSignatureElementKind ElementKind = + StringSwitch(ElementText->getString()) + .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) + .Case("RootCBV", RootSignatureElementKind::RootDescriptor) + .Case("RootSRV", RootSignatureElementKind::RootDescriptor) + .Case("RootUAV", RootSignatureElementKind::RootDescriptor) + .Case("Sampler", RootSignatureElementKind::RootDescriptor) + .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable) + .Case("StaticSampler", RootSignatureElementKind::StaticSampler) + .Default(RootSignatureElementKind::None); + + switch (ElementKind) { + + case RootSignatureElementKind::RootFlags: { +return parseRootFlags(Element, Desc); +break; + } + + case RootSignatureElementKind::RootConstants: + case RootSignatureElementKind::RootDescriptor: + case RootSignatureElementKind::DescriptorTable: + case RootSignatureElementKind::StaticSampler: + case RootSignatureElementKind::None: +llvm_unreachable("Not Implemented yet"); +break; + } + + return true; +} + +bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version, +NamedMDNode *Root) { + Desc->Version = Version; + bool HasError = false; + + for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) { +// This should be an if, for error handling +MDNode *Node = cast(Root->getOperand(Sid)); + +// Not sure what use this for... 
+Metadata *Func = Node->getOperand(0).get(); + +// This should be an if, for error handling +MDNode *Elements = cast(Node->getOperand(1).get()); + +for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) { + MDNode *Element = cast(Elements->getOperand(Eid)); + + HasError = HasError || parseRootSignatureElement(Element, Desc); +} + } + return HasError; +} static ModuleMetadataInfo collectMetadataInfo(Module &M) { ModuleMetadataInfo MMDAI; @@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) { MMDAI.DXILVersion = TT.getDXILVersion(); MMDAI.ShaderModelVersion = TT.getOSVersion(); MMDAI.ShaderProfile = TT.getEnvironment(); + NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver"); if (ValidatorVerNode) {
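The flag check in `parseRootFlags` is a standard "no bits outside the valid mask" test. A self-contained sketch of that pattern follows; the mask value `kValidFlags` and the function name `parseRootFlagsValue` are assumptions made for illustration, and the real definitions live in `llvm/MC/DXContainerRootSignature.h` and may differ.

```cpp
#include <cstdint>

// Assumed mask of defined flag bits, chosen for illustration only; the real
// value is defined in llvm/MC/DXContainerRootSignature.h and may differ.
constexpr std::uint32_t kValidFlags = 0x00000fff;

// Follows the patch's convention: returns true on error, false on success.
// outFlags is written only when the value passes validation.
bool parseRootFlagsValue(std::uint32_t value, std::uint32_t &outFlags) {
  if ((value & ~kValidFlags) != 0)
    return true; // a bit outside the valid mask is set
  outFlags = value;
  return false;
}
```

One point worth noting about the caller in the patch: `HasError = HasError || parseRootSignatureElement(Element, Desc)` short-circuits, so once an element fails, later elements are no longer parsed.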
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/125588
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
github-actions[bot] wrote: @tstellar (or anyone else): if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/125588
[llvm-branch-commits] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
https://github.com/llvmbot closed https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/123147 >From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Wed, 15 Jan 2025 17:30:00 + Subject: [PATCH 01/17] adding metadata extraction --- .../llvm/Analysis/DXILMetadataAnalysis.h | 3 + llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++ .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 + 3 files changed, 116 insertions(+) diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h index cb535ac14f1c61..f420244ba111a4 100644 --- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h +++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h @@ -11,9 +11,11 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/IR/PassManager.h" +#include "llvm/MC/DXContainerRootSignature.h" #include "llvm/Pass.h" #include "llvm/Support/VersionTuple.h" #include "llvm/TargetParser/Triple.h" +#include namespace llvm { @@ -37,6 +39,7 @@ struct ModuleMetadataInfo { Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment}; VersionTuple ValidatorVersion{}; SmallVector EntryPropertyVec{}; + std::optional RootSignatureDesc; void print(raw_ostream &OS) const; }; diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp index a7f666a3f8b48f..388e3853008eae 100644 --- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp +++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp @@ -15,12 +15,91 @@ #include "llvm/IR/Metadata.h" #include "llvm/IR/Module.h" #include "llvm/InitializePasses.h" +#include "llvm/MC/DXContainerRootSignature.h" +#include "llvm/Support/Casting.h" #include "llvm/Support/ErrorHandling.h" +#include #define DEBUG_TYPE "dxil-metadata-analysis" using namespace llvm; using namespace dxil; +using namespace llvm::mcdxbc; + +static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) { + + assert(RootFlagNode->getNumOperands() == 2 && + "Invalid format for 
RootFlag Element"); + auto *Flag = mdconst::extract(RootFlagNode->getOperand(1)); + auto Value = (RootSignatureFlags)Flag->getZExtValue(); + + if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None) +return true; + + Desc->Flags = Value; + return false; +} + +static bool parseRootSignatureElement(MDNode *Element, + RootSignatureDesc *Desc) { + MDString *ElementText = cast(Element->getOperand(0)); + + assert(ElementText != nullptr && "First preoperty of element is not "); + + RootSignatureElementKind ElementKind = + StringSwitch(ElementText->getString()) + .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) + .Case("RootCBV", RootSignatureElementKind::RootDescriptor) + .Case("RootSRV", RootSignatureElementKind::RootDescriptor) + .Case("RootUAV", RootSignatureElementKind::RootDescriptor) + .Case("Sampler", RootSignatureElementKind::RootDescriptor) + .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable) + .Case("StaticSampler", RootSignatureElementKind::StaticSampler) + .Default(RootSignatureElementKind::None); + + switch (ElementKind) { + + case RootSignatureElementKind::RootFlags: { +return parseRootFlags(Element, Desc); +break; + } + + case RootSignatureElementKind::RootConstants: + case RootSignatureElementKind::RootDescriptor: + case RootSignatureElementKind::DescriptorTable: + case RootSignatureElementKind::StaticSampler: + case RootSignatureElementKind::None: +llvm_unreachable("Not Implemented yet"); +break; + } + + return true; +} + +bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version, +NamedMDNode *Root) { + Desc->Version = Version; + bool HasError = false; + + for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) { +// This should be an if, for error handling +MDNode *Node = cast(Root->getOperand(Sid)); + +// Not sure what use this for... 
+Metadata *Func = Node->getOperand(0).get(); + +// This should be an if, for error handling +MDNode *Elements = cast(Node->getOperand(1).get()); + +for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) { + MDNode *Element = cast(Elements->getOperand(Eid)); + + HasError = HasError || parseRootSignatureElement(Element, Desc); +} + } + return HasError; +} static ModuleMetadataInfo collectMetadataInfo(Module &M) { ModuleMetadataInfo MMDAI; @@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) { MMDAI.DXILVersion = TT.getDXILVersion(); MMDAI.ShaderModelVersion = TT.getOSVersion(); MMDAI.ShaderProfile = TT.getEnvironment(); + NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver"); if (ValidatorVerNode) {
[llvm-branch-commits] [clang] release/20.x: [AArch64] Enable vscale_range with +sme (#124466) (PR #125386)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/125386 >From d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b Mon Sep 17 00:00:00 2001 From: David Green Date: Fri, 31 Jan 2025 07:57:43 + Subject: [PATCH] [AArch64] Enable vscale_range with +sme (#124466) If we have +sme but not +sve, we would not set vscale_range on functions. It should be valid to apply it with the same range with just +sme, which can help mitigate some performance regressions in cases such as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d). (cherry picked from commit 9f1c825fb62319b94ac9604f733afd59e9eb461b) --- clang/include/clang/Basic/TargetInfo.h | 3 ++- clang/lib/AST/ASTContext.cpp| 3 ++- clang/lib/AST/ItaniumMangle.cpp | 2 +- clang/lib/Basic/Targets/AArch64.cpp | 5 +++-- clang/lib/Basic/Targets/AArch64.h | 3 ++- clang/lib/Basic/Targets/RISCV.cpp | 5 +++-- clang/lib/Basic/Targets/RISCV.h | 3 ++- clang/lib/CodeGen/CodeGenFunction.cpp | 17 + clang/lib/CodeGen/Targets/RISCV.cpp | 4 ++-- clang/lib/Sema/SemaType.cpp | 3 ++- .../sme-intrinsics/aarch64-sme-attrs.cpp| 4 ++-- 11 files changed, 30 insertions(+), 22 deletions(-) diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h index 43c09cf1f973e3c..d762144478b489d 100644 --- a/clang/include/clang/Basic/TargetInfo.h +++ b/clang/include/clang/Basic/TargetInfo.h @@ -1023,7 +1023,8 @@ class TargetInfo : public TransferrableTargetInfo, /// Returns target-specific min and max values VScale_Range. 
virtual std::optional> - getVScaleRange(const LangOptions &LangOpts) const { + getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const { return std::nullopt; } /// The __builtin_clz* and __builtin_ctz* built-in diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index cd1bcb3b9a063d8..e58091ce95f6258 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -10363,7 +10363,8 @@ bool ASTContext::areLaxCompatibleSveTypes(QualType FirstType, /// getRVVTypeSize - Return RVV vector register size. static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) { assert(Ty->isRVVVLSBuiltinType() && "Invalid RVV Type"); - auto VScale = Context.getTargetInfo().getVScaleRange(Context.getLangOpts()); + auto VScale = + Context.getTargetInfo().getVScaleRange(Context.getLangOpts(), false); if (!VScale) return 0; diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp index 49089c0ea3c8ac1..f84ccefd34cacbe 100644 --- a/clang/lib/AST/ItaniumMangle.cpp +++ b/clang/lib/AST/ItaniumMangle.cpp @@ -4198,7 +4198,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const VectorType *T) { // Apend the LMUL suffix. 
auto VScale = getASTContext().getTargetInfo().getVScaleRange( - getASTContext().getLangOpts()); + getASTContext().getLangOpts(), false); unsigned VLen = VScale->first * llvm::RISCV::RVVBitsPerBlock; if (T->getVectorKind() == VectorKind::RVVFixedLengthData) { diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp index 0b899137bbb5c74..57c9849ef2a7287 100644 --- a/clang/lib/Basic/Targets/AArch64.cpp +++ b/clang/lib/Basic/Targets/AArch64.cpp @@ -703,12 +703,13 @@ ArrayRef AArch64TargetInfo::getTargetBuiltins() const { } std::optional> -AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts) const { +AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const { if (LangOpts.VScaleMin || LangOpts.VScaleMax) return std::pair( LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax); - if (hasFeature("sve")) + if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme"))) return std::pair(1, 16); return std::nullopt; diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h index 600940f5e4e23c1..b75d2a9dc8ecadc 100644 --- a/clang/lib/Basic/Targets/AArch64.h +++ b/clang/lib/Basic/Targets/AArch64.h @@ -184,7 +184,8 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo { ArrayRef getTargetBuiltins() const override; std::optional> - getVScaleRange(const LangOptions &LangOpts) const override; + getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const override; bool doesFeatureAffectCodeGen(StringRef Name) const override; bool validateCpuSupports(StringRef FeatureStr) const override; bool hasFeature(StringRef Feature) const override; diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp index 8167d7603b0e143..61b8ae9d098abc0 100644 --- a/clang/lib/Basic/Targets/RISCV.cpp +++ b/cl
[llvm-branch-commits] [clang] d185bd9 - [AArch64] Enable vscale_range with +sme (#124466)
Author: David Green Date: 2025-02-03T17:32:53-08:00 New Revision: d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b URL: https://github.com/llvm/llvm-project/commit/d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b DIFF: https://github.com/llvm/llvm-project/commit/d185bd94ff7717429fd2fffbcd0d4c7c64c05f0b.diff LOG: [AArch64] Enable vscale_range with +sme (#124466) If we have +sme but not +sve, we would not set vscale_range on functions. It should be valid to apply it with the same range with just +sme, which can help mitigate some performance regressions in cases such as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d). (cherry picked from commit 9f1c825fb62319b94ac9604f733afd59e9eb461b) Added: Modified: clang/include/clang/Basic/TargetInfo.h clang/lib/AST/ASTContext.cpp clang/lib/AST/ItaniumMangle.cpp clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/RISCV.cpp clang/lib/Basic/Targets/RISCV.h clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/Targets/RISCV.cpp clang/lib/Sema/SemaType.cpp clang/test/CodeGen/AArch64/sme-intrinsics/aarch64-sme-attrs.cpp Removed: diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h index 43c09cf1f973e3..d762144478b489 100644 --- a/clang/include/clang/Basic/TargetInfo.h +++ b/clang/include/clang/Basic/TargetInfo.h @@ -1023,7 +1023,8 @@ class TargetInfo : public TransferrableTargetInfo, /// Returns target-specific min and max values VScale_Range. 
virtual std::optional> - getVScaleRange(const LangOptions &LangOpts) const { + getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const { return std::nullopt; } /// The __builtin_clz* and __builtin_ctz* built-in diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index cd1bcb3b9a063d..e58091ce95f625 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -10363,7 +10363,8 @@ bool ASTContext::areLaxCompatibleSveTypes(QualType FirstType, /// getRVVTypeSize - Return RVV vector register size. static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) { assert(Ty->isRVVVLSBuiltinType() && "Invalid RVV Type"); - auto VScale = Context.getTargetInfo().getVScaleRange(Context.getLangOpts()); + auto VScale = + Context.getTargetInfo().getVScaleRange(Context.getLangOpts(), false); if (!VScale) return 0; diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp index 49089c0ea3c8ac..f84ccefd34cacb 100644 --- a/clang/lib/AST/ItaniumMangle.cpp +++ b/clang/lib/AST/ItaniumMangle.cpp @@ -4198,7 +4198,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const VectorType *T) { // Apend the LMUL suffix. 
auto VScale = getASTContext().getTargetInfo().getVScaleRange( - getASTContext().getLangOpts()); + getASTContext().getLangOpts(), false); unsigned VLen = VScale->first * llvm::RISCV::RVVBitsPerBlock; if (T->getVectorKind() == VectorKind::RVVFixedLengthData) { diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp index 0b899137bbb5c7..57c9849ef2a728 100644 --- a/clang/lib/Basic/Targets/AArch64.cpp +++ b/clang/lib/Basic/Targets/AArch64.cpp @@ -703,12 +703,13 @@ ArrayRef AArch64TargetInfo::getTargetBuiltins() const { } std::optional> -AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts) const { +AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const { if (LangOpts.VScaleMin || LangOpts.VScaleMax) return std::pair( LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax); - if (hasFeature("sve")) + if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme"))) return std::pair(1, 16); return std::nullopt; diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h index 600940f5e4e23c..b75d2a9dc8ecad 100644 --- a/clang/lib/Basic/Targets/AArch64.h +++ b/clang/lib/Basic/Targets/AArch64.h @@ -184,7 +184,8 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo { ArrayRef getTargetBuiltins() const override; std::optional> - getVScaleRange(const LangOptions &LangOpts) const override; + getVScaleRange(const LangOptions &LangOpts, + bool IsArmStreamingFunction) const override; bool doesFeatureAffectCodeGen(StringRef Name) const override; bool validateCpuSupports(StringRef FeatureStr) const override; bool hasFeature(StringRef Feature) const override; diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp index 8167d7603b0e14..61b8ae9d098abc 100644 --- a/clang/lib/Basic/Targets/RISCV.cpp +++ b/clang/lib/Basic/Targets/RISCV.cpp @@ -222,7 +222,7 @@ void RISCVTargetInfo::g
[llvm-branch-commits] [clang] release/20.x: [AArch64] Enable vscale_range with +sme (#124466) (PR #125386)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/125386
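For readers following this thread, the decision logic changed by #124466 can be sketched outside of Clang as below. This is a minimal sketch under stated assumptions, not Clang's implementation: `LangOptsSketch`, `Features`, and the free function `getVScaleRangeSketch` are hypothetical stand-ins for `LangOptions`, `hasFeature()`, and `AArch64TargetInfo::getVScaleRange`. Only the branch structure and the default range of (1, 16) are taken from the diff.

```cpp
#include <optional>
#include <utility>

// Hypothetical stand-in for the two LangOptions fields the patch reads.
struct LangOptsSketch {
  unsigned VScaleMin = 0;
  unsigned VScaleMax = 0;
};

// Hypothetical stand-in for hasFeature("sve") / hasFeature("sme").
struct Features {
  bool SVE = false;
  bool SME = false;
};

// Mirrors the branch order shown in the diff:
//   1. explicit command-line bounds win;
//   2. otherwise +sve, or +sme in a streaming function, yields (1, 16);
//   3. otherwise no vscale_range is reported.
std::optional<std::pair<unsigned, unsigned>>
getVScaleRangeSketch(const LangOptsSketch &LangOpts, const Features &F,
                     bool IsArmStreamingFunction) {
  if (LangOpts.VScaleMin || LangOpts.VScaleMax)
    return std::pair<unsigned, unsigned>(
        LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax);

  if (F.SVE || (IsArmStreamingFunction && F.SME))
    return std::pair<unsigned, unsigned>(1, 16);

  return std::nullopt;
}
```

With +sme alone, the sketch reports a range only when the caller says the function is a streaming function. This matches the `false` passed at the RISC-V and AST call sites in the diff, where streaming mode does not apply.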
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
https://github.com/boomanaiden154 approved this pull request. https://github.com/llvm/llvm-project/pull/125588
[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
https://github.com/boomanaiden154 approved this pull request. https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] [llvm] release/20.x: workflows/release-tasks: Re-use release-binaries-all workflow (#125378) (PR #125585)
llvmbot wrote: @boomanaiden154 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/125585
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/123147 >From 635b27a0842aa38d6a1c731bee72de0b547b7638 Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Wed, 15 Jan 2025 17:30:00 + Subject: [PATCH 01/15] adding metadata extraction --- .../llvm/Analysis/DXILMetadataAnalysis.h | 3 + llvm/lib/Analysis/DXILMetadataAnalysis.cpp| 89 +++ .../lib/Target/DirectX/DXContainerGlobals.cpp | 24 + 3 files changed, 116 insertions(+) diff --git a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h index cb535ac14f1c61..f420244ba111a4 100644 --- a/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h +++ b/llvm/include/llvm/Analysis/DXILMetadataAnalysis.h @@ -11,9 +11,11 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/IR/PassManager.h" +#include "llvm/MC/DXContainerRootSignature.h" #include "llvm/Pass.h" #include "llvm/Support/VersionTuple.h" #include "llvm/TargetParser/Triple.h" +#include namespace llvm { @@ -37,6 +39,7 @@ struct ModuleMetadataInfo { Triple::EnvironmentType ShaderProfile{Triple::UnknownEnvironment}; VersionTuple ValidatorVersion{}; SmallVector EntryPropertyVec{}; + std::optional RootSignatureDesc; void print(raw_ostream &OS) const; }; diff --git a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp index a7f666a3f8b48f..388e3853008eae 100644 --- a/llvm/lib/Analysis/DXILMetadataAnalysis.cpp +++ b/llvm/lib/Analysis/DXILMetadataAnalysis.cpp @@ -15,12 +15,91 @@ #include "llvm/IR/Metadata.h" #include "llvm/IR/Module.h" #include "llvm/InitializePasses.h" +#include "llvm/MC/DXContainerRootSignature.h" +#include "llvm/Support/Casting.h" #include "llvm/Support/ErrorHandling.h" +#include #define DEBUG_TYPE "dxil-metadata-analysis" using namespace llvm; using namespace dxil; +using namespace llvm::mcdxbc; + +static bool parseRootFlags(MDNode *RootFlagNode, RootSignatureDesc *Desc) { + + assert(RootFlagNode->getNumOperands() == 2 && + "Invalid format for 
RootFlag Element"); + auto *Flag = mdconst::extract(RootFlagNode->getOperand(1)); + auto Value = (RootSignatureFlags)Flag->getZExtValue(); + + if ((Value & ~RootSignatureFlags::ValidFlags) != RootSignatureFlags::None) +return true; + + Desc->Flags = Value; + return false; +} + +static bool parseRootSignatureElement(MDNode *Element, + RootSignatureDesc *Desc) { + MDString *ElementText = cast(Element->getOperand(0)); + + assert(ElementText != nullptr && "First preoperty of element is not "); + + RootSignatureElementKind ElementKind = + StringSwitch(ElementText->getString()) + .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) + .Case("RootCBV", RootSignatureElementKind::RootDescriptor) + .Case("RootSRV", RootSignatureElementKind::RootDescriptor) + .Case("RootUAV", RootSignatureElementKind::RootDescriptor) + .Case("Sampler", RootSignatureElementKind::RootDescriptor) + .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable) + .Case("StaticSampler", RootSignatureElementKind::StaticSampler) + .Default(RootSignatureElementKind::None); + + switch (ElementKind) { + + case RootSignatureElementKind::RootFlags: { +return parseRootFlags(Element, Desc); +break; + } + + case RootSignatureElementKind::RootConstants: + case RootSignatureElementKind::RootDescriptor: + case RootSignatureElementKind::DescriptorTable: + case RootSignatureElementKind::StaticSampler: + case RootSignatureElementKind::None: +llvm_unreachable("Not Implemented yet"); +break; + } + + return true; +} + +bool parseRootSignature(RootSignatureDesc *Desc, int32_t Version, +NamedMDNode *Root) { + Desc->Version = Version; + bool HasError = false; + + for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) { +// This should be an if, for error handling +MDNode *Node = cast(Root->getOperand(Sid)); + +// Not sure what use this for... 
+Metadata *Func = Node->getOperand(0).get(); + +// This should be an if, for error handling +MDNode *Elements = cast(Node->getOperand(1).get()); + +for (unsigned int Eid = 0; Eid < Elements->getNumOperands(); Eid++) { + MDNode *Element = cast(Elements->getOperand(Eid)); + + HasError = HasError || parseRootSignatureElement(Element, Desc); +} + } + return HasError; +} static ModuleMetadataInfo collectMetadataInfo(Module &M) { ModuleMetadataInfo MMDAI; @@ -28,6 +107,7 @@ static ModuleMetadataInfo collectMetadataInfo(Module &M) { MMDAI.DXILVersion = TT.getDXILVersion(); MMDAI.ShaderModelVersion = TT.getOSVersion(); MMDAI.ShaderProfile = TT.getEnvironment(); + NamedMDNode *ValidatorVerNode = M.getNamedMetadata("dx.valver"); if (ValidatorVerNode) {
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/125588

(cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b)

>From 34bae71660d86455c5a51ad00fec49129847bc1d Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Mon, 3 Feb 2025 13:20:37 -0800
Subject: [PATCH] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329)

(cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b)
---
 .github/workflows/premerge.yaml | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/premerge.yaml b/.github/workflows/premerge.yaml
index c22b35e122b9fc..956760feaa3b52 100644
--- a/.github/workflows/premerge.yaml
+++ b/.github/workflows/premerge.yaml
@@ -5,6 +5,17 @@ permissions:
 
 on:
   pull_request:
+    types:
+      - opened
+      - synchronize
+      - reopened
+      # When a PR is closed, we still start this workflow, but then skip
+      # all the jobs, which makes it effectively a no-op. The reason to
+      # do this is that it allows us to take advantage of concurrency groups
+      # to cancel in progress CI jobs whenever the PR is closed.
+      - closed
+    paths:
+      - .github/workflows/premerge.yaml
   push:
     branches:
       - 'main'
@@ -12,7 +23,9 @@ on:
 
 jobs:
   premerge-checks-linux:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-linux-runners
     concurrency:
       group: ${{ github.workflow }}-linux-${{ github.event.pull_request.number || github.sha }}
@@ -71,7 +84,9 @@ jobs:
           ./.ci/monolithic-linux.sh "$(echo ${linux_projects} | tr ' ' ';')" "$(echo ${linux_check_targets})" "$(echo ${linux_runtimes} | tr ' ' ';')" "$(echo ${linux_runtime_check_targets})"
 
   premerge-checks-windows:
-    if: false && github.repository_owner == 'llvm'
+    if: >-
+        false && github.repository_owner == 'llvm' &&
+        (github.event_name != 'pull_request' || github.event.action != 'closed')
     runs-on: llvm-premerge-windows-runners
     concurrency:
       group: ${{ github.workflow }}-windows-${{ github.event.pull_request.number || github.sha }}
@@ -139,7 +154,8 @@ jobs:
     if: >-
       github.repository_owner == 'llvm' &&
       (startswith(github.ref_name, 'release/') ||
-       startswith(github.base_ref, 'release/'))
+       startswith(github.base_ref, 'release/')) &&
+      (github.event_name != 'pull_request' || github.event.action != 'closed')
     steps:
       - name: Checkout LLVM
         uses: actions/checkout@v4
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
https://github.com/tstellar milestoned https://github.com/llvm/llvm-project/pull/125588
[llvm-branch-commits] [llvm] workflows/premerge: Cancel in progress jobs when a PR is merged (#125329) (PR #125588)
llvmbot wrote: @llvm/pr-subscribers-github-workflow Author: Tom Stellard (tstellar) Changes (cherry picked from commit 2deba08e09b9412c9f4e5888237e28173dee085b) --- Full diff: https://github.com/llvm/llvm-project/pull/125588.diff 1 Files Affected: - (modified) .github/workflows/premerge.yaml (+19-3) ``diff diff --git a/.github/workflows/premerge.yaml b/.github/workflows/premerge.yaml index c22b35e122b9fc..956760feaa3b52 100644 --- a/.github/workflows/premerge.yaml +++ b/.github/workflows/premerge.yaml @@ -5,6 +5,17 @@ permissions: on: pull_request: +types: + - opened + - synchronize + - reopened + # When a PR is closed, we still start this workflow, but then skip + # all the jobs, which makes it effectively a no-op. The reason to + # do this is that it allows us to take advantage of concurrency groups + # to cancel in progress CI jobs whenever the PR is closed. + - closed +paths: + - .github/workflows/premerge.yaml push: branches: - 'main' @@ -12,7 +23,9 @@ on: jobs: premerge-checks-linux: -if: false && github.repository_owner == 'llvm' +if: >- +false && github.repository_owner == 'llvm' && +(github.event_name != 'pull_request' || github.event.action != 'closed') runs-on: llvm-premerge-linux-runners concurrency: group: ${{ github.workflow }}-linux-${{ github.event.pull_request.number || github.sha }} @@ -71,7 +84,9 @@ jobs: ./.ci/monolithic-linux.sh "$(echo ${linux_projects} | tr ' ' ';')" "$(echo ${linux_check_targets})" "$(echo ${linux_runtimes} | tr ' ' ';')" "$(echo ${linux_runtime_check_targets})" premerge-checks-windows: -if: false && github.repository_owner == 'llvm' +if: >- +false && github.repository_owner == 'llvm' && +(github.event_name != 'pull_request' || github.event.action != 'closed') runs-on: llvm-premerge-windows-runners concurrency: group: ${{ github.workflow }}-windows-${{ github.event.pull_request.number || github.sha }} @@ -139,7 +154,8 @@ jobs: if: >- github.repository_owner == 'llvm' && (startswith(github.ref_name, 'release/') || - 
startswith(github.base_ref, 'release/')) + startswith(github.base_ref, 'release/')) && + (github.event_name != 'pull_request' || github.event.action != 'closed') steps: - name: Checkout LLVM uses: actions/checkout@v4 `` https://github.com/llvm/llvm-project/pull/125588
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
@@ -10,13 +10,13 @@ Header:
   PartOffsets: [ 60 ]
 Parts:
   - Name: RTS0
-    Size: 8
+    Size: 4

joaosaffran wrote:

Wrong rebase, fixed it

https://github.com/llvm/llvm-project/pull/123147
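The root-flag check that recurs in the patches quoted in this PR thread (reject any value with bits set outside a mask of valid flags) can be sketched on its own as below. This is a hedged illustration, not the PR's code: `validateRootFlags` and `ValidRootFlagsMask` are hypothetical names, and the literal `0x8fff` is copied from the `DXILRootSignature.cpp` revision quoted in this thread rather than checked independently against the root signature specification.

```cpp
#include <cstdint>
#include <optional>

// Mask literal copied from the patch under review (assumption: it lists
// every flag bit the patch author considers valid).
constexpr std::uint32_t ValidRootFlagsMask = 0x8fff;

// Hypothetical helper: returns the flags unchanged when every set bit lies
// inside the mask, and std::nullopt when any bit falls outside it.
std::optional<std::uint32_t> validateRootFlags(std::uint32_t Value) {
  if ((Value & ~ValidRootFlagsMask) != 0)
    return std::nullopt;
  return Value;
}
```

The patch itself returns `true` on error and writes the flags into a descriptor. The sketch uses `std::optional` instead so that the error case and the parsed value cannot be confused at the call site. That is a design choice of this example only.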
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
@@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm API documentation." OF
 option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF)
 option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON)
 option (LLVM_ENABLE_BINDINGS "Build bindings." ON)
+option (LLVM_BUILD_TELEMETRY "Build the telemtry library. This does not enable telemetry." ON)

cmtice wrote:

Typo: "telemetry"

https://github.com/llvm/llvm-project/pull/12
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
@@ -0,0 +1,158 @@ +//===- DXILRootSignature.cpp - DXIL Root Signature helper objects ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +/// +/// \file This file contains helper objects and APIs for working with DXIL +/// Root Signatures. +/// +//===--===// +#include "DXILRootSignature.h" +#include "DirectX.h" +#include "llvm/ADT/StringSwitch.h" +#include "llvm/ADT/Twine.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/Module.h" +#include + +using namespace llvm; +using namespace llvm::dxil; + +static bool reportError(Twine Message) { + report_fatal_error(Message, false); + return true; +} + +static bool parseRootFlags(ModuleRootSignature *MRS, MDNode *RootFlagNode) { + + if (RootFlagNode->getNumOperands() != 2) +return reportError("Invalid format for RootFlag Element"); + + auto *Flag = mdconst::extract(RootFlagNode->getOperand(1)); + uint32_t Value = Flag->getZExtValue(); + + // Root Element validation, as specified: + // https://github.com/llvm/wg-hlsl/blob/main/proposals/0002-root-signature-in-clang.md#validations-during-dxil-generation + if ((Value & ~0x8fff) != 0) +return reportError("Invalid flag value for RootFlag"); + + MRS->Flags = Value; + return false; +} + +static bool parseRootSignatureElement(ModuleRootSignature *MRS, + MDNode *Element) { + MDString *ElementText = cast(Element->getOperand(0)); + if (ElementText == nullptr) +return reportError("Invalid format for Root Element"); + + RootSignatureElementKind ElementKind = + StringSwitch(ElementText->getString()) + .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) + .Case("RootCBV", RootSignatureElementKind::RootDescriptor) + .Case("RootSRV", RootSignatureElementKind::RootDescriptor) + .Case("RootUAV", RootSignatureElementKind::RootDescriptor) + 
.Case("Sampler", RootSignatureElementKind::RootDescriptor) + .Case("DescriptorTable", RootSignatureElementKind::DescriptorTable) + .Case("StaticSampler", RootSignatureElementKind::StaticSampler) + .Default(RootSignatureElementKind::None); + + switch (ElementKind) { + + case RootSignatureElementKind::RootFlags: { +return parseRootFlags(MRS, Element); +break; + } + + case RootSignatureElementKind::RootConstants: + case RootSignatureElementKind::RootDescriptor: + case RootSignatureElementKind::DescriptorTable: + case RootSignatureElementKind::StaticSampler: + case RootSignatureElementKind::None: +return reportError("Invalid Root Element: " + ElementText->getString()); +break; + } + + return true; +} + +bool ModuleRootSignature::parse(NamedMDNode *Root) { + bool HasError = false; + + /** Root Signature are specified as following in the metadata: + + !dx.rootsignatures = !{!2} ; list of function/root signature pairs + !2 = !{ ptr @main, !3 } ; function, root signature + !3 = !{ !4, !5, !6, !7 } ; list of root signature elements + + So for each MDNode inside dx.rootsignatures NamedMDNode + (the Root parameter of this function), the parsing process needs + to loop through each of it's operand and process the pairs function + signature pair. + */ + + for (unsigned int Sid = 0; Sid < Root->getNumOperands(); Sid++) { +MDNode *Node = dyn_cast(Root->getOperand(Sid)); + +if (Node == nullptr || Node->getNumOperands() != 2) + return reportError("Invalid format for Root Signature Definition. Pairs " + "of function, root signature expected."); + +// Get the Root Signature Description from the function signature pair. +MDNode *RS = dyn_cast(Node->getOperand(1).get()); + +if (RS == nullptr) + return reportError("Missing Root Signature Metadata node."); + +// Loop through the Root Elements of the root signature. 
+for (unsigned int Eid = 0; Eid < RS->getNumOperands(); Eid++) { + + MDNode *Element = dyn_cast(RS->getOperand(Eid)); + if (Element == nullptr) +return reportError("Missing Root Element Metadata Node."); + + HasError = HasError || parseRootSignatureElement(this, Element); +} + } + return HasError; +} + +ModuleRootSignature ModuleRootSignature::analyzeModule(Module &M) { + ModuleRootSignature MRS; + + NamedMDNode *RootSignatureNode = M.getNamedMetadata("dx.rootsignatures"); + if (RootSignatureNode) { +if (MRS.parse(RootSignatureNode)) + llvm_unreachable("Invalid Root Signature Metadata."); joaosaffran wrote: We can do that, but that would ignore the return value from
[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/125624 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/125624 Backport fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e Requested by: @brad0 >From 1b6946b60080e057d5848cea36ce801ddf2a43f6 Mon Sep 17 00:00:00 2001 From: Brad Smith Date: Mon, 3 Feb 2025 22:43:43 -0500 Subject: [PATCH] [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (cherry picked from commit c24774dc4f4402c3ad150363321cc972ed2669e7) (cherry picked from commit fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e) --- third-party/benchmark/src/sysinfo.cc | 53 ++-- 1 file changed, 3 insertions(+), 50 deletions(-) diff --git a/third-party/benchmark/src/sysinfo.cc b/third-party/benchmark/src/sysinfo.cc index 2bed1663af2e95..8283a081ee80b4 100644 --- a/third-party/benchmark/src/sysinfo.cc +++ b/third-party/benchmark/src/sysinfo.cc @@ -495,14 +495,14 @@ int GetNumCPUsImpl() { return sysinfo.dwNumberOfProcessors; // number of logical // processors in the current // group -#elif defined(BENCHMARK_OS_SOLARIS) +#elif defined(__linux__) || defined(BENCHMARK_OS_SOLARIS) // Returns -1 in case of a failure. 
- long num_cpu = sysconf(_SC_NPROCESSORS_ONLN); + int num_cpu = static_cast(sysconf(_SC_NPROCESSORS_ONLN)); if (num_cpu < 0) { PrintErrorAndDie("sysconf(_SC_NPROCESSORS_ONLN) failed with error: ", strerror(errno)); } - return (int)num_cpu; + return num_cpu; #elif defined(BENCHMARK_OS_QNX) return static_cast(_syspage_ptr->num_cpu); #elif defined(BENCHMARK_OS_QURT) @@ -511,53 +511,6 @@ int GetNumCPUsImpl() { hardware_threads.max_hthreads = 1; } return hardware_threads.max_hthreads; -#else - int num_cpus = 0; - int max_id = -1; - std::ifstream f("/proc/cpuinfo"); - if (!f.is_open()) { -PrintErrorAndDie("Failed to open /proc/cpuinfo"); - } -#if defined(__alpha__) - const std::string Key = "cpus detected"; -#else - const std::string Key = "processor"; -#endif - std::string ln; - while (std::getline(f, ln)) { -if (ln.empty()) continue; -std::size_t split_idx = ln.find(':'); -std::string value; -#if defined(__s390__) -// s390 has another format in /proc/cpuinfo -// it needs to be parsed differently -if (split_idx != std::string::npos) - value = ln.substr(Key.size() + 1, split_idx - Key.size() - 1); -#else -if (split_idx != std::string::npos) value = ln.substr(split_idx + 1); -#endif -if (ln.size() >= Key.size() && ln.compare(0, Key.size(), Key) == 0) { - num_cpus++; - if (!value.empty()) { -const int cur_id = benchmark::stoi(value); -max_id = std::max(cur_id, max_id); - } -} - } - if (f.bad()) { -PrintErrorAndDie("Failure reading /proc/cpuinfo"); - } - if (!f.eof()) { -PrintErrorAndDie("Failed to read to end of /proc/cpuinfo"); - } - f.close(); - - if ((max_id + 1) != num_cpus) { -fprintf(stderr, -"CPU ID assignments in /proc/cpuinfo seem messed up." -" This is usually caused by a bad BIOS.\n"); - } - return num_cpus; #endif BENCHMARK_UNREACHABLE(); } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [benchmark] Get number of CPUs with sysconf() on Linux (#125603) (PR #125624)
llvmbot wrote: @llvm/pr-subscribers-third-party-benchmark Author: None (llvmbot) Changes Backport fbe470c1b215e3f953a41db6b91d20ce0bcf5c4e Requested by: @brad0 --- Full diff: https://github.com/llvm/llvm-project/pull/125624.diff 1 Files Affected: - (modified) third-party/benchmark/src/sysinfo.cc (+3-50) ``diff diff --git a/third-party/benchmark/src/sysinfo.cc b/third-party/benchmark/src/sysinfo.cc index 2bed1663af2e955..8283a081ee80b4a 100644 --- a/third-party/benchmark/src/sysinfo.cc +++ b/third-party/benchmark/src/sysinfo.cc @@ -495,14 +495,14 @@ int GetNumCPUsImpl() { return sysinfo.dwNumberOfProcessors; // number of logical // processors in the current // group -#elif defined(BENCHMARK_OS_SOLARIS) +#elif defined(__linux__) || defined(BENCHMARK_OS_SOLARIS) // Returns -1 in case of a failure. - long num_cpu = sysconf(_SC_NPROCESSORS_ONLN); + int num_cpu = static_cast(sysconf(_SC_NPROCESSORS_ONLN)); if (num_cpu < 0) { PrintErrorAndDie("sysconf(_SC_NPROCESSORS_ONLN) failed with error: ", strerror(errno)); } - return (int)num_cpu; + return num_cpu; #elif defined(BENCHMARK_OS_QNX) return static_cast(_syspage_ptr->num_cpu); #elif defined(BENCHMARK_OS_QURT) @@ -511,53 +511,6 @@ int GetNumCPUsImpl() { hardware_threads.max_hthreads = 1; } return hardware_threads.max_hthreads; -#else - int num_cpus = 0; - int max_id = -1; - std::ifstream f("/proc/cpuinfo"); - if (!f.is_open()) { -PrintErrorAndDie("Failed to open /proc/cpuinfo"); - } -#if defined(__alpha__) - const std::string Key = "cpus detected"; -#else - const std::string Key = "processor"; -#endif - std::string ln; - while (std::getline(f, ln)) { -if (ln.empty()) continue; -std::size_t split_idx = ln.find(':'); -std::string value; -#if defined(__s390__) -// s390 has another format in /proc/cpuinfo -// it needs to be parsed differently -if (split_idx != std::string::npos) - value = ln.substr(Key.size() + 1, split_idx - Key.size() - 1); -#else -if (split_idx != std::string::npos) value = ln.substr(split_idx + 1); 
-#endif -if (ln.size() >= Key.size() && ln.compare(0, Key.size(), Key) == 0) { - num_cpus++; - if (!value.empty()) { -const int cur_id = benchmark::stoi(value); -max_id = std::max(cur_id, max_id); - } -} - } - if (f.bad()) { -PrintErrorAndDie("Failure reading /proc/cpuinfo"); - } - if (!f.eof()) { -PrintErrorAndDie("Failed to read to end of /proc/cpuinfo"); - } - f.close(); - - if ((max_id + 1) != num_cpus) { -fprintf(stderr, -"CPU ID assignments in /proc/cpuinfo seem messed up." -" This is usually caused by a bad BIOS.\n"); - } - return num_cpus; #endif BENCHMARK_UNREACHABLE(); } `` https://github.com/llvm/llvm-project/pull/125624 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
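For readers skimming the backport above, here is a minimal standalone sketch of the `sysconf(_SC_NPROCESSORS_ONLN)` query that the patch now uses on Linux as well as Solaris. The function name `OnlineCpuCount` is hypothetical and the error handling is simplified; this is not the benchmark library's `GetNumCPUsImpl()`, which calls `PrintErrorAndDie` on failure.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Hypothetical helper for illustration only (not part of the benchmark library).
// Returns the number of processors currently online, or -1 if the query fails.
int OnlineCpuCount() {
  errno = 0;
  // sysconf() returns a long; -1 signals failure.
  long n = sysconf(_SC_NPROCESSORS_ONLN);
  if (n < 0) {
    std::fprintf(stderr, "sysconf(_SC_NPROCESSORS_ONLN) failed with error: %s\n",
                 std::strerror(errno));
    return -1;
  }
  // Narrow only after the failure check, mirroring the cast in the patch.
  return static_cast<int>(n);
}
```

Unlike the removed `/proc/cpuinfo` parser, this path needs no per-architecture key handling (the `__alpha__` and `__s390__` branches in the deleted code).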
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
joaosaffran wrote: I am not really sure we can have multiple root signatures in the backend. It is possible in HLSL because we can specify the entry function, so you can have multiple entries in a single file. However, when lowering into DXContainer, the binary format only allows a single signature to be present. I've reached out to other members of the team to discuss whether this is actually the case or if I am missing something. https://github.com/llvm/llvm-project/pull/123147 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
JDevlieghere wrote: If I type `/cherrypick 13ded6829bf7ca793795c50d47dd2b95482e5cfa` will it add that commit to this PR? https://github.com/llvm/llvm-project/pull/12 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
@@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm API documentation." OF option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF) option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON) option (LLVM_ENABLE_BINDINGS "Build bindings." ON) +option (LLVM_BUILD_TELEMETRY "Build the telemtry library. This does not enable telemetry." ON) JDevlieghere wrote: @nikic spotted it too. Fixed in 13ded6829bf7ca793795c50d47dd2b95482e5cfa. https://github.com/llvm/llvm-project/pull/12 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: [Clang][ReleaseNotes] Document -fclang-abi-compat=19 re: #110503 (PR #125368)
rjmccall wrote: It's approved anyway. Thanks, Tom. https://github.com/llvm/llvm-project/pull/125368 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)
evodius96 wrote: This is pertinent to #109930 from Validas as well, and they would like something like this to be included by default to make it less confusing for users of MC/DC. On the other hand, as I point out in the issue, I think introducing MC/DC for single-conditions is redundant (given branch coverage) and introduces unnecessary overhead. So, that's an argument for keeping it as a separate option and perhaps finding another way to work branch coverage into the overall MC/DC metric. Also, Branch Coverage gives you counts for switch statement cases, whereas I'm not sure that makes sense for single-condition MC/DC, though perhaps that's overkill. https://github.com/llvm/llvm-project/pull/125484 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/125590 Backport 7c5100d36d8027dd205d6ec410a63c3930a1d9c1 Requested by: @topperc >From bfc522cfd54c79a8ed833dfbb19285df05c3c4e8 Mon Sep 17 00:00:00 2001 From: Craig Topper Date: Mon, 3 Feb 2025 13:48:42 -0800 Subject: [PATCH] [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) I have been unsuccessful at further reducing the test. The failure requires a shuffle with 2 scalable->fixed extracts with the same source. 0 is the only valid index for a scalable->fixed extract so the 2 sources must be the same extract. Shuffles with the same source are aggressively canonicalized to a unary shuffle. So it requires the extracts to become identical through other optimizations without the shuffle being canonicalized before it is lowered. Fixes #125306. (cherry picked from commit 7c5100d36d8027dd205d6ec410a63c3930a1d9c1) --- llvm/lib/Target/RISCV/RISCVISelLowering.cpp | 3 +- llvm/test/CodeGen/RISCV/rvv/pr125306.ll | 118 2 files changed, 120 insertions(+), 1 deletion(-) create mode 100644 llvm/test/CodeGen/RISCV/rvv/pr125306.ll diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index 8d09e534b1858bc..4ff333b1ff2f7a6 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -4512,7 +4512,8 @@ static SDValue getSingleShuffleSrc(MVT VT, MVT ContainerVT, SDValue V1, // Src needs to have twice the number of elements. unsigned NumElts = VT.getVectorNumElements(); - if (Src.getValueType().getVectorNumElements() != (NumElts * 2)) + if (!Src.getValueType().isFixedLengthVector() || + Src.getValueType().getVectorNumElements() != (NumElts * 2)) return SDValue(); // The extracts must extract the two halves of the source. 
diff --git a/llvm/test/CodeGen/RISCV/rvv/pr125306.ll b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll new file mode 100644 index 000..111f87de220dbfa --- /dev/null +++ b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll @@ -0,0 +1,118 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc < %s -mtriple=riscv64 -mattr=+m,+v | FileCheck %s + +; Test for an "Invalid size request on a scalable vector". Attempts to reduce +; the test faurther were not successful. The failure requires a shuffle with 2 +; scalable->fixed extracts from the same vector. 0 is the only valid index for a +; scalable->fixed extract so the 2 extract must be the same. Shuffles with the +; same source are aggressively canonicalized to a unary shuffle so it requires +; the extracts to become identical through other optimizations without the +; shuffle being canonicalized before it is lowered. + +define <2 x i32> @main(ptr %0) { +; CHECK-LABEL: main: +; CHECK: # %bb.0: # %entry +; CHECK-NEXT:vsetivli zero, 16, e32, m4, ta, ma +; CHECK-NEXT:vmv.v.i v8, 0 +; CHECK-NEXT:vse32.v v8, (zero) +; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma +; CHECK-NEXT:vmv.v.i v8, 0 +; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma +; CHECK-NEXT:vmv.v.i v10, 0 +; CHECK-NEXT:li a2, 64 +; CHECK-NEXT:sw zero, 80(zero) +; CHECK-NEXT:lui a1, 7 +; CHECK-NEXT:lui a3, 1 +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vid.v v11 +; CHECK-NEXT:li a4, 16 +; CHECK-NEXT:lui a5, 2 +; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vmv.v.i v10, 0 +; CHECK-NEXT:li a2, 24 +; CHECK-NEXT:sh zero, -392(a3) +; CHECK-NEXT:sh zero, 534(a3) +; CHECK-NEXT:sh zero, 1460(a3) +; CHECK-NEXT:li a3, 32 +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:li a2, 40 +; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma +; CHECK-NEXT:vse32.v v8, (a0) +; CHECK-NEXT:sh zero, -1710(a5) +; CHECK-NEXT:sh zero, -784(a5) +; 
CHECK-NEXT:sh zero, 142(a5) +; CHECK-NEXT:lw a5, -304(a1) +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vadd.vi v9, v11, -1 +; CHECK-NEXT:vse32.v v10, (a3) +; CHECK-NEXT:sh zero, 0(a0) +; CHECK-NEXT:lw a0, -188(a1) +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:lw a2, -188(a1) +; CHECK-NEXT:lw a3, 1244(a1) +; CHECK-NEXT:vmv.v.x v8, a0 +; CHECK-NEXT:lw a0, 1244(a1) +; CHECK-NEXT:lw a1, -304(a1) +; CHECK-NEXT:vmv.v.x v10, a3 +; CHECK-NEXT:vmv.v.x v11, a5 +; CHECK-NEXT:vslide1down.vx v8, v8, zero +; CHECK-NEXT:vslide1down.vx v10, v10, zero +; CHECK-NEXT:vmin.vv v8, v10, v8 +; CHECK-NEXT:vmv.v.x v10, a0 +; CHECK-NEXT:vslide1down.vx v11, v11, zero +; CHECK-NEXT:vmin.vx v10, v10, a2 +; CHECK-NEXT:vmin.vx v10, v10, a1 +; CHECK-NEXT:vmin.vv v11, v8, v11 +; CHECK-NEXT:vmv1r.v v8, v10 +; CHECK-NEXT:vand.vv v9, v11, v9 +; CH
[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/125590 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
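The RISCV fix in this thread relies on short-circuit evaluation: the `isFixedLengthVector()` check runs first, so `getVectorNumElements()` is never called on a scalable type. A toy sketch of that pattern follows, assuming a hypothetical `ToyVecType` that only mimics the relevant behavior; it is not `llvm::MVT` or `llvm::EVT`, and `hasTwiceTheElements` is an illustrative name.

```cpp
#include <cassert>

// Hypothetical toy type for illustration; not an LLVM class.
struct ToyVecType {
  bool Scalable;
  unsigned MinElts;

  bool isFixedLengthVector() const { return !Scalable; }

  // Mimics the failure mode from the bug report: asking a scalable
  // vector for an exact element count is treated as a programming error.
  unsigned getVectorNumElements() const {
    assert(!Scalable && "Invalid size request on a scalable vector");
    return MinElts;
  }
};

// Returns true only when Src is a fixed-length vector with exactly
// twice NumElts elements.
bool hasTwiceTheElements(const ToyVecType &Src, unsigned NumElts) {
  // The left operand is evaluated first. If Src is scalable, the right
  // operand (and its assertion) is never evaluated.
  return Src.isFixedLengthVector() &&
         Src.getVectorNumElements() == NumElts * 2;
}
```

The patch writes the same guard in negated form (`!isFixedLengthVector() || count != NumElts * 2`) and returns an empty `SDValue()` when it holds.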
[llvm-branch-commits] [llvm] release/20.x: [RISCV] Check isFixedLengthVector before calling getVectorNumElements in getSingleShuffleSrc. (#125455) (PR #125590)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v Author: None (llvmbot) Changes Backport 7c5100d36d8027dd205d6ec410a63c3930a1d9c1 Requested by: @topperc --- Full diff: https://github.com/llvm/llvm-project/pull/125590.diff 2 Files Affected: - (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+2-1) - (added) llvm/test/CodeGen/RISCV/rvv/pr125306.ll (+118) ``diff diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index 8d09e534b1858b..4ff333b1ff2f7a 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -4512,7 +4512,8 @@ static SDValue getSingleShuffleSrc(MVT VT, MVT ContainerVT, SDValue V1, // Src needs to have twice the number of elements. unsigned NumElts = VT.getVectorNumElements(); - if (Src.getValueType().getVectorNumElements() != (NumElts * 2)) + if (!Src.getValueType().isFixedLengthVector() || + Src.getValueType().getVectorNumElements() != (NumElts * 2)) return SDValue(); // The extracts must extract the two halves of the source. diff --git a/llvm/test/CodeGen/RISCV/rvv/pr125306.ll b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll new file mode 100644 index 00..111f87de220dbf --- /dev/null +++ b/llvm/test/CodeGen/RISCV/rvv/pr125306.ll @@ -0,0 +1,118 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc < %s -mtriple=riscv64 -mattr=+m,+v | FileCheck %s + +; Test for an "Invalid size request on a scalable vector". Attempts to reduce +; the test faurther were not successful. The failure requires a shuffle with 2 +; scalable->fixed extracts from the same vector. 0 is the only valid index for a +; scalable->fixed extract so the 2 extract must be the same. Shuffles with the +; same source are aggressively canonicalized to a unary shuffle so it requires +; the extracts to become identical through other optimizations without the +; shuffle being canonicalized before it is lowered. 
+ +define <2 x i32> @main(ptr %0) { +; CHECK-LABEL: main: +; CHECK: # %bb.0: # %entry +; CHECK-NEXT:vsetivli zero, 16, e32, m4, ta, ma +; CHECK-NEXT:vmv.v.i v8, 0 +; CHECK-NEXT:vse32.v v8, (zero) +; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma +; CHECK-NEXT:vmv.v.i v8, 0 +; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma +; CHECK-NEXT:vmv.v.i v10, 0 +; CHECK-NEXT:li a2, 64 +; CHECK-NEXT:sw zero, 80(zero) +; CHECK-NEXT:lui a1, 7 +; CHECK-NEXT:lui a3, 1 +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vid.v v11 +; CHECK-NEXT:li a4, 16 +; CHECK-NEXT:lui a5, 2 +; CHECK-NEXT:vsetivli zero, 4, e32, m1, ta, ma +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vmv.v.i v10, 0 +; CHECK-NEXT:li a2, 24 +; CHECK-NEXT:sh zero, -392(a3) +; CHECK-NEXT:sh zero, 534(a3) +; CHECK-NEXT:sh zero, 1460(a3) +; CHECK-NEXT:li a3, 32 +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:li a2, 40 +; CHECK-NEXT:vsetivli zero, 8, e32, m2, ta, ma +; CHECK-NEXT:vse32.v v8, (a0) +; CHECK-NEXT:sh zero, -1710(a5) +; CHECK-NEXT:sh zero, -784(a5) +; CHECK-NEXT:sh zero, 142(a5) +; CHECK-NEXT:lw a5, -304(a1) +; CHECK-NEXT:vsetivli zero, 2, e32, mf2, ta, ma +; CHECK-NEXT:vadd.vi v9, v11, -1 +; CHECK-NEXT:vse32.v v10, (a3) +; CHECK-NEXT:sh zero, 0(a0) +; CHECK-NEXT:lw a0, -188(a1) +; CHECK-NEXT:vse32.v v10, (a2) +; CHECK-NEXT:lw a2, -188(a1) +; CHECK-NEXT:lw a3, 1244(a1) +; CHECK-NEXT:vmv.v.x v8, a0 +; CHECK-NEXT:lw a0, 1244(a1) +; CHECK-NEXT:lw a1, -304(a1) +; CHECK-NEXT:vmv.v.x v10, a3 +; CHECK-NEXT:vmv.v.x v11, a5 +; CHECK-NEXT:vslide1down.vx v8, v8, zero +; CHECK-NEXT:vslide1down.vx v10, v10, zero +; CHECK-NEXT:vmin.vv v8, v10, v8 +; CHECK-NEXT:vmv.v.x v10, a0 +; CHECK-NEXT:vslide1down.vx v11, v11, zero +; CHECK-NEXT:vmin.vx v10, v10, a2 +; CHECK-NEXT:vmin.vx v10, v10, a1 +; CHECK-NEXT:vmin.vv v11, v8, v11 +; CHECK-NEXT:vmv1r.v v8, v10 +; CHECK-NEXT:vand.vv v9, v11, v9 +; CHECK-NEXT:vslideup.vi v8, v10, 1 +; CHECK-NEXT:vse32.v v9, (a4) +; CHECK-NEXT:sh 
zero, 0(zero) +; CHECK-NEXT:ret +entry: + store <16 x i32> zeroinitializer, ptr null, align 4 + store <8 x i32> zeroinitializer, ptr %0, align 4 + store <4 x i32> zeroinitializer, ptr getelementptr inbounds nuw (i8, ptr null, i64 64), align 4 + store i32 0, ptr getelementptr inbounds nuw (i8, ptr null, i64 80), align 4 + %1 = load i32, ptr getelementptr inbounds nuw (i8, ptr null, i64 29916), align 4 + %broadcast.splatinsert53 = insertelement <4 x i32> zeroinitializer, i32 %1, i64 0 + %2 = load i32, ptr getelementptr inbounds nuw (i8, ptr null, i64 28484), align 4 + %broadcast.splatinsert55 = insertelement <4 x i32> zeroinitializer, i32 %2, i64 0 + %3 = call <4 x i32> @llvm.smin.v4i32(<4 x
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
@@ -2612,7 +2612,54 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) { - TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct"); + mlir::Location loc = converter.genLocation(declareMapperConstruct.source); + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + lower::StatementContext stmtCtx; + const auto &spec = + std::get(declareMapperConstruct.t); + const auto &mapperName{std::get>(spec.t)}; + const auto &varType{std::get(spec.t)}; + const auto &varName{std::get(spec.t)}; + assert(varType.declTypeSpec->category() == + semantics::DeclTypeSpec::Category::TypeDerived && + "Expected derived type"); + + std::string mapperNameStr; + if (mapperName.has_value()) { +mapperNameStr = mapperName->ToString(); +mapperNameStr = +converter.mangleName(mapperNameStr, mapperName->symbol->owner()); + } else { +mapperNameStr = +varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; +mapperNameStr = converter.mangleName( +mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); + } + + // Save insert point just after the DeclMapperOp. + mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint(); + + firOpBuilder.setInsertionPointToStart(converter.getModuleOp().getBody()); + auto mlirType = converter.genType(varType.declTypeSpec->derivedTypeSpec()); + auto declMapperOp = firOpBuilder.create( + loc, mapperNameStr, mlirType); + converter.getMLIRSymbolTable()->insert(declMapperOp); + auto ®ion = declMapperOp.getRegion(); + firOpBuilder.createBlock(®ion); + auto varVal = region.addArgument(firOpBuilder.getRefType(mlirType), loc); + converter.bindSymbol(*varName.symbol, varVal); + + // Populate the declareMapper region with the map information. 
+ mlir::omp::DeclareMapperInfoOperands clauseOps; + const auto *clauseList{ + parser::Unwrap(declareMapperConstruct.t)}; + List clauses = makeClauses(*clauseList, semaCtx); + ClauseProcessor cp(converter, semaCtx, clauses); + cp.processMap(loc, stmtCtx, clauseOps); + firOpBuilder.create(loc, clauseOps.mapVars); + + // Restore the insert point to just after the DeclareMapperOp. + firOpBuilder.restoreInsertionPoint(insPt); tblah wrote: With my change to use the insertion point guard above ```suggestion ``` https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
@@ -2612,7 +2612,54 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) { - TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct"); + mlir::Location loc = converter.genLocation(declareMapperConstruct.source); + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + lower::StatementContext stmtCtx; + const auto &spec = + std::get(declareMapperConstruct.t); + const auto &mapperName{std::get>(spec.t)}; + const auto &varType{std::get(spec.t)}; + const auto &varName{std::get(spec.t)}; + assert(varType.declTypeSpec->category() == + semantics::DeclTypeSpec::Category::TypeDerived && + "Expected derived type"); + + std::string mapperNameStr; + if (mapperName.has_value()) { +mapperNameStr = mapperName->ToString(); +mapperNameStr = +converter.mangleName(mapperNameStr, mapperName->symbol->owner()); + } else { +mapperNameStr = +varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; +mapperNameStr = converter.mangleName( +mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); + } + + // Save insert point just after the DeclMapperOp. + mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint(); tblah wrote: ```suggestion // Save current insertion point before moving to the module scope to create the DeclareMapperOp mlir::OpBuilder::InsertionGuard guard(builder); ``` I don't think it makes sense to say the insert point is before or after the DeclMapperOp because the DeclMapperOp will be at module scope. I've also changed to use the insertion point guard because it is less error prone if a conditional early return is added later. https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
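To illustrate why a guard is "less error prone if a conditional early return is added later", here is a small self-contained sketch of the RAII idea. `ToyBuilder`, `ToyInsertionGuard`, and `emitAtModuleScope` are hypothetical stand-ins; the real code would use `mlir::OpBuilder::InsertionGuard` as in the suggestion above.

```cpp
// Hypothetical stand-in for a builder that tracks where new ops are inserted.
struct ToyBuilder {
  int insertionPoint = 0;
};

// RAII guard: remembers the insertion point on construction and
// restores it on destruction, whichever way the scope is left.
class ToyInsertionGuard {
public:
  explicit ToyInsertionGuard(ToyBuilder &b)
      : builder(b), saved(b.insertionPoint) {}
  ~ToyInsertionGuard() { builder.insertionPoint = saved; }

  ToyInsertionGuard(const ToyInsertionGuard &) = delete;
  ToyInsertionGuard &operator=(const ToyInsertionGuard &) = delete;

private:
  ToyBuilder &builder;
  int saved;
};

// Moves to "module scope" (position 0 in this toy) to emit something,
// optionally bailing out early. Returns false on the early exit.
bool emitAtModuleScope(ToyBuilder &b, bool bailOut) {
  ToyInsertionGuard guard(b);
  b.insertionPoint = 0;
  if (bailOut)
    return false; // the guard still restores the insertion point here
  // ... create operations at module scope ...
  return true;
}
```

With a manual save/restore pair, the early `return false` would skip the restore and leave the builder pointing at module scope.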
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
https://github.com/tblah commented: I have some minor suggestions on the code. Please wait for review from somebody with more familiarity with omp target things, and this is conditional on the design of the MLIR operation being approved. https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] release/20.x: [ELF] Refine isExported/isPreemptible condition (PR #125334)
nikic wrote: Reverted in https://github.com/llvm/llvm-project/commit/b84f7d17f84030092880857544e13d26a2507c62, as this has been failing all pre-merge tests for the last two or three days already. https://github.com/llvm/llvm-project/pull/125334 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] NFC: rename MatchedPackOnParmToNonPackOnArg to StrictPackMatch (PR #125418)
https://github.com/cor3ntin edited https://github.com/llvm/llvm-project/pull/125418 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
TIFitis wrote: Polite request for review 🙂 https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)
https://github.com/Meinersbur edited https://github.com/llvm/llvm-project/pull/125442 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
@@ -2612,7 +2612,54 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) { - TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct"); + mlir::Location loc = converter.genLocation(declareMapperConstruct.source); + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + lower::StatementContext stmtCtx; + const auto &spec = + std::get(declareMapperConstruct.t); + const auto &mapperName{std::get>(spec.t)}; + const auto &varType{std::get(spec.t)}; + const auto &varName{std::get(spec.t)}; + assert(varType.declTypeSpec->category() == + semantics::DeclTypeSpec::Category::TypeDerived && + "Expected derived type"); + + std::string mapperNameStr; + if (mapperName.has_value()) { +mapperNameStr = mapperName->ToString(); +mapperNameStr = +converter.mangleName(mapperNameStr, mapperName->symbol->owner()); + } else { +mapperNameStr = +varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; +mapperNameStr = converter.mangleName( +mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); + } + + // Save insert point just after the DeclMapperOp. + mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint(); TIFitis wrote: Done :) https://github.com/llvm/llvm-project/pull/117046 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)
@@ -80,6 +85,99 @@ class RootSignatureLexer { } }; +class RootSignatureParser { +public: + RootSignatureParser(SmallVector &Elements, + const SmallVector &Tokens, + DiagnosticsEngine &Diags); + + // Iterates over the provided tokens and constructs the in-memory + // representations of the RootElements. + // + // The return value denotes if there was a failure and the method will + // return on the first encountered failure, or, return false if it + // can sucessfully reach the end of the tokens. bogner wrote: If you use `///` instead of `//` for the doc comments on the methods of this class they'll show up in the [LLVM doxygen](https://llvm.org/doxygen/). https://github.com/llvm/llvm-project/pull/122982 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
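A short sketch of what the `///` form could look like. `ToyParser` is a hypothetical class with a stub body, not the `RootSignatureParser` from the PR; only the comment style is the point.

```cpp
// Hypothetical class for illustration; the body is a stub.
class ToyParser {
public:
  /// Iterates over the provided tokens and constructs the in-memory
  /// representations of the root elements.
  ///
  /// \returns true on the first encountered failure, or false if the
  /// end of the tokens is reached successfully.
  bool Parse() { return false; }

private:
  // A plain comment like this one is ignored by Doxygen.
  int Unused = 0;
};
```

Doxygen treats `///` blocks as documentation for the declaration that follows, while `//` comments stay internal to the source.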
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)
@@ -15,16 +15,21 @@ #include "clang/AST/APValue.h" #include "clang/Basic/DiagnosticLex.h" +#include "clang/Basic/DiagnosticParse.h" #include "clang/Lex/LiteralSupport.h" #include "clang/Lex/Preprocessor.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" +#include "llvm/Frontend/HLSL/HLSLRootSignature.h" + namespace clang { namespace hlsl { +namespace rs = llvm::hlsl::root_signature; bogner wrote: I'm not convinced the brevity this affords later is really worth it, and since this is in a public header it introduces `llvm::hlsl::rs` into far more scopes than just this file, which IMO is just a recipe for confusion. https://github.com/llvm/llvm-project/pull/122982 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
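One possible alternative to a namespace-scope alias in a public header is to confine the alias to the function bodies that need it. The sketch below uses hypothetical namespaces (`toy_lib`, `toy_client`) and a hypothetical `DescriptorTable` struct; it is an assumption about how the concern could be addressed, not what the PR ended up doing.

```cpp
namespace toy_lib {
namespace root_signature {
// Hypothetical element type for illustration.
struct DescriptorTable {
  int NumClauses = 0;
};
} // namespace root_signature
} // namespace toy_lib

namespace toy_client {
// The alias lives inside the function body, so including this header
// does not add an `rs` name to toy_client for every other file.
inline int countClauses(const toy_lib::root_signature::DescriptorTable &T) {
  namespace rs = toy_lib::root_signature;
  rs::DescriptorTable Copy = T;
  return Copy.NumClauses;
}
} // namespace toy_client
```

The trade-off is some repetition of the alias (or of the full name in signatures) in exchange for not exporting a short name to every includer.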
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)
@@ -80,6 +85,99 @@ class RootSignatureLexer { } }; +class RootSignatureParser { +public: + RootSignatureParser(SmallVector &Elements, + const SmallVector &Tokens, + DiagnosticsEngine &Diags); + + // Iterates over the provided tokens and constructs the in-memory + // representations of the RootElements. + // + // The return value denotes if there was a failure and the method will + // return on the first encountered failure, or, return false if it + // can sucessfully reach the end of the tokens. + bool Parse(); + +private: + // Root Element helpers + bool ParseRootElement(bool First); + bool ParseDescriptorTable(); + bool ParseDescriptorTableClause(); + + // Helper dispatch method + // + // These will switch on the Variant kind to dispatch to the respective Parse + // method and store the parsed value back into Ref. + // + // It is helpful to have a generalized dispatch method so that when we need + // to parse multiple optional parameters in any order, we can invoke this + // method + bool ParseParam(rs::ParamType Ref); + + // Parse as many optional parameters as possible in any order + bool + ParseOptionalParams(llvm::SmallDenseMap RefMap); bogner wrote: Do you really want to be passing a `SmallDenseMap` by value here? This and the other similar APIs should probably be using references. https://github.com/llvm/llvm-project/pull/122982 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
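The cost being pointed out can be seen with a small counting wrapper. This is a sketch under assumptions: `CountingMap` wraps a `std::map` rather than `llvm::SmallDenseMap`, and `copyCount`, `sizeByValue`, and `sizeByConstRef` are hypothetical names.

```cpp
#include <cstddef>
#include <map>

// Global counter so the sketch can observe how often the map is copied.
int copyCount = 0;

// Hypothetical wrapper that records every copy construction.
struct CountingMap {
  std::map<int, int> data;
  CountingMap() = default;
  CountingMap(const CountingMap &other) : data(other.data) { ++copyCount; }
};

// By value: every call copies the whole map before the body runs.
std::size_t sizeByValue(CountingMap m) { return m.data.size(); }

// By const reference: the caller's map is used directly, no copy.
std::size_t sizeByConstRef(const CountingMap &m) { return m.data.size(); }
```

For a parser helper that is called once per optional parameter, the by-value form repeats that copy on every call, which is why a reference parameter is the usual choice.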
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)
@@ -148,6 +148,347 @@ bool RootSignatureLexer::LexToken(RootSignatureToken &Result) {
   return false;
 }
+// Parser Definitions
+
+RootSignatureParser::RootSignatureParser(
+    SmallVector &Elements,
+    const SmallVector &Tokens)
+    : Elements(Elements) {
+  CurTok = Tokens.begin();
+  LastTok = Tokens.end();
+}
+
+bool RootSignatureParser::ReportError() { return true; }
+
+bool RootSignatureParser::Parse() {
+  // Handle edge-case of empty RootSignature()
+  if (CurTok == LastTok)
+    return false;
+
+  // Iterate as many RootElements as possible
+  bool HasComma = true;
+  while (HasComma &&
+         IsCurExpectedToken(ArrayRef{TokenKind::kw_DescriptorTable})) {
+    if (ParseRootElement())
+      return true;
+    HasComma = !TryConsumeExpectedToken(TokenKind::pu_comma);
+    if (HasComma)
+      ConsumeNextToken();
+  }
+
+  if (HasComma)
+    return ReportError(); // report 'comma' denotes a required extra item
+
+  // Ensure that we are at the end of the tokens
+  CurTok++;
+  if (CurTok != LastTok)
+    return ReportError(); // report expected end of input but got more
+  return false;
+}
+
+bool RootSignatureParser::ParseRootElement() {
+  // Dispatch onto the correct parse method
+  switch (CurTok->Kind) {
+  case TokenKind::kw_DescriptorTable:
+    return ParseDescriptorTable();
+  default:
+    llvm_unreachable("Switch for an expected token was not provided");
+    return true;

bogner wrote:

`llvm_unreachable` doesn't return (it is annotated as a "noreturn" function), so the `return true` is unreachable code here. This won't generally lead to the parser erroring out in practice, because the program will crash or hit UB due to the `llvm_unreachable`. Given that we expect to have actual error handling for the token kind being appropriate just above, this is probably okay, but it does mean that the `return true` is unnecessary.

https://github.com/llvm/llvm-project/pull/122982
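As a rough sketch of the point about the dead `return true;`: once the default case ends in a call to a noreturn function, no further statement is needed for the function to be well-formed. The code below does not use LLVM; `unreachableStandIn` is a hypothetical replacement for `llvm_unreachable` that simply aborts, and the parser pieces are reduced to stubs.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for llvm_unreachable: marked [[noreturn]], it aborts
// instead of returning to its caller.
[[noreturn]] void unreachableStandIn(const char *) { std::abort(); }

enum class TokenKind { kw_DescriptorTable, pu_comma };

// Stub for ParseDescriptorTable(); false means "no error", as in the patch.
bool parseDescriptorTable() { return false; }

// Same shape as ParseRootElement in the patch, minus the dead `return true;`.
bool parseRootElement(TokenKind Kind) {
  switch (Kind) {
  case TokenKind::kw_DescriptorTable:
    return parseDescriptorTable();
  default:
    // Control never leaves this call, so nothing may follow it.
    unreachableStandIn("Switch for an expected token was not provided");
  }
}
```

Calling `parseRootElement(TokenKind::kw_DescriptorTable)` returns false (no error). Passing any other kind aborts the program rather than reporting a parse error, which is why the review asks for real error handling before the switch.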
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/12 Backport bac62ee Requested by: @JDevlieghere >From fa13ea757382a003468e2c30d978eba0a1dfcbf5 Mon Sep 17 00:00:00 2001 From: Jonas Devlieghere Date: Mon, 3 Feb 2025 10:35:14 -0800 Subject: [PATCH] [llvm] Add CMake flag to compile out the telemetry framework (#124850) Add a CMake flag (LLVM_BUILD_TELEMETRY) to disable building the telemetry framework. The flag being enabled does *not* mean that telemetry is being collected, it merely means we're building the generic telemetry framework. Hence the flag is enabled by default. Motivated by this Discourse thread: https://discourse.llvm.org/t/how-to-disable-building-llvm-clang-telemetry/84305 (cherry picked from commit bac62ee5b473e70981a6bd9759ec316315fca07d) --- llvm/CMakeLists.txt | 1 + llvm/lib/CMakeLists.txt | 4 +++- llvm/unittests/CMakeLists.txt | 4 +++- 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt index c9ff3696e22d698..d1b4c2700ce8ef7 100644 --- a/llvm/CMakeLists.txt +++ b/llvm/CMakeLists.txt @@ -829,6 +829,7 @@ option (LLVM_ENABLE_DOXYGEN "Use doxygen to generate llvm API documentation." OF option (LLVM_ENABLE_SPHINX "Use Sphinx to generate llvm documentation." OFF) option (LLVM_ENABLE_OCAMLDOC "Build OCaml bindings documentation." ON) option (LLVM_ENABLE_BINDINGS "Build bindings." ON) +option (LLVM_BUILD_TELEMETRY "Build the telemtry library. This does not enable telemetry." 
ON) set(LLVM_INSTALL_DOXYGEN_HTML_DIR "${CMAKE_INSTALL_DOCDIR}/llvm/doxygen-html" CACHE STRING "Doxygen-generated HTML documentation install directory") diff --git a/llvm/lib/CMakeLists.txt b/llvm/lib/CMakeLists.txt index f6465612d30c0b4..d0a2bc929438179 100644 --- a/llvm/lib/CMakeLists.txt +++ b/llvm/lib/CMakeLists.txt @@ -41,7 +41,9 @@ add_subdirectory(ProfileData) add_subdirectory(Passes) add_subdirectory(TargetParser) add_subdirectory(TextAPI) -add_subdirectory(Telemetry) +if (LLVM_BUILD_TELEMETRY) + add_subdirectory(Telemetry) +endif() add_subdirectory(ToolDrivers) add_subdirectory(XRay) if (LLVM_INCLUDE_TESTS) diff --git a/llvm/unittests/CMakeLists.txt b/llvm/unittests/CMakeLists.txt index 81abce51b8939f0..12e229b1c349840 100644 --- a/llvm/unittests/CMakeLists.txt +++ b/llvm/unittests/CMakeLists.txt @@ -63,7 +63,9 @@ add_subdirectory(Support) add_subdirectory(TableGen) add_subdirectory(Target) add_subdirectory(TargetParser) -add_subdirectory(Telemetry) +if (LLVM_BUILD_TELEMETRY) + add_subdirectory(Telemetry) +endif() add_subdirectory(Testing) add_subdirectory(TextAPI) add_subdirectory(Transforms) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/12
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
llvmbot wrote: @oontvoo What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/12
[llvm-branch-commits] [llvm] release/20.x: [llvm] Add CMake flag to compile out the telemetry framework (#124850) (PR #125555)
https://github.com/oontvoo approved this pull request. https://github.com/llvm/llvm-project/pull/12
[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)
tstellar wrote: It had the 'needs review' status and I missed it. We can get it into -rc2. https://github.com/llvm/llvm-project/pull/125057
[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)
https://github.com/Meinersbur edited https://github.com/llvm/llvm-project/pull/125442
[llvm-branch-commits] [flang] [flang][OpenMP] Parse METADIRECTIVE in specification part (PR #123397)
https://github.com/kiranchandramohan approved this pull request. LG. Declarative directives have to be propagated to module files but for the purpose of generating TODOs, this is not required. https://github.com/llvm/llvm-project/pull/123397
[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)
https://github.com/Meinersbur ready_for_review https://github.com/llvm/llvm-project/pull/125442
[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)
=?utf-8?q?Michał_Górny?= Message-ID: In-Reply-To: https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/125498 Backport 359a9131704277bce0f806de31ac887e68a66902 689ef5fda0ab07dfc452cb16d3646d53e612cb75 Requested by: @mgorny >From 94ba87f5c8faafa63ece54849362bab8a168ae00 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Sun, 2 Feb 2025 16:55:22 +0100 Subject: [PATCH 1/2] [offload] `gnu::format` with variadic template functions is Clang-only (#124406) Use `gnu::format` attribute only when compiling with Clang, as using it against variadic template functions is a Clang extension and is not supported by GCC. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 Fixes #119069 (cherry picked from commit 359a9131704277bce0f806de31ac887e68a66902) --- .../common/include/ErrorReporting.h| 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/offload/plugins-nextgen/common/include/ErrorReporting.h b/offload/plugins-nextgen/common/include/ErrorReporting.h index 8478977a8f86af0..2ad0f2b7dd6c651 100644 --- a/offload/plugins-nextgen/common/include/ErrorReporting.h +++ b/offload/plugins-nextgen/common/include/ErrorReporting.h @@ -80,8 +80,10 @@ class ErrorReporter { /// Print \p Format, instantiated with \p Args to stderr. /// TODO: Allow redirection into a file stream. template - [[gnu::format(__printf__, 1, 2)]] static void print(const char *Format, - ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 1, 2)]] +#endif + static void print(const char *Format, ArgsTy &&...Args) { raw_fd_ostream OS(STDERR_FILENO, false); OS << llvm::format(Format, Args...); } @@ -89,8 +91,10 @@ class ErrorReporter { /// Print \p Format, instantiated with \p Args to stderr, but colored. /// TODO: Allow redirection into a file stream. 
template - [[gnu::format(__printf__, 2, 3)]] static void - print(ColorTy Color, const char *Format, ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 2, 3)]] +#endif + static void print(ColorTy Color, const char *Format, ArgsTy &&...Args) { raw_fd_ostream OS(STDERR_FILENO, false); WithColor(OS, HighlightColor(Color)) << llvm::format(Format, Args...); } @@ -99,8 +103,10 @@ class ErrorReporter { /// a banner. /// TODO: Allow redirection into a file stream. template - [[gnu::format(__printf__, 1, 2)]] static void reportError(const char *Format, -ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 1, 2)]] +#endif + static void reportError(const char *Format, ArgsTy &&...Args) { print(BoldRed, "%s", ErrorBanner); print(BoldRed, Format, Args...); print("\n"); >From 1225c2eaf8c281ad9f83b49645779930c7dc2284 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Sun, 2 Feb 2025 16:55:39 +0100 Subject: [PATCH 2/2] [offload] [test] Use test compiler ID rather than host (#124408) Use the test compiler ID to verify whether tests can be run rather than the host compiler. This makes it possible to run tests (with Clang) while the library itself was built with GCC. (cherry picked from commit 689ef5fda0ab07dfc452cb16d3646d53e612cb75) --- offload/test/CMakeLists.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/offload/test/CMakeLists.txt b/offload/test/CMakeLists.txt index 8a827e0a625eff0..4768d9ccf223bb4 100644 --- a/offload/test/CMakeLists.txt +++ b/offload/test/CMakeLists.txt @@ -1,6 +1,6 @@ # CMakeLists.txt file for unit testing OpenMP offloading runtime library. 
-if(NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang" OR - CMAKE_CXX_COMPILER_VERSION VERSION_LESS 6.0.0) +if(NOT OPENMP_TEST_COMPILER_ID STREQUAL "Clang" OR + OPENMP_TEST_COMPILER_VERSION VERSION_LESS 6.0.0) message(STATUS "Can only test with Clang compiler in version 6.0.0 or later.") message(WARNING "The check-offload target will not be available!") return() ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
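For readers who want to try the guard from PATCH 1/2 outside the offload tree, here is a minimal sketch of the same pattern. It assumes nothing from LLVM: `formatMessage` is a hypothetical helper that formats into a `std::string` via `std::snprintf` rather than writing to stderr, and the 128-byte buffer is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// printf-style formatting into a std::string. The attribute is applied only
// under Clang, mirroring the guard in the patch; other compilers see a plain
// variadic template without format checking.
template <typename... ArgsTy>
#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
[[gnu::format(__printf__, 1, 2)]]
#endif
std::string formatMessage(const char *Format, ArgsTy &&...Args) {
  char Buffer[128]; // arbitrary size chosen for this sketch
  std::snprintf(Buffer, sizeof(Buffer), Format, Args...);
  return std::string(Buffer);
}
```

A call such as `formatMessage("%d-%s", 7, "x")` yields `"7-x"` with either compiler. Under Clang the format string is additionally checked against the argument types at the call site.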
[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/125498
[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)
llvmbot wrote: @thesamesam @thesamesam What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/125498
[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/125498
[llvm-branch-commits] [llvm] release/20.x: [offload] [test] Use test compiler ID rather than host (#124408) (PR #125498)
=?utf-8?q?Michał_Górny?= Message-ID: In-Reply-To: llvmbot wrote: @llvm/pr-subscribers-offload Author: None (llvmbot) Changes Backport 359a9131704277bce0f806de31ac887e68a66902 689ef5fda0ab07dfc452cb16d3646d53e612cb75 Requested by: @mgorny --- Full diff: https://github.com/llvm/llvm-project/pull/125498.diff 2 Files Affected: - (modified) offload/plugins-nextgen/common/include/ErrorReporting.h (+12-6) - (modified) offload/test/CMakeLists.txt (+2-2) ``diff diff --git a/offload/plugins-nextgen/common/include/ErrorReporting.h b/offload/plugins-nextgen/common/include/ErrorReporting.h index 8478977a8f86af0..2ad0f2b7dd6c651 100644 --- a/offload/plugins-nextgen/common/include/ErrorReporting.h +++ b/offload/plugins-nextgen/common/include/ErrorReporting.h @@ -80,8 +80,10 @@ class ErrorReporter { /// Print \p Format, instantiated with \p Args to stderr. /// TODO: Allow redirection into a file stream. template - [[gnu::format(__printf__, 1, 2)]] static void print(const char *Format, - ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 1, 2)]] +#endif + static void print(const char *Format, ArgsTy &&...Args) { raw_fd_ostream OS(STDERR_FILENO, false); OS << llvm::format(Format, Args...); } @@ -89,8 +91,10 @@ class ErrorReporter { /// Print \p Format, instantiated with \p Args to stderr, but colored. /// TODO: Allow redirection into a file stream. template - [[gnu::format(__printf__, 2, 3)]] static void - print(ColorTy Color, const char *Format, ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 2, 3)]] +#endif + static void print(ColorTy Color, const char *Format, ArgsTy &&...Args) { raw_fd_ostream OS(STDERR_FILENO, false); WithColor(OS, HighlightColor(Color)) << llvm::format(Format, Args...); } @@ -99,8 +103,10 @@ class ErrorReporter { /// a banner. /// TODO: Allow redirection into a file stream. 
template - [[gnu::format(__printf__, 1, 2)]] static void reportError(const char *Format, -ArgsTy &&...Args) { +#ifdef __clang__ // https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958 + [[gnu::format(__printf__, 1, 2)]] +#endif + static void reportError(const char *Format, ArgsTy &&...Args) { print(BoldRed, "%s", ErrorBanner); print(BoldRed, Format, Args...); print("\n"); diff --git a/offload/test/CMakeLists.txt b/offload/test/CMakeLists.txt index 8a827e0a625eff0..4768d9ccf223bb4 100644 --- a/offload/test/CMakeLists.txt +++ b/offload/test/CMakeLists.txt @@ -1,6 +1,6 @@ # CMakeLists.txt file for unit testing OpenMP offloading runtime library. -if(NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang" OR - CMAKE_CXX_COMPILER_VERSION VERSION_LESS 6.0.0) +if(NOT OPENMP_TEST_COMPILER_ID STREQUAL "Clang" OR + OPENMP_TEST_COMPILER_VERSION VERSION_LESS 6.0.0) message(STATUS "Can only test with Clang compiler in version 6.0.0 or later.") message(WARNING "The check-offload target will not be available!") return() `` https://github.com/llvm/llvm-project/pull/125498 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)
https://github.com/tblah approved this pull request. https://github.com/llvm/llvm-project/pull/125515
[llvm-branch-commits] [clang] [MC/DC] Enable usage of `!` among `&&` and `||` (PR #125406)
https://github.com/chapuni updated https://github.com/llvm/llvm-project/pull/125406 >From f2cf50e10b59d7d461967baef4d589c9282d0f6d Mon Sep 17 00:00:00 2001 From: NAKAMURA Takumi Date: Sun, 2 Feb 2025 22:11:51 +0900 Subject: [PATCH 1/2] [MC/DC] Enable usage of `!` among `&&` and `||` In the current implementation, `!(a || b) && c` was not treated as one Decision with three terms. Fixes #124563 --- clang/include/clang/AST/IgnoreExpr.h | 8 ++ clang/lib/CodeGen/CodeGenFunction.cpp | 12 ++- clang/lib/CodeGen/CodeGenPGO.cpp | 8 +- clang/lib/CodeGen/CoverageMappingGen.cpp | 12 +++ .../test/CoverageMapping/mcdc-nested-expr.cpp | 6 +- clang/test/Profile/c-mcdc-not.c | 95 +++ 6 files changed, 132 insertions(+), 9 deletions(-) diff --git a/clang/include/clang/AST/IgnoreExpr.h b/clang/include/clang/AST/IgnoreExpr.h index 917bada61fa6fdd..c48c0c0daf81517 100644 --- a/clang/include/clang/AST/IgnoreExpr.h +++ b/clang/include/clang/AST/IgnoreExpr.h @@ -134,6 +134,14 @@ inline Expr *IgnoreElidableImplicitConstructorSingleStep(Expr *E) { return E; } +inline Expr *IgnoreUOpLNotSingleStep(Expr *E) { + if (auto *UO = dyn_cast(E)) { +if (UO->getOpcode() == UO_LNot) + return UO->getSubExpr(); + } + return E; +} + inline Expr *IgnoreImplicitAsWrittenSingleStep(Expr *E) { if (auto *ICE = dyn_cast(E)) return ICE->getSubExprAsWritten(); diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index bbef277a524480b..2c380ac926b1e70 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -27,6 +27,7 @@ #include "clang/AST/Decl.h" #include "clang/AST/DeclCXX.h" #include "clang/AST/Expr.h" +#include "clang/AST/IgnoreExpr.h" #include "clang/AST/StmtCXX.h" #include "clang/AST/StmtObjC.h" #include "clang/Basic/Builtins.h" @@ -1748,12 +1749,13 @@ bool CodeGenFunction::ConstantFoldsToSimpleInteger(const Expr *Cond, /// Strip parentheses and simplistic logical-NOT operators. 
const Expr *CodeGenFunction::stripCond(const Expr *C) { - while (const UnaryOperator *Op = dyn_cast(C->IgnoreParens())) { -if (Op->getOpcode() != UO_LNot) - break; -C = Op->getSubExpr(); + while (true) { +const Expr *SC = +IgnoreExprNodes(C, IgnoreParensSingleStep, IgnoreUOpLNotSingleStep); +if (C == SC) + return SC; +C = SC; } - return C->IgnoreParens(); } /// Determine whether the given condition is an instrumentable condition diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp index 792373839107f0a..0fd49b880bba305 100644 --- a/clang/lib/CodeGen/CodeGenPGO.cpp +++ b/clang/lib/CodeGen/CodeGenPGO.cpp @@ -247,8 +247,9 @@ struct MapRegionCounters : public RecursiveASTVisitor { } if (const Expr *E = dyn_cast(S)) { - const BinaryOperator *BinOp = dyn_cast(E->IgnoreParens()); - if (BinOp && BinOp->isLogicalOp()) { + if (const auto *BinOp = + dyn_cast(CodeGenFunction::stripCond(E)); + BinOp && BinOp->isLogicalOp()) { /// Check for "split-nested" logical operators. This happens when a new /// boolean expression logical-op nest is encountered within an existing /// boolean expression, separated by a non-logical operator. For @@ -280,7 +281,8 @@ struct MapRegionCounters : public RecursiveASTVisitor { return true; if (const Expr *E = dyn_cast(S)) { - const BinaryOperator *BinOp = dyn_cast(E->IgnoreParens()); + const BinaryOperator *BinOp = + dyn_cast(CodeGenFunction::stripCond(E)); if (BinOp && BinOp->isLogicalOp()) { assert(LogOpStack.back() == BinOp); LogOpStack.pop_back(); diff --git a/clang/lib/CodeGen/CoverageMappingGen.cpp b/clang/lib/CodeGen/CoverageMappingGen.cpp index f09157771d2b5c0..9bf73cf27a5fa9a 100644 --- a/clang/lib/CodeGen/CoverageMappingGen.cpp +++ b/clang/lib/CodeGen/CoverageMappingGen.cpp @@ -799,6 +799,12 @@ struct MCDCCoverageBuilder { /// Return the LHS Decision ([0,0] if not set). 
const mcdc::ConditionIDs &back() const { return DecisionStack.back(); } + void swapConds() { +if (DecisionStack.empty()) + return; + +std::swap(DecisionStack.back()[false], DecisionStack.back()[true]); + } /// Push the binary operator statement to track the nest level and assign IDs /// to the operator's LHS and RHS. The RHS may be a larger subtree that is /// broken up on successive levels. @@ -2241,6 +2247,12 @@ struct CounterCoverageMappingBuilder SM.isInSystemHeader(SM.getSpellingLoc(E->getEndLoc(; } + void VisitUnaryLNot(const UnaryOperator *E) { +MCDCBuilder.swapConds(); +Visit(E->getSubExpr()); +MCDCBuilder.swapConds(); + } + void VisitBinLAnd(const BinaryOperator *E) { if (isExprInSystemHeader(E)) { LeafExprSet.insert(E); diff --git a/clang/test/CoverageMapping/mcdc-nested-expr.cp
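The rewritten `stripCond` in the patch peels parentheses and logical-NOT operators until the expression stops changing. A simplified sketch of that fixed-point loop on a toy expression type follows; `ToyExpr` and the helper names are hypothetical stand-ins, and the real code works on `clang::Expr` through `IgnoreExprNodes`.

```cpp
#include <cassert>

// Toy expression node; a hypothetical stand-in for clang::Expr.
struct ToyExpr {
  enum Kind { Leaf, Paren, LNot };
  Kind K;
  const ToyExpr *Sub; // operand for Paren/LNot, nullptr for Leaf
};

// Single-step helpers: peel one layer if it matches, else return E unchanged.
const ToyExpr *ignoreParenStep(const ToyExpr *E) {
  return E->K == ToyExpr::Paren ? E->Sub : E;
}
const ToyExpr *ignoreLNotStep(const ToyExpr *E) {
  return E->K == ToyExpr::LNot ? E->Sub : E;
}

// Apply both steps repeatedly until neither changes the node.
const ToyExpr *stripCond(const ToyExpr *C) {
  while (true) {
    const ToyExpr *SC = ignoreLNotStep(ignoreParenStep(C));
    if (SC == C)
      return SC;
    C = SC;
  }
}

// Builds !((!x)) and checks that stripping yields the leaf x.
bool stripsNestedNotAndParens() {
  ToyExpr X{ToyExpr::Leaf, nullptr};
  ToyExpr InnerNot{ToyExpr::LNot, &X};
  ToyExpr Paren1{ToyExpr::Paren, &InnerNot};
  ToyExpr Paren2{ToyExpr::Paren, &Paren1};
  ToyExpr OuterNot{ToyExpr::LNot, &Paren2};
  return stripCond(&OuterNot) == &X;
}
```

Because the loop stops only when a full pass changes nothing, arbitrary interleavings of `!` and parentheses are removed, which is what lets `!(a || b) && c` be seen as a single decision.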
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
llvmbot wrote: @llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-powerpc Author: Matt Arsenault (arsenm) Changes This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. --- Patch is 76.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125535.diff 9 Files Affected: - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (-24) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (-5) - (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+22-22) - (modified) llvm/test/CodeGen/AMDGPU/ctpop64.ll (+18-18) - (modified) llvm/test/CodeGen/AMDGPU/idot2.ll (+91-91) - (modified) llvm/test/CodeGen/AMDGPU/load-global-i32.ll (+42-43) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-fold-reg-sequence-subreg.mir (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+239-237) ``diff diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp index 6fc57dec6a8264..71c720ed09b5fb 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp @@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const { OpType <= AMDGPU::OPERAND_SRC_LAST; } -bool SIRegisterInfo::shouldRewriteCopySrc( - const TargetRegisterClass *DefRC, - unsigned DefSubReg, - const TargetRegisterClass *SrcRC, - unsigned SrcSubReg) const { - // We want to prefer the smallest register class possible, so we don't want to - // stop and rewrite on anything that looks like a subregister - // extract. Operations mostly don't care about the super register class, so we - // only want to stop on the most basic of copies between the same register - // class. - // - // e.g. if we have something like - // %0 = ... 
- // %1 = ... - // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2 - // %3 = COPY %2, sub0 - // - // We want to look through the COPY to find: - // => %3 = COPY %0 - - // Plain copy. - return getCommonSubClass(DefRC, SrcRC) != nullptr; -} - bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const { // TODO: 64-bit operands have extending behavior from 32-bit literal. return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST && diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h index 8e481e3ac23043..a434efb70d0525 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h @@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo { const TargetRegisterClass *SubRC, unsigned SubIdx) const; - bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC, -unsigned DefSubReg, -const TargetRegisterClass *SrcRC, -unsigned SrcSubReg) const override; - /// \returns True if operands defined with this operand type can accept /// a literal constant (i.e. any 32-bit immediate). 
bool opCanUseLiteralConstant(unsigned OpType) const; diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll index c6c0b9cf8f027f..cc2f775ff22bc5 100644 --- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll +++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll @@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind { ; SI-LABEL: test_copy_v4i8_x4: ; SI: ; %bb.0: -; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11 -; SI-NEXT:s_mov_b32 s3, 0xf000 -; SI-NEXT:s_mov_b32 s10, 0 -; SI-NEXT:s_mov_b32 s11, s3 +; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11 +; SI-NEXT:s_mov_b32 s11, 0xf000 +; SI-NEXT:s_mov_b32 s2, 0 +; SI-NEXT:s_mov_b32 s3, s11 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; SI-NEXT:v_mov_b32_e32 v1, 0 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64 -; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9 -; SI-NEXT:s_mov_b32 s2, -1 -; SI-NEXT:s_mov_b32 s14, s2 -; SI-NEXT:s_mov_b32 s15, s3 -; SI-NEXT:s_mov_b32 s18, s2 +; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64 +; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9 +; SI-NEXT:s_mov_b32 s10, -1 +; SI-NEXT:s_mov_b32 s14, s10 +; SI-NEXT:s_mov_b32 s15, s11 +; SI-NEXT:s_mov_b32 s18, s10 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:s_mov_b32 s0, s4 -; SI-NEXT:s_mov_b32 s1, s5 -; SI-NEXT:s_mov_b32 s19, s3 -; SI-NE
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: Matt Arsenault (arsenm) Changes This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. --- Patch is 76.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125535.diff 9 Files Affected: - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (-24) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (-5) - (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+22-22) - (modified) llvm/test/CodeGen/AMDGPU/ctpop64.ll (+18-18) - (modified) llvm/test/CodeGen/AMDGPU/idot2.ll (+91-91) - (modified) llvm/test/CodeGen/AMDGPU/load-global-i32.ll (+42-43) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-fold-reg-sequence-subreg.mir (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+239-237) ``diff diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp index 6fc57dec6a8264..71c720ed09b5fb 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp @@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const { OpType <= AMDGPU::OPERAND_SRC_LAST; } -bool SIRegisterInfo::shouldRewriteCopySrc( - const TargetRegisterClass *DefRC, - unsigned DefSubReg, - const TargetRegisterClass *SrcRC, - unsigned SrcSubReg) const { - // We want to prefer the smallest register class possible, so we don't want to - // stop and rewrite on anything that looks like a subregister - // extract. Operations mostly don't care about the super register class, so we - // only want to stop on the most basic of copies between the same register - // class. - // - // e.g. if we have something like - // %0 = ... - // %1 = ... 
- // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2 - // %3 = COPY %2, sub0 - // - // We want to look through the COPY to find: - // => %3 = COPY %0 - - // Plain copy. - return getCommonSubClass(DefRC, SrcRC) != nullptr; -} - bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const { // TODO: 64-bit operands have extending behavior from 32-bit literal. return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST && diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h index 8e481e3ac23043..a434efb70d0525 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h @@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo { const TargetRegisterClass *SubRC, unsigned SubIdx) const; - bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC, -unsigned DefSubReg, -const TargetRegisterClass *SrcRC, -unsigned SrcSubReg) const override; - /// \returns True if operands defined with this operand type can accept /// a literal constant (i.e. any 32-bit immediate). 
bool opCanUseLiteralConstant(unsigned OpType) const; diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll index c6c0b9cf8f027f..cc2f775ff22bc5 100644 --- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll +++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll @@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind { ; SI-LABEL: test_copy_v4i8_x4: ; SI: ; %bb.0: -; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11 -; SI-NEXT:s_mov_b32 s3, 0xf000 -; SI-NEXT:s_mov_b32 s10, 0 -; SI-NEXT:s_mov_b32 s11, s3 +; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11 +; SI-NEXT:s_mov_b32 s11, 0xf000 +; SI-NEXT:s_mov_b32 s2, 0 +; SI-NEXT:s_mov_b32 s3, s11 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; SI-NEXT:v_mov_b32_e32 v1, 0 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64 -; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9 -; SI-NEXT:s_mov_b32 s2, -1 -; SI-NEXT:s_mov_b32 s14, s2 -; SI-NEXT:s_mov_b32 s15, s3 -; SI-NEXT:s_mov_b32 s18, s2 +; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64 +; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9 +; SI-NEXT:s_mov_b32 s10, -1 +; SI-NEXT:s_mov_b32 s14, s10 +; SI-NEXT:s_mov_b32 s15, s11 +; SI-NEXT:s_mov_b32 s18, s10 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:s_mov_b32 s0, s4 -; SI-NEXT:s_mov_b32 s1, s5 -; SI-NEXT:s_mov_b32 s19, s3 -; SI-NEXT:s_mov_b32 s22, s2 -; SI-NEXT:s_mov_b32 s23, s3 -; SI-NEXT
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/125535 This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. >From e5479afa758aadd545028780e8a5ab3bd119e028 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Thu, 23 Jan 2025 14:39:10 +0700 Subject: [PATCH] AMDGPU: Use default shouldRewriteCopySrc This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. --- llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | 24 - llvm/lib/Target/AMDGPU/SIRegisterInfo.h | 5 - llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll | 44 +- llvm/test/CodeGen/AMDGPU/ctpop64.ll | 36 +- llvm/test/CodeGen/AMDGPU/idot2.ll | 182 +++ llvm/test/CodeGen/AMDGPU/load-global-i32.ll | 85 ++-- .../peephole-opt-fold-reg-sequence-subreg.mir | 8 +- .../AMDGPU/peephole-opt-regseq-removal.mir| 4 +- .../CodeGen/AMDGPU/spill-scavenge-offset.ll | 476 +- 9 files changed, 418 insertions(+), 446 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp index 6fc57dec6a8264..71c720ed09b5fb 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp @@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const { OpType <= AMDGPU::OPERAND_SRC_LAST; } -bool SIRegisterInfo::shouldRewriteCopySrc( - const TargetRegisterClass *DefRC, - unsigned DefSubReg, - const TargetRegisterClass *SrcRC, - unsigned SrcSubReg) const { - // We want to prefer the smallest register class possible, so we don't want to - // stop and rewrite on anything that looks like a subregister - // extract. Operations mostly don't care about the super register class, so we - // only want to stop on the most basic of copies between the same register - // class. - // - // e.g. 
if we have something like - // %0 = ... - // %1 = ... - // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2 - // %3 = COPY %2, sub0 - // - // We want to look through the COPY to find: - // => %3 = COPY %0 - - // Plain copy. - return getCommonSubClass(DefRC, SrcRC) != nullptr; -} - bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const { // TODO: 64-bit operands have extending behavior from 32-bit literal. return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST && diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h index 8e481e3ac23043..a434efb70d0525 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h @@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo { const TargetRegisterClass *SubRC, unsigned SubIdx) const; - bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC, -unsigned DefSubReg, -const TargetRegisterClass *SrcRC, -unsigned SrcSubReg) const override; - /// \returns True if operands defined with this operand type can accept /// a literal constant (i.e. any 32-bit immediate). 
bool opCanUseLiteralConstant(unsigned OpType) const; diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll index c6c0b9cf8f027f..cc2f775ff22bc5 100644 --- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll +++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll @@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind { ; SI-LABEL: test_copy_v4i8_x4: ; SI: ; %bb.0: -; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11 -; SI-NEXT:s_mov_b32 s3, 0xf000 -; SI-NEXT:s_mov_b32 s10, 0 -; SI-NEXT:s_mov_b32 s11, s3 +; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11 +; SI-NEXT:s_mov_b32 s11, 0xf000 +; SI-NEXT:s_mov_b32 s2, 0 +; SI-NEXT:s_mov_b32 s3, s11 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; SI-NEXT:v_mov_b32_e32 v1, 0 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64 -; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9 -; SI-NEXT:s_mov_b32 s2, -1 -; SI-NEXT:s_mov_b32 s14, s2 -; SI-NEXT:s_mov_b32 s15, s3 -; SI-NEXT:s_mov_b32 s18, s2 +; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64 +; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9 +; SI-NEXT:s_mov_b32 s10, -1 +; SI-NEXT:s_mov_b32 s14, s10 +; SI-NEXT:s_mov_b32 s15, s11 +; SI-NEXT:s_mov_b32 s18, s10 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:s_mo
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/125535).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125535** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125535)
* **#125533**
* **#125224**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev). Learn more about stacking: https://stacking.dev/

https://github.com/llvm/llvm-project/pull/125535
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) Changes This was ultimately working around bugs in subregister handling in peephole-opt. In the common case, it would give up on folding anything into a subregister extract copy. --- Patch is 76.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125535.diff 9 Files Affected: - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (-24) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (-5) - (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+22-22) - (modified) llvm/test/CodeGen/AMDGPU/ctpop64.ll (+18-18) - (modified) llvm/test/CodeGen/AMDGPU/idot2.ll (+91-91) - (modified) llvm/test/CodeGen/AMDGPU/load-global-i32.ll (+42-43) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-fold-reg-sequence-subreg.mir (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+239-237) ``diff diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp index 6fc57dec6a8264..71c720ed09b5fb 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp @@ -3516,30 +3516,6 @@ bool SIRegisterInfo::opCanUseInlineConstant(unsigned OpType) const { OpType <= AMDGPU::OPERAND_SRC_LAST; } -bool SIRegisterInfo::shouldRewriteCopySrc( - const TargetRegisterClass *DefRC, - unsigned DefSubReg, - const TargetRegisterClass *SrcRC, - unsigned SrcSubReg) const { - // We want to prefer the smallest register class possible, so we don't want to - // stop and rewrite on anything that looks like a subregister - // extract. Operations mostly don't care about the super register class, so we - // only want to stop on the most basic of copies between the same register - // class. - // - // e.g. if we have something like - // %0 = ... - // %1 = ... 
- // %2 = REG_SEQUENCE %0, sub0, %1, sub1, %2, sub2 - // %3 = COPY %2, sub0 - // - // We want to look through the COPY to find: - // => %3 = COPY %0 - - // Plain copy. - return getCommonSubClass(DefRC, SrcRC) != nullptr; -} - bool SIRegisterInfo::opCanUseLiteralConstant(unsigned OpType) const { // TODO: 64-bit operands have extending behavior from 32-bit literal. return OpType >= AMDGPU::OPERAND_REG_IMM_FIRST && diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h index 8e481e3ac23043..a434efb70d0525 100644 --- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h +++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h @@ -275,11 +275,6 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo { const TargetRegisterClass *SubRC, unsigned SubIdx) const; - bool shouldRewriteCopySrc(const TargetRegisterClass *DefRC, -unsigned DefSubReg, -const TargetRegisterClass *SrcRC, -unsigned SrcSubReg) const override; - /// \returns True if operands defined with this operand type can accept /// a literal constant (i.e. any 32-bit immediate). 
bool opCanUseLiteralConstant(unsigned OpType) const; diff --git a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll index c6c0b9cf8f027f..cc2f775ff22bc5 100644 --- a/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll +++ b/llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll @@ -163,33 +163,33 @@ define amdgpu_kernel void @test_copy_v4i8_x3(ptr addrspace(1) %out0, ptr addrspa define amdgpu_kernel void @test_copy_v4i8_x4(ptr addrspace(1) %out0, ptr addrspace(1) %out1, ptr addrspace(1) %out2, ptr addrspace(1) %out3, ptr addrspace(1) %in) nounwind { ; SI-LABEL: test_copy_v4i8_x4: ; SI: ; %bb.0: -; SI-NEXT:s_load_dwordx2 s[8:9], s[4:5], 0x11 -; SI-NEXT:s_mov_b32 s3, 0xf000 -; SI-NEXT:s_mov_b32 s10, 0 -; SI-NEXT:s_mov_b32 s11, s3 +; SI-NEXT:s_load_dwordx2 s[0:1], s[4:5], 0x11 +; SI-NEXT:s_mov_b32 s11, 0xf000 +; SI-NEXT:s_mov_b32 s2, 0 +; SI-NEXT:s_mov_b32 s3, s11 ; SI-NEXT:v_lshlrev_b32_e32 v0, 2, v0 ; SI-NEXT:v_mov_b32_e32 v1, 0 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:buffer_load_dword v0, v[0:1], s[8:11], 0 addr64 -; SI-NEXT:s_load_dwordx8 s[4:11], s[4:5], 0x9 -; SI-NEXT:s_mov_b32 s2, -1 -; SI-NEXT:s_mov_b32 s14, s2 -; SI-NEXT:s_mov_b32 s15, s3 -; SI-NEXT:s_mov_b32 s18, s2 +; SI-NEXT:buffer_load_dword v0, v[0:1], s[0:3], 0 addr64 +; SI-NEXT:s_load_dwordx8 s[0:7], s[4:5], 0x9 +; SI-NEXT:s_mov_b32 s10, -1 +; SI-NEXT:s_mov_b32 s14, s10 +; SI-NEXT:s_mov_b32 s15, s11 +; SI-NEXT:s_mov_b32 s18, s10 ; SI-NEXT:s_waitcnt lgkmcnt(0) -; SI-NEXT:s_mov_b32 s0, s4 -; SI-NEXT:s_mov_b32 s1, s5 -; SI-NEXT:s_mov_b32 s19, s3 -; SI-NEXT:s_mov_b32 s22, s2 -; SI-NEXT:s_mov_b32 s23, s3 -; SI-NEXT:
[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/125533 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)
llvmbot wrote:

@llvm/pr-subscribers-llvm-regalloc

Author: Matt Arsenault (arsenm)

Changes

This fixes the handling of subregister extract copies. This will allow AMDGPU to remove its implementation of shouldRewriteCopySrc, which exists as a 10-year-old workaround for this bug. peephole-opt-fold-reg-sequence-subreg.mir will show the expected improvement once the custom implementation is removed.

The copy coalescing processing here is overly abstracted from what's actually happening. Previously, when visiting coalescable copy-like instructions, we would parse the sources one at a time and then pass the def of the root instruction into findNextSource. This meant that the first thing the newly constructed ValueTracker would do is call getVRegDef to find the instruction we are currently processing. This added an unnecessary step, placed a useless entry in the RewriteMap, and required skipping the no-op case where getNewSource would return the original source operand. This was a problem because, in the case of a subregister extract, shouldRewriteCopySrc would always say that it is useful to rewrite, and the use-def chain walk would abort, returning the original operand. Move the process to start looking at the source operand to begin with.

This does not fix the confused handling in the uncoalescable copy case, which is proving to be more difficult. Some currently handled cases have multiple defs from a single source, and other handled cases have 0 input operands. It would be simpler if this were implemented with isCopyLikeInstr, rather than guessing at the operand structure as it does now.

There are some improvements and some regressions. The regressions appear to be downstream issues for the most part. One of the uglier regressions is in PPC, where a sequence of insert_subregs is used to build registers. I opened #125502 to use reg_sequence instead, which may help. The worst regression is an absurd SPARC testcase using a <251 x fp128>, which uses a very long chain of insert_subregs.

We need improved subregister handling locally in PeepholeOptimizer, and in other passes like MachineCSE, to fix some of the other regressions. We should handle subregister composes and folding more indexes into insert_subreg and reg_sequence.

---
Patch is 475.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125533.diff

105 Files Affected:

- (modified) llvm/lib/CodeGen/PeepholeOptimizer.cpp (+28-12)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc3.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8a.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-lse2.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc3.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8a.ll (+30-30)
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll (-4)
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll (+34-40)
- (modified) llvm/test/CodeGen/AArch64/addsub_ext.ll (+4-22)
- (modified) llvm/test/CodeGen/AArch64/and-mask-removal.ll (-1)
- (modified) llvm/test/CodeGen/AArch64/arm64-ldxr-stxr.ll (-4)
- (modified) llvm/test/CodeGen/AArch64/arm64-vaddv.ll (-1)
- (modified) llvm/test/CodeGen/AArch64/arm64_32-addrs.ll (-1)
- (modified) llvm/test/CodeGen/AArch64/atomic-ops-msvc.ll (+4-7)
- (modified) llvm/test/CodeGen/AArch64/atomic-ops.ll (-4)
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll (+10-12)
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll (+10-12)
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll (+10-12)
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll (+10-12)
- (modified) llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll (+5-5)
- (modified) llvm/test/CodeGen/AArch64/cmp-to-cmn.ll (-8)
- (modified) llvm/test/CodeGen/AArch64/cmpxchg-idioms.ll (-1)
- (modified) llvm/test/CodeGen/AArch64/extract-bits.ll (-6)
- (modified) llvm/test/CodeGen/AArch64/fold-int-pow2-with-fmul-or-fdiv.ll (-2)
- (modified) llvm/test/CodeGen/AArch64/fsh.ll (-2)
- (modified) llvm/test/CodeGen/AArch64/funnel-shift.ll (-4)
- (modified) llvm/test/CodeGen/AArch64/hoist-and-by-const-from-lshr-in-eqcmp-zero.ll (-8)
- (modified) llvm/test/CodeGen/AArch64/hoist-and-by-const-from-shl-in-eqcmp-zero.ll (+5-14)
- (modified) llvm/test/CodeGen/AArch64/logic-shift.ll (-9)
- (modified) llvm/test/CodeGen/AArch64/neon-insextbitcast.ll (-2)
- (modified) llvm/test/CodeGen/AArch64/shift-by-signext.ll (-2)
- (modified) llvm/test/CodeGen/AArch64/shift.ll (-6)
- (modified) llvm/test/CodeGen/AArch64/sink-and-fold.ll (-1)
- (modified) llvm/test/CodeGen/AArch64/sve-fixed-lengt
[llvm-branch-commits] [llvm] PeepholeOpt: Fix looking for def of current copy to coalesce (PR #125533)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/125533).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#125533** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/125533)
* **#125224**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev). Learn more about stacking: https://stacking.dev/

https://github.com/llvm/llvm-project/pull/125533
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
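Editorial sketch for PR #125533: the description says the use-def walk should start at the copy's source operand rather than at the def of the copy itself. The following is a minimal toy model of that idea for a subregister extract copy fed by a REG_SEQUENCE. It is a sketch under stated assumptions, not the PeepholeOptimizer code: `RegSubReg`, `RegSequence`, `DefMap`, and `lookThroughCopySource` are hypothetical names, and subregister indexes are plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy model only; none of these types exist in LLVM.
struct RegSubReg {
  unsigned Reg;    // virtual register number; 0 is treated as invalid here
  unsigned SubIdx; // subregister index; 0 means "whole register"
};

// Models `%Def = REG_SEQUENCE %in0, idx0, %in1, idx1, ...`.
struct RegSequence {
  // Each entry is (input operand, subregister index it fills in the result).
  std::vector<std::pair<RegSubReg, unsigned> > Inputs;
};

// Maps a defined virtual register to the REG_SEQUENCE that defines it.
typedef std::map<unsigned, RegSequence> DefMap;

// Starting from the *source operand* of `%dst = COPY %src.SubIdx`, try to
// find the REG_SEQUENCE input that provides that subregister. Returns true
// and sets Out on success; returns false when there is nothing to fold.
bool lookThroughCopySource(const DefMap &Defs, RegSubReg Src, RegSubReg &Out) {
  assert(Src.Reg != 0 && "register 0 is invalid in this toy model");
  if (Src.SubIdx == 0)
    return false; // plain copy, not a subregister extract
  DefMap::const_iterator It = Defs.find(Src.Reg);
  if (It == Defs.end())
    return false; // source is not defined by a modelled REG_SEQUENCE
  const RegSequence &Seq = It->second;
  for (std::size_t I = 0; I < Seq.Inputs.size(); ++I) {
    if (Seq.Inputs[I].second == Src.SubIdx) {
      Out = Seq.Inputs[I].first;
      return true;
    }
  }
  return false; // the requested subregister is not provided by the sequence
}
```

With `%12 = REG_SEQUENCE %10, sub0, %11, sub1` modelled as two inputs, a query for `%12.sub0` resolves to `%10`. This mirrors the `%3 = COPY %2, sub0` to `%3 = COPY %0` rewrite described in the AMDGPU comment removed by #125535; the real pass additionally has to respect register classes and subregister composition, which this sketch ignores.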
[llvm-branch-commits] [llvm] AMDGPU: Use default shouldRewriteCopySrc (PR #125535)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/125535 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/125515 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/125515 Backport cb2598dda1aae5096a77bc8a9f6679ca1b350e5e Requested by: @brad0 >From acd0c6c1774c2c2ba97c714709041f6561370447 Mon Sep 17 00:00:00 2001 From: Brad Smith Date: Mon, 3 Feb 2025 10:03:59 -0500 Subject: [PATCH] [flang][runtime] Make sure to link libexecinfo if it exists (#125344) Fixes building the backtrace support on FreeBSD/NetBSD/OpenBSD/DragonFly and musl libc with libexecinfo. (cherry picked from commit cb2598dda1aae5096a77bc8a9f6679ca1b350e5e) --- flang/runtime/CMakeLists.txt | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/flang/runtime/CMakeLists.txt b/flang/runtime/CMakeLists.txt index fbfaae9a880648..bf27a121e4d174 100644 --- a/flang/runtime/CMakeLists.txt +++ b/flang/runtime/CMakeLists.txt @@ -59,10 +59,15 @@ if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR) ) endif() +set(linked_libraries FortranDecimal) + # function checks find_package(Backtrace) set(HAVE_BACKTRACE ${Backtrace_FOUND}) set(BACKTRACE_HEADER ${Backtrace_HEADER}) +if(HAVE_BACKTRACE) + list(APPEND linked_libraries ${Backtrace_LIBRARY}) +endif() include(CheckCXXSymbolExists) include(CheckCXXSourceCompiles) @@ -271,7 +276,7 @@ if (NOT DEFINED MSVC) add_flang_library(FortranRuntime ${sources} LINK_LIBS -FortranDecimal +${linked_libraries} INSTALL_WITH_TOOLCHAIN ) @@ -279,7 +284,7 @@ else() add_flang_library(FortranRuntime ${sources} LINK_LIBS -FortranDecimal +${linked_libraries} ) set(CMAKE_MSVC_RUNTIME_LIBRARY MultiThreaded) add_flang_library(FortranRuntime.static ${sources} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)
llvmbot wrote: @llvm/pr-subscribers-flang-runtime Author: None (llvmbot) Changes Backport cb2598dda1aae5096a77bc8a9f6679ca1b350e5e Requested by: @brad0 --- Full diff: https://github.com/llvm/llvm-project/pull/125515.diff 1 Files Affected: - (modified) flang/runtime/CMakeLists.txt (+7-2) ``diff diff --git a/flang/runtime/CMakeLists.txt b/flang/runtime/CMakeLists.txt index fbfaae9a8806486..bf27a121e4d174c 100644 --- a/flang/runtime/CMakeLists.txt +++ b/flang/runtime/CMakeLists.txt @@ -59,10 +59,15 @@ if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR) ) endif() +set(linked_libraries FortranDecimal) + # function checks find_package(Backtrace) set(HAVE_BACKTRACE ${Backtrace_FOUND}) set(BACKTRACE_HEADER ${Backtrace_HEADER}) +if(HAVE_BACKTRACE) + list(APPEND linked_libraries ${Backtrace_LIBRARY}) +endif() include(CheckCXXSymbolExists) include(CheckCXXSourceCompiles) @@ -271,7 +276,7 @@ if (NOT DEFINED MSVC) add_flang_library(FortranRuntime ${sources} LINK_LIBS -FortranDecimal +${linked_libraries} INSTALL_WITH_TOOLCHAIN ) @@ -279,7 +284,7 @@ else() add_flang_library(FortranRuntime ${sources} LINK_LIBS -FortranDecimal +${linked_libraries} ) set(CMAKE_MSVC_RUNTIME_LIBRARY MultiThreaded) add_flang_library(FortranRuntime.static ${sources} `` https://github.com/llvm/llvm-project/pull/125515 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] release/20.x: [flang][runtime] Make sure to link libexecinfo if it exists (#125344) (PR #125515)
llvmbot wrote: @tblah What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/125515 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
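Editorial sketch for PR #125515: the patch appends `${Backtrace_LIBRARY}` to the runtime's link libraries because, on the BSDs and on musl, `backtrace()` is provided by a separate libexecinfo rather than by libc. The snippet below is a hedged illustration of the kind of call that needs that library at link time. It is not the flang runtime's code: `captureFrames` and the `SKETCH_HAVE_EXECINFO` macro are hypothetical names, and the header check via `__has_include` is an assumption standing in for the CMake `find_package(Backtrace)` result.

```cpp
#include <cassert>

// Use <execinfo.h> only when the compiler can find it. On FreeBSD, NetBSD,
// OpenBSD, DragonFly, and musl the matching symbols live in libexecinfo, so
// the final link also needs that library (-lexecinfo).
#if defined(__has_include)
#if __has_include(<execinfo.h>)
#include <execinfo.h>
#define SKETCH_HAVE_EXECINFO 1
#endif
#endif

// Captures up to Capacity return addresses of the current call stack into
// Buffer and returns how many were stored. Returns 0 when <execinfo.h> is
// not available on this platform.
int captureFrames(void **Buffer, int Capacity) {
  assert(Buffer != nullptr && Capacity > 0);
#ifdef SKETCH_HAVE_EXECINFO
  return static_cast<int>(backtrace(Buffer, Capacity));
#else
  (void)Buffer;
  (void)Capacity;
  return 0;
#endif
}
```

Without the library on the link line, a translation unit like this compiles but fails to link on the affected platforms, which is the build failure the backported change addresses.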
[llvm-branch-commits] [clang] [llvm] [FMV][AArch64] Release notes for LLVM20. (PR #125525)
https://github.com/jroelofs approved this pull request. https://github.com/llvm/llvm-project/pull/125525 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [MC/DC] Enable nested expressions (PR #125413)
https://github.com/chapuni updated https://github.com/llvm/llvm-project/pull/125413 >From c56ecc30e9fd1a674073e362fbfcc6b43f2f52e2 Mon Sep 17 00:00:00 2001 From: NAKAMURA Takumi Date: Sun, 2 Feb 2025 22:06:32 +0900 Subject: [PATCH 1/2] [MC/DC] Enable nested expressions A warning "contains an operation with a nested boolean expression." is no longer emitted. At the moment, split expressions are treated as individual Decisions. --- clang/lib/CodeGen/CodeGenPGO.cpp | 150 ++ .../test/CoverageMapping/mcdc-nested-expr.cpp | 30 +++- .../Frontend/custom-diag-werror-interaction.c | 4 +- 3 files changed, 109 insertions(+), 75 deletions(-) diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp index 7d16f673ada419e..0c3973aa4dccfdc 100644 --- a/clang/lib/CodeGen/CodeGenPGO.cpp +++ b/clang/lib/CodeGen/CodeGenPGO.cpp @@ -228,10 +228,17 @@ struct MapRegionCounters : public RecursiveASTVisitor { /// The stacks are also used to find error cases and notify the user. A /// standard logical operator nest for a boolean expression could be in a form /// similar to this: "x = a && b && c && (d || f)" - unsigned NumCond = 0; - bool SplitNestedLogicalOp = false; - SmallVector NonLogOpStack; - SmallVector LogOpStack; + struct DecisionState { +llvm::DenseSet Leaves; // Not BinOp +const Expr *DecisionExpr;// Root +bool Split; + +DecisionState() = delete; +DecisionState(const Expr *E, bool Split = false) +: DecisionExpr(E), Split(Split) {} + }; + + SmallVector DecisionStack; // Hook: dataTraverseStmtPre() is invoked prior to visiting an AST Stmt node. bool dataTraverseStmtPre(Stmt *S) { @@ -239,34 +246,28 @@ struct MapRegionCounters : public RecursiveASTVisitor { if (MCDCMaxCond == 0) return true; -/// At the top of the logical operator nest, reset the number of conditions, -/// also forget previously seen split nesting cases. 
-if (LogOpStack.empty()) { - NumCond = 0; - SplitNestedLogicalOp = false; -} - -if (const Expr *E = dyn_cast(S)) { - const BinaryOperator *BinOp = dyn_cast(E->IgnoreParens()); - if (BinOp && BinOp->isLogicalOp()) { -/// Check for "split-nested" logical operators. This happens when a new -/// boolean expression logical-op nest is encountered within an existing -/// boolean expression, separated by a non-logical operator. For -/// example, in "x = (a && b && c && foo(d && f))", the "d && f" case -/// starts a new boolean expression that is separated from the other -/// conditions by the operator foo(). Split-nested cases are not -/// supported by MC/DC. -SplitNestedLogicalOp = SplitNestedLogicalOp || !NonLogOpStack.empty(); - -LogOpStack.push_back(BinOp); +/// Mark "in splitting" when a leaf is met. +if (!DecisionStack.empty()) { + auto &StackTop = DecisionStack.back(); + if (!StackTop.Split) { +if (StackTop.Leaves.contains(S)) { + assert(!StackTop.Split); + StackTop.Split = true; +} return true; } + + // Split + assert(StackTop.Split); + assert(!StackTop.Leaves.contains(S)); } -/// Keep track of non-logical operators. These are OK as long as we don't -/// encounter a new logical operator after seeing one. -if (!LogOpStack.empty()) - NonLogOpStack.push_back(S); +if (const auto *E = dyn_cast(S)) { + if (const auto *BinOp = + dyn_cast(CodeGenFunction::stripCond(E)); + BinOp && BinOp->isLogicalOp()) +DecisionStack.emplace_back(E); +} return true; } @@ -275,49 +276,57 @@ struct MapRegionCounters : public RecursiveASTVisitor { // an AST Stmt node. MC/DC will use it to to signal when the top of a // logical operation (boolean expression) nest is encountered. bool dataTraverseStmtPost(Stmt *S) { -/// If MC/DC is not enabled, MCDCMaxCond will be set to 0. Do nothing. 
-if (MCDCMaxCond == 0) +if (DecisionStack.empty()) return true; -if (const Expr *E = dyn_cast(S)) { - const BinaryOperator *BinOp = dyn_cast(E->IgnoreParens()); - if (BinOp && BinOp->isLogicalOp()) { -assert(LogOpStack.back() == BinOp); -LogOpStack.pop_back(); - -/// At the top of logical operator nest: -if (LogOpStack.empty()) { - /// Was the "split-nested" logical operator case encountered? - if (SplitNestedLogicalOp) { -unsigned DiagID = Diag.getCustomDiagID( -DiagnosticsEngine::Warning, -"unsupported MC/DC boolean expression; " -"contains an operation with a nested boolean expression. " -"Expression will not be covered"); -Diag.Report(S->getBeginLoc(), DiagID); -return true; - } - - /// Was the maximum number of conditions en
[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)
https://github.com/chapuni created https://github.com/llvm/llvm-project/pull/125484

`-fmcdc-single-conditions` is a `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on `DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds them to Decisions.

An example of the report:

```
  MC/DC Decision Region (mmm:nn) to (mmm:nn)

  Number of Conditions: 1
     Condition C1 --> (mmm:nn)

  Executed MC/DC Test Vectors:

     C1    Result
  1 { F  = F      }
  2 { T  = T      }

  C1-Pair: covered: (1,2)
  MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.

Fixes #95336

>From af336315f37021ccc6d21059ecfe28a0f30248ff Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi
Date: Mon, 3 Feb 2025 20:35:06 +0900
Subject: [PATCH] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions

`-fmcdc-single-conditions` is a `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on `DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds them to Decisions.

An example of the report:

```
  MC/DC Decision Region (mmm:nn) to (mmm:nn)

  Number of Conditions: 1
     Condition C1 --> (mmm:nn)

  Executed MC/DC Test Vectors:

     C1    Result
  1 { F  = F      }
  2 { T  = T      }

  C1-Pair: covered: (1,2)
  MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.
Fixes #95336 --- clang/docs/ReleaseNotes.rst | 3 + clang/docs/SourceBasedCodeCoverage.rst| 4 + clang/include/clang/Basic/CodeGenOptions.def | 1 + clang/include/clang/Driver/Options.td | 4 + clang/lib/CodeGen/CGExpr.cpp | 32 +-- clang/lib/CodeGen/CodeGenFunction.h | 4 +- clang/lib/CodeGen/CodeGenPGO.cpp | 38 - clang/lib/CodeGen/CoverageMappingGen.cpp | 46 +++--- .../test/CoverageMapping/mcdc-single-cond.cpp | 85 ++- 9 files changed, 190 insertions(+), 27 deletions(-) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 42054fe27c5ee1c..4138fc2f11e0c17 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -118,6 +118,9 @@ Improvements to Coverage Mapping - [MC/DC] Nested expressions are handled as individual MC/DC expressions. +- [MC/DC] Non-boolean expressions on conditions can be included with + `-fmcdc-single-conditions`. (#GH95336) + Bug Fixes in This Version - diff --git a/clang/docs/SourceBasedCodeCoverage.rst b/clang/docs/SourceBasedCodeCoverage.rst index d26babe829ab5be..bcd4ae0e9748d15 100644 --- a/clang/docs/SourceBasedCodeCoverage.rst +++ b/clang/docs/SourceBasedCodeCoverage.rst @@ -510,6 +510,10 @@ requires 8 test vectors. Expressions such as ``((a0 && b0) || (a1 && b1) || ...)`` can cause the number of test vectors to increase exponentially. +Clang handles only binary logical operators as MC/DC coverage. Single +conditions without logcal operators on `do/for/while/if/?!` can be +included with `-Xclang -fmcdc-single-conditions`. + Switch statements - diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 259972bdf8f0013..1a9ebae845619b7 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -236,6 +236,7 @@ CODEGENOPT(DumpCoverageMapping , 1, 0) ///< Dump the generated coverage mapping CODEGENOPT(MCDCCoverage , 1, 0) ///< Enable MC/DC code coverage criteria. 
VALUE_CODEGENOPT(MCDCMaxConds, 16, 32767) ///< MC/DC Maximum conditions. VALUE_CODEGENOPT(MCDCMaxTVs, 32, 0x7FFE) ///< MC/DC Maximum test vectors. +VALUE_CODEGENOPT(MCDCSingleCond, 1, 0) ///< Enable MC/DC single conditions. /// If -fpcc-struct-return or -freg-struct-return is specified. ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, SRCK_Default) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 6eabd9f76a792db..57b826bce6da821 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1742,6 +1742,10 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, Group, Visibility<[CC1Option]>, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFE">; +def fmcdc_single_conditions : Flag<["-"], "fmcdc-single-conditions">, + Group, Visibility<[CC1Option]>, + HelpText<"Include also single conditions as MC/DC coverage">, + MarshallingInfoFlag>; def fprofile_generate : Flag<["-"], "fprofile-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 9676e61cf322d92..8
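Editorial sketch for PR #125484: the description states that a single-condition Decision is covered only if both `true` and `false` are covered. The toy model below spells out that rule; it is a sketch under stated assumptions and not Clang's instrumentation or llvm-cov's reporting code. `SingleConditionDecision` and its members are hypothetical names.

```cpp
// Toy model of a Decision that consists of exactly one condition (C1).
// With a single condition, the only possible independence pair is the
// test vector where C1 is false together with the one where C1 is true.
struct SingleConditionDecision {
  bool SeenTrue = false;  // a run where the condition evaluated to true
  bool SeenFalse = false; // a run where the condition evaluated to false

  // Records the outcome of one evaluation of the condition.
  void record(bool Outcome) {
    if (Outcome)
      SeenTrue = true;
    else
      SeenFalse = true;
  }

  // The C1 pair is covered only when both outcomes have been observed.
  bool isCovered() const { return SeenTrue && SeenFalse; }

  // With one condition, coverage is either 0% or 100%.
  double coveragePercent() const { return isCovered() ? 100.0 : 0.0; }
};
```

Recording only `true` (or only `false`) leaves the Decision uncovered; recording both yields the 100.00% shown in the example report, where test vectors 1 (`F`) and 2 (`T`) form the C1 pair.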
[llvm-branch-commits] [mlir] WIP: [mlir][OpenMP] Pack task private variables into a heap-allocated context struct (PR #125307)
https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/125307

>From afa9026eefb6c9cd613ed021a92e159f93c3667c Mon Sep 17 00:00:00 2001
From: Tom Eccles
Date: Fri, 24 Jan 2025 17:32:41 +
Subject: [PATCH 1/2] [mlir][OpenMP] Pack task private variables into a heap-allocated context struct

See RFC: https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084

The aim here is to ensure that tasks which are not executed for a while after they are created do not try to reference any data which are now out of scope. This is done by packing the data referred to by the task into a heap-allocated structure (freed at the end of the task).

I decided to create the task context structure in OpenMPToLLVMIRTranslation instead of adapting how it is done in CodeExtractor (via OpenMPIRBuilder) because CodeExtractor is (at least in theory) generic code which could have other unrelated uses.
---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 204 +++---
 mlir/test/Target/LLVMIR/openmp-llvm.mlir | 5 +-
 .../LLVMIR/openmp-task-privatization.mlir | 82 +++
 3 files changed, 254 insertions(+), 37 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-task-privatization.mlir

diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 8a9a69cefad8ee1..5c4deab492c8390 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -13,6 +13,7 @@ #include "mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h" #include "mlir/Analysis/TopologicalSortUtils.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/LLVMIR/LLVMTypes.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Dialect/OpenMP/OpenMPInterfaces.h" #include "mlir/IR/IRMapping.h" @@ -24,10 +25,12 @@ #include "llvm/ADT/ArrayRef.h" #include
"llvm/ADT/SetVector.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/ADT/TypeSwitch.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" #include "llvm/Frontend/OpenMP/OMPIRBuilder.h" #include "llvm/IR/DebugInfoMetadata.h" +#include "llvm/IR/DerivedTypes.h" #include "llvm/IR/IRBuilder.h" #include "llvm/IR/ReplaceConstant.h" #include "llvm/Support/FileSystem.h" @@ -1331,19 +1334,16 @@ findAssociatedValue(Value privateVar, llvm::IRBuilderBase &builder, /// Initialize a single (first)private variable. You probably want to use /// allocateAndInitPrivateVars instead of this. -static llvm::Error -initPrivateVar(llvm::IRBuilderBase &builder, - LLVM::ModuleTranslation &moduleTranslation, - omp::PrivateClauseOp &privDecl, Value mlirPrivVar, - BlockArgument &blockArg, llvm::Value *llvmPrivateVar, - llvm::SmallVectorImpl &llvmPrivateVars, - llvm::BasicBlock *privInitBlock, - llvm::DenseMap *mappedPrivateVars = nullptr) { +/// This returns the private variable which has been initialized. This +/// variable should be mapped before constructing the body of the Op. +static llvm::Expected initPrivateVar( +llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation, +omp::PrivateClauseOp &privDecl, Value mlirPrivVar, BlockArgument &blockArg, +llvm::Value *llvmPrivateVar, llvm::BasicBlock *privInitBlock, +llvm::DenseMap *mappedPrivateVars = nullptr) { Region &initRegion = privDecl.getInitRegion(); if (initRegion.empty()) { -moduleTranslation.mapValue(blockArg, llvmPrivateVar); -llvmPrivateVars.push_back(llvmPrivateVar); -return llvm::Error::success(); +return llvmPrivateVar; } // map initialization region block arguments @@ -1363,17 +1363,15 @@ initPrivateVar(llvm::IRBuilderBase &builder, assert(phis.size() == 1 && "expected one allocation to be yielded"); - // prefer the value yielded from the init region to the allocated private - // variable in case the region is operating on arguments by-value (e.g. - // Fortran character boxes). 
- moduleTranslation.mapValue(blockArg, phis[0]); - llvmPrivateVars.push_back(phis[0]); - // clear init region block argument mapping in case it needs to be // re-created with a different source for another use of the same // reduction decl moduleTranslation.forgetMapping(initRegion); - return llvm::Error::success(); + + // Prefer the value yielded from the init region to the allocated private + // variable in case the region is operating on arguments by-value (e.g. + // Fortran character boxes). + return phis[0]; } /// Allocate and initialize delayed private variables. Returns the basic block @@ -1415,11 +1413,13 @@ static llvm::Expected allocateAndInitPrivateVars( llvm::Value *llvmPrivateVar = builder.CreateAlloca( llvmAllocType, /*ArraySize=*/nullptr, "omp.private.alloc"); -
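As a rough illustration of the idea in the commit message above (and not the code this patch generates), the following standalone C++ sketch copies a task's firstprivate values into a heap-allocated context when the task is created and frees that context when the task body finishes. `TaskContext`, `DeferredTask`, `taskBody`, `createTask`, and `runTask` are hypothetical names chosen for the example; the actual patch builds an equivalent structure in LLVM IR.

```cpp
#include <memory>
#include <string>

// Hypothetical context struct: one field per (first)private variable.
struct TaskContext {
  int Count;
  std::string Label;
};

// Hypothetical deferred task: an outlined body plus a pointer to its context.
struct DeferredTask {
  std::string (*Body)(const TaskContext &);
  TaskContext *Ctx;
};

// The outlined task body only touches data reachable through the context.
static std::string taskBody(const TaskContext &Ctx) {
  return Ctx.Label + ":" + std::to_string(Ctx.Count);
}

// The locals below go out of scope when createTask returns, so the task
// must not refer to them directly. Their values are copied to the heap.
DeferredTask createTask() {
  int Count = 3;
  std::string Label = "iter";
  return DeferredTask{taskBody, new TaskContext{Count, Label}};
}

// Run the task some time later; the context is freed at the end of the task.
std::string runTask(DeferredTask Task) {
  std::unique_ptr<TaskContext> Owner(Task.Ctx);
  return Task.Body(*Owner);
}
```

Calling `runTask(createTask())` is safe even though the stack frame of `createTask` is long gone, which is the property the RFC is after for delayed task execution.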
[llvm-branch-commits] [clang] [MC/DC] Introduce `-fmcdc-single-conditions` to include also single conditions (PR #125484)
llvmbot wrote:

@llvm/pr-subscribers-clang

Author: NAKAMURA Takumi (chapuni)

Changes

`-fmcdc-single-conditions` is `CC1Option` for now.

This change discovers `isInstrumentedCondition(Cond)` on `DoStmt/ForStmt/IfStmt/WhileStmt/AbstractConditionalOperator` and adds them into Decisions.

An example of the report:

```
MC/DC Decision Region (mmm:nn) to (mmm:nn)
Number of Conditions: 1
Condition C1 -->(mmm:nn)
Executed MC/DC Test Vectors:
C1 Result
1 { F = F }
2 { T = T }
C1-Pair: covered: (1,2)
MC/DC Coverage for Expression: 100.00%
```

The Decision is covered only if both `true` and `false` are covered.

Fixes #95336

---

Patch is 24.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125484.diff

9 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+3)
- (modified) clang/docs/SourceBasedCodeCoverage.rst (+4)
- (modified) clang/include/clang/Basic/CodeGenOptions.def (+1)
- (modified) clang/include/clang/Driver/Options.td (+4)
- (modified) clang/lib/CodeGen/CGExpr.cpp (+24-8)
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+2-2)
- (modified) clang/lib/CodeGen/CodeGenPGO.cpp (+34-4)
- (modified) clang/lib/CodeGen/CoverageMappingGen.cpp (+36-10)
- (modified) clang/test/CoverageMapping/mcdc-single-cond.cpp (+82-3)

``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 42054fe27c5ee1..4138fc2f11e0c1 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -118,6 +118,9 @@ Improvements to Coverage Mapping - [MC/DC] Nested expressions are handled as individual MC/DC expressions. +- [MC/DC] Non-boolean expressions on conditions can be included with + `-fmcdc-single-conditions`. (#GH95336) + Bug Fixes in This Version - diff --git a/clang/docs/SourceBasedCodeCoverage.rst b/clang/docs/SourceBasedCodeCoverage.rst index d26babe829ab5b..bcd4ae0e9748d1 100644 --- a/clang/docs/SourceBasedCodeCoverage.rst +++ b/clang/docs/SourceBasedCodeCoverage.rst @@ -510,6 +510,10 @@ requires 8 test vectors.
Expressions such as ``((a0 && b0) || (a1 && b1) || ...)`` can cause the number of test vectors to increase exponentially. +Clang handles only binary logical operators as MC/DC coverage. Single +conditions without logcal operators on `do/for/while/if/?!` can be +included with `-Xclang -fmcdc-single-conditions`. + Switch statements - diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 259972bdf8f001..1a9ebae845619b 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -236,6 +236,7 @@ CODEGENOPT(DumpCoverageMapping , 1, 0) ///< Dump the generated coverage mapping CODEGENOPT(MCDCCoverage , 1, 0) ///< Enable MC/DC code coverage criteria. VALUE_CODEGENOPT(MCDCMaxConds, 16, 32767) ///< MC/DC Maximum conditions. VALUE_CODEGENOPT(MCDCMaxTVs, 32, 0x7FFE) ///< MC/DC Maximum test vectors. +VALUE_CODEGENOPT(MCDCSingleCond, 1, 0) ///< Enable MC/DC single conditions. /// If -fpcc-struct-return or -freg-struct-return is specified. 
ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, SRCK_Default) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 6eabd9f76a792d..57b826bce6da82 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1742,6 +1742,10 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, Group, Visibility<[CC1Option]>, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFE">; +def fmcdc_single_conditions : Flag<["-"], "fmcdc-single-conditions">, + Group, Visibility<[CC1Option]>, + HelpText<"Include also single conditions as MC/DC coverage">, + MarshallingInfoFlag>; def fprofile_generate : Flag<["-"], "fprofile-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 9676e61cf322d9..82a31cb3721473 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -196,20 +196,36 @@ RawAddress CodeGenFunction::CreateMemTempWithoutCast(QualType Ty, /// EvaluateExprAsBool - Perform the usual unary conversions on the specified /// expression and compare the result against zero, returning an Int1Ty value. llvm::Value *CodeGenFunction::EvaluateExprAsBool(const Expr *E) { + auto DecisionExpr = stripCond(E); + if (isMCDCDecisionExpr(DecisionExpr) && isInstrumentedCondition(DecisionExpr)) +maybeResetMCDCCondBitmap(DecisionExpr); + else +DecisionExpr = nullptr; + PGO.setCurrentStmt(E); + llvm::Value *Result; if (const MemberPointerType *MPT = E->getType()->
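To make the coverage rule from the description concrete ("the Decision is covered only if both `true` and `false` are covered"), here is a small standalone C++ model. It is only a sketch of the rule and says nothing about how Clang actually instruments the code; `SingleCondDecision` and `clampToZero` are hypothetical names invented for the example.

```cpp
#include <vector>

// Hypothetical model of one single-condition decision such as `if (X < 0)`.
// Each executed test vector records the outcome of the only condition, C1.
struct SingleCondDecision {
  std::vector<bool> TestVectors;

  void record(bool C1) { TestVectors.push_back(C1); }

  // With a single condition, C1 has an independence pair only when both
  // a false and a true outcome have been observed.
  bool covered() const {
    bool SeenTrue = false;
    bool SeenFalse = false;
    for (bool Outcome : TestVectors) {
      if (Outcome)
        SeenTrue = true;
      else
        SeenFalse = true;
    }
    return SeenTrue && SeenFalse;
  }

  // One condition in total, so the percentage is either 0 or 100.
  double coveragePercent() const { return covered() ? 100.0 : 0.0; }
};

// A function whose `if` has a single condition and no logical operator,
// which is the kind of decision the new flag is meant to include.
int clampToZero(int X, SingleCondDecision &D) {
  bool C1 = X < 0;
  D.record(C1);
  if (C1)
    return 0;
  return X;
}
```

Calling `clampToZero` only with non-negative inputs leaves the decision uncovered; adding one negative input completes the (1,2) pair shown in the example report.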
[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)
Meinersbur wrote: For some reason I cannot add @rahulana-quic or @tobiasgrosser as reviewers. https://github.com/llvm/llvm-project/pull/125442
[llvm-branch-commits] [llvm] [SPARC][IAS] Add support for `setsw` pseudoinstruction (PR #125150)
https://github.com/koachan updated https://github.com/llvm/llvm-project/pull/125150

>From 259439304b31a8557db456d276a84849c7a37067 Mon Sep 17 00:00:00 2001
From: Koakuma
Date: Mon, 3 Feb 2025 23:12:07 +0700
Subject: [PATCH] Incorporate feedback

Created using spr 1.3.4
---
 llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp b/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
index 879f2ed8849618..3e9fc31d7bfc22 100644
--- a/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
+++ b/llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp
@@ -744,7 +744,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
 assert(MCRegOp.isReg());
 assert(MCValOp.isImm() || MCValOp.isExpr());
- // the imm operand can be either an expression or an immediate.
+ // The imm operand can be either an expression or an immediate.
 bool IsImm = Inst.getOperand(1).isImm();
 int64_t ImmValue = IsImm ? MCValOp.getImm() : 0;
 const MCExpr *ValExpr = IsImm ? MCConstantExpr::create(ImmValue, getContext())
@@ -777,7 +777,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
 IsSmallImm ? ValExpr : adjustPICRelocation(SparcMCExpr::VK_Sparc_LO, ValExpr);
-// orrd, %lo(val), rd
+// or rd, %lo(val), rd
 Instructions.push_back(MCInstBuilder(SP::ORri)
 .addReg(MCRegOp.getReg())
 .addReg(PrevReg.getReg())
@@ -790,7 +790,7 @@ bool SparcAsmParser::expandSETSW(MCInst &Inst, SMLoc IDLoc,
 // Large negative or non-immediate expressions would need an sra.
 if (!IsImm || ImmValue < 0) {
-// srard, %g0, rd
+// sra rd, %g0, rd
 Instructions.push_back(MCInstBuilder(SP::SRArr)
 .addReg(MCRegOp.getReg())
 .addReg(MCRegOp.getReg())
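For readers wondering why the comments in the patch above mention an extra `sra rd, %g0, rd` for negative values, the following standalone C++ sketch emulates the register arithmetic of a `sethi` / `or` / `sra` sequence on a 64-bit register. It is a simplified model under my own assumptions, not the logic in `SparcAsmParser::expandSETSW`; the helper names (`hi22`, `lo10`, `sethi`, `orImm`, `sraZero`, `emulateSetsw`) are hypothetical.

```cpp
#include <cstdint>

// %hi(x): the upper 22 bits of a 32-bit value.
constexpr uint32_t hi22(uint32_t X) { return X >> 10; }

// %lo(x): the low 10 bits of a 32-bit value.
constexpr uint32_t lo10(uint32_t X) { return X & 0x3ff; }

// sethi imm22, rd: rd = imm22 << 10; the upper 32 bits of rd become zero.
constexpr uint64_t sethi(uint32_t Imm22) {
  return static_cast<uint64_t>(Imm22 & 0x3fffff) << 10;
}

// or rs1, imm, rd
constexpr uint64_t orImm(uint64_t Rs1, uint32_t Imm) { return Rs1 | Imm; }

// sra rs1, %g0, rd: a 32-bit arithmetic shift by zero, which sign-extends
// bit 31 of rs1 into the upper half of the 64-bit result.
constexpr uint64_t sraZero(uint64_t Rs1) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(Rs1))));
}

// Emulate the general form: sethi %hi(val), rd; or rd, %lo(val), rd;
// and optionally sra rd, %g0, rd.
constexpr uint64_t emulateSetsw(int32_t Val, bool EmitSra) {
  uint32_t Bits = static_cast<uint32_t>(Val);
  uint64_t Rd = sethi(hi22(Bits));
  Rd = orImm(Rd, lo10(Bits));
  if (EmitSra)
    Rd = sraZero(Rd);
  return Rd;
}

// A non-negative value is already correct after sethi + or.
static_assert(emulateSetsw(0x12345678, false) == 0x12345678ull,
              "positive values need no sign extension");
// A negative value is only zero-extended until the sra is applied.
static_assert(emulateSetsw(-2, false) == 0xFFFFFFFEull,
              "without sra the upper 32 bits stay zero");
static_assert(emulateSetsw(-2, true) == 0xFFFFFFFFFFFFFFFEull,
              "sra sign-extends to the full 64-bit register");
```

This is consistent with the condition in the diff (`!IsImm || ImmValue < 0`): an expression whose value is not known at assembly time might be negative, so the sign extension has to be emitted in that case as well.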
[llvm-branch-commits] [llvm] [DXIL] Add support for root signature flag element in DXContainer (PR #123147)
@@ -0,0 +1,157 @@
+//===- DXILRootSignature.cpp - DXIL Root Signature helper objects ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file This file contains helper objects and APIs for working with DXIL
+/// Root Signatures.
+///
+//===--===//
+#include "DXILRootSignature.h"
+#include "DirectX.h"
+#include "llvm/ADT/StringSwitch.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Module.h"
+#include
+
+using namespace llvm;
+using namespace llvm::dxil;
+
+static bool reportError(Twine Message) {
+  report_fatal_error(Message, false);

bogner wrote:

I haven't had time to review this in detail yet, but one important note. We should not be using `report_fatal_error` for error handling here. This is essentially crashing the compiler and should be used *very* sparingly. If the errors can come from user input or from corrupt binary files, this type of error reporting will be a terrible user experience. I suspect it would be more appropriate to use `LLVMContext::diagnose` and the DiagnosticInfo machinery so that we can report issues back to the frontend.

https://github.com/llvm/llvm-project/pull/123147
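To illustrate the difference in shape between aborting and reporting, here is a standalone C++ sketch with no LLVM dependency. It does not use `LLVMContext::diagnose` or any real DiagnosticInfo class; `DiagnosticSink`, `Diagnostic`, `Severity`, and `parseRootFlags` are hypothetical stand-ins, and the validation rule is made up for the example. The point is only that the parser hands the problem to a handler owned by the caller and returns a failure value instead of terminating the process.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class Severity { Error, Warning };

// Hypothetical diagnostic record handed to the caller's handler.
struct Diagnostic {
  Severity Sev;
  std::string Message;
};

// Hypothetical sink playing the role of a context that owns a handler.
class DiagnosticSink {
public:
  using Handler = std::function<void(const Diagnostic &)>;

  explicit DiagnosticSink(Handler H) : H(std::move(H)) {}

  // Forward the diagnostic to the handler instead of aborting.
  void diagnose(Severity Sev, std::string Message) {
    if (Sev == Severity::Error)
      ++NumErrors;
    H(Diagnostic{Sev, std::move(Message)});
  }

  unsigned errorCount() const { return NumErrors; }

private:
  Handler H;
  unsigned NumErrors = 0;
};

// Hypothetical parser step: accept only values that fit in 32 bits.
// On bad input it reports an error and returns std::nullopt, so the
// caller (for example a frontend) decides how to surface the problem.
std::optional<uint32_t> parseRootFlags(int64_t Raw, DiagnosticSink &Diags) {
  if (Raw < 0 || Raw > 0xFFFFFFFFLL) {
    Diags.diagnose(Severity::Error,
                   "invalid root signature flags value: " +
                       std::to_string(Raw));
    return std::nullopt;
  }
  return static_cast<uint32_t>(Raw);
}

// Example handler target: collect messages so they can be shown later.
using MessageLog = std::vector<std::string>;
```

A caller could construct the sink with a lambda that appends `D.Message` to a `MessageLog`, call `parseRootFlags` on each input, and check `errorCount()` at the end, which keeps malformed input from taking down the whole compiler.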