[Lldb-commits] [clang] [clang-tools-extra] [compiler-rt] [flang] [libc] [lld] [lldb] [llvm] [mlir] [openmp] [llvm-project] Fix typo "seperate" (PR #95373)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/95373 None >From 6d326a96d2651f8836b29ff1e3edef022f41549e Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Thu, 13 Jun 2024 09:46:48 +0100 Subject: [PATCH] [llvm-project] Fix typo "seperate" --- clang-tools-extra/clangd/TidyProvider.cpp | 10 .../include/clang/Frontend/FrontendOptions.h | 2 +- .../include/clang/InstallAPI/DylibVerifier.h | 2 +- clang/lib/InstallAPI/Visitor.cpp | 2 +- clang/lib/Serialization/ASTWriterStmt.cpp | 2 +- compiler-rt/test/dfsan/custom.cpp | 2 +- .../Linux/ppc64/trivial-tls-pwr10.test| 2 +- .../FlangOmpReport/yaml_summarizer.py | 2 +- flang/lib/Semantics/check-omp-structure.cpp | 10 flang/test/Driver/mllvm_vs_mmlir.f90 | 2 +- libc/src/__support/FPUtil/x86_64/FEnvImpl.h | 2 +- .../stdio/printf_core/float_hex_converter.h | 10 .../str_to_float_comparison_test.cpp | 2 +- lld/test/wasm/data-segments.ll| 2 +- .../lldb/Expression/DWARFExpressionList.h | 2 +- lldb/include/lldb/Target/MemoryTagManager.h | 2 +- .../NativeRegisterContextLinux_arm64.cpp | 2 +- lldb/test/API/CMakeLists.txt | 2 +- .../TestGdbRemoteMemoryTagging.py | 2 +- .../DW_AT_data_bit_offset-DW_OP_stack_value.s | 2 +- llvm/include/llvm/CodeGen/LiveRegUnits.h | 2 +- llvm/include/llvm/CodeGen/MIRFormatter.h | 2 +- llvm/include/llvm/MC/MCAsmInfo.h | 2 +- llvm/include/llvm/Support/raw_socket_stream.h | 2 +- llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.h | 2 +- .../CodeGen/AssignmentTrackingAnalysis.cpp| 6 ++--- .../SelectionDAG/SelectionDAGBuilder.cpp | 4 ++-- llvm/lib/FileCheck/FileCheck.cpp | 2 +- llvm/lib/IR/DebugInfo.cpp | 2 +- llvm/lib/MC/MCPseudoProbe.cpp | 2 +- llvm/lib/Support/VirtualFileSystem.cpp| 2 +- llvm/lib/Support/raw_socket_stream.cpp| 2 +- llvm/lib/Target/ARM/ARMISelLowering.cpp | 2 +- .../Target/RISCV/RISCVMachineFunctionInfo.h | 2 +- llvm/lib/TargetParser/RISCVISAInfo.cpp| 2 +- llvm/lib/TextAPI/Utils.cpp| 2 +- llvm/lib/Transforms/IPO/Attributor.cpp| 4 ++-- 
.../lib/Transforms/IPO/SampleProfileProbe.cpp | 2 +- .../Scalar/RewriteStatepointsForGC.cpp| 2 +- .../Transforms/Utils/LoopUnrollRuntime.cpp| 2 +- llvm/test/CodeGen/X86/AMX/amx-greedy-ra.ll| 2 +- llvm/test/CodeGen/X86/apx/shift-eflags.ll | 24 +-- .../X86/merge-consecutive-stores-nt.ll| 4 ++-- llvm/test/CodeGen/X86/shift-eflags.ll | 24 +-- .../InstSimplify/constant-fold-fp-denormal.ll | 2 +- .../LoopVectorize/LoongArch/defaults.ll | 2 +- .../LoopVectorize/RISCV/defaults.ll | 2 +- .../split-gep-or-as-add.ll| 2 +- llvm/test/Verifier/alloc-size-failedparse.ll | 2 +- llvm/test/tools/llvm-ar/windows-path.test | 2 +- .../ELF/mirror-permissions-win.test | 2 +- llvm/tools/llvm-cov/CodeCoverage.cpp | 2 +- llvm/tools/llvm-profgen/PerfReader.cpp| 2 +- llvm/unittests/Support/Path.cpp | 4 ++-- .../Analysis/Presburger/IntegerRelation.h | 2 +- .../Analysis/Presburger/PresburgerSpace.h | 2 +- .../mlir/Dialect/OpenMP/OpenMPInterfaces.h| 2 +- .../Analysis/Presburger/PresburgerSpace.cpp | 2 +- .../lib/Conversion/GPUCommon/GPUOpsLowering.h | 2 +- .../LLVMIR/IR/BasicPtxBuilderInterface.cpp| 2 +- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 6 ++--- .../CPU/sparse_reduce_custom_prod.mlir| 2 +- .../omptarget-constant-alloca-raise.mlir | 2 +- openmp/tools/Modules/FindOpenMPTarget.cmake | 2 +- 64 files changed, 106 insertions(+), 106 deletions(-) diff --git a/clang-tools-extra/clangd/TidyProvider.cpp b/clang-tools-extra/clangd/TidyProvider.cpp index a4121df30d3df..a87238e0c0938 100644 --- a/clang-tools-extra/clangd/TidyProvider.cpp +++ b/clang-tools-extra/clangd/TidyProvider.cpp @@ -195,10 +195,10 @@ TidyProvider addTidyChecks(llvm::StringRef Checks, } TidyProvider disableUnusableChecks(llvm::ArrayRef ExtraBadChecks) { - constexpr llvm::StringLiteral Seperator(","); + constexpr llvm::StringLiteral Separator(","); static const std::string BadChecks = llvm::join_items( - Seperator, - // We want this list to start with a seperator to + Separator, + // We want this list to start with a separator 
to // simplify appending in the lambda. So including an // empty string here will force that. "", @@ -227,7 +227,7 @@ TidyProvider disableUnusableChecks(llvm::ArrayRef ExtraBadChecks) { for (const std::string &Str : ExtraBadChecks) { if (Str.empty()) continue; -Size += Seperator.size(); +Size += Separator.size(); if (LLVM_LIKELY(Str.front()
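The comment in the TidyProvider hunk above says the list is made to start with a separator by passing an empty first item. A minimal sketch of that idea follows, using plain `std::string` rather than LLVM's types. `joinItems` is a hypothetical stand-in for `llvm::join_items`, not the real function, and the check names in the usage note are made up.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::join_items: joins Items, placing Sep
// between each adjacent pair.
std::string joinItems(const std::string &Sep,
                      const std::vector<std::string> &Items) {
  std::string Result;
  for (std::size_t I = 0; I < Items.size(); ++I) {
    if (I != 0)
      Result += Sep;
    Result += Items[I];
  }
  return Result;
}
```

With an empty first item, `joinItems(",", {"", "-a", "-b"})` yields `,-a,-b`, so the result can be appended directly to an existing comma-separated list without a special case for the first entry.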
[Lldb-commits] [clang] [clang-tools-extra] [compiler-rt] [flang] [libc] [lld] [lldb] [llvm] [mlir] [openmp] [llvm-project] Fix typo "seperate" (PR #95373)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/95373 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [llvm] [lldb] [mlir] [libcxx] [openmp] [flang] [libcxxabi] [compiler-rt] [clang] [AMDGPU] Add GFX12 encoding for VINTERP instructions (PR #74616)
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/74616 >From 69580e5f77514fecf0aabe2a80c98881f9bd7288 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Tue, 7 Feb 2023 16:27:27 + Subject: [PATCH 1/2] [AMDGPU] Add GFX12 encoding for VINTERP instructions --- .../Disassembler/AMDGPUDisassembler.cpp | 6 +- llvm/lib/Target/AMDGPU/VINTERPInstructions.td | 38 ++- llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s | 187 ++--- .../AMDGPU/gfx12_dasm_vinterp.txt | 251 ++ 4 files changed, 378 insertions(+), 104 deletions(-) create mode 100644 llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vinterp.txt diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp index 3175f6358a045..c37af739e2019 100644 --- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp +++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp @@ -782,9 +782,13 @@ DecodeStatus AMDGPUDisassembler::convertEXPInst(MCInst &MI) const { DecodeStatus AMDGPUDisassembler::convertVINTERPInst(MCInst &MI) const { if (MI.getOpcode() == AMDGPU::V_INTERP_P10_F16_F32_inreg_gfx11 || + MI.getOpcode() == AMDGPU::V_INTERP_P10_F16_F32_inreg_gfx12 || MI.getOpcode() == AMDGPU::V_INTERP_P10_RTZ_F16_F32_inreg_gfx11 || + MI.getOpcode() == AMDGPU::V_INTERP_P10_RTZ_F16_F32_inreg_gfx12 || MI.getOpcode() == AMDGPU::V_INTERP_P2_F16_F32_inreg_gfx11 || - MI.getOpcode() == AMDGPU::V_INTERP_P2_RTZ_F16_F32_inreg_gfx11) { + MI.getOpcode() == AMDGPU::V_INTERP_P2_F16_F32_inreg_gfx12 || + MI.getOpcode() == AMDGPU::V_INTERP_P2_RTZ_F16_F32_inreg_gfx11 || + MI.getOpcode() == AMDGPU::V_INTERP_P2_RTZ_F16_F32_inreg_gfx12) { // The MCInst has this field that is not directly encoded in the // instruction. 
insertNamedMCOperand(MI, MCOperand::createImm(0), AMDGPU::OpName::op_sel); diff --git a/llvm/lib/Target/AMDGPU/VINTERPInstructions.td b/llvm/lib/Target/AMDGPU/VINTERPInstructions.td index 7d03150bf5b11..fc563b7493adf 100644 --- a/llvm/lib/Target/AMDGPU/VINTERPInstructions.td +++ b/llvm/lib/Target/AMDGPU/VINTERPInstructions.td @@ -10,7 +10,7 @@ // VINTERP encoding //===--===// -class VINTERPe_gfx11 op, VOPProfile P> : Enc64 { +class VINTERPe : Enc64 { bits<8> vdst; bits<4> src0_modifiers; bits<9> src0; @@ -31,7 +31,6 @@ class VINTERPe_gfx11 op, VOPProfile P> : Enc64 { let Inst{13}= !if(P.HasOpSel, src2_modifiers{2}, 0); // op_sel(2) let Inst{14}= !if(P.HasOpSel, src0_modifiers{3}, 0); // op_sel(3) let Inst{15}= clamp; - let Inst{22-16} = op; let Inst{40-32} = src0; let Inst{49-41} = src1; let Inst{58-50} = src2; @@ -40,6 +39,14 @@ class VINTERPe_gfx11 op, VOPProfile P> : Enc64 { let Inst{63}= src2_modifiers{0}; // neg(2) } +class VINTERPe_gfx11 op, VOPProfile P> : VINTERPe { + let Inst{22-16} = op; +} + +class VINTERPe_gfx12 op, VOPProfile P> : VINTERPe { + let Inst{20-16} = op{4-0}; +} + //===--===// // VOP3 VINTERP //===--===// @@ -171,17 +178,28 @@ defm : VInterpF16Pat op> { +multiclass VINTERP_Real_gfx11 op> { + let AssemblerPredicate = isGFX11Only, DecoderNamespace = "GFX11" in { def _gfx11 : VINTERP_Real(NAME), SIEncodingFamily.GFX11>, VINTERPe_gfx11(NAME).Pfl>; } } -defm V_INTERP_P10_F32_inreg : VINTERP_Real_gfx11<0x000>; -defm V_INTERP_P2_F32_inreg : VINTERP_Real_gfx11<0x001>; -defm V_INTERP_P10_F16_F32_inreg : VINTERP_Real_gfx11<0x002>; -defm V_INTERP_P2_F16_F32_inreg : VINTERP_Real_gfx11<0x003>; -defm V_INTERP_P10_RTZ_F16_F32_inreg : VINTERP_Real_gfx11<0x004>; -defm V_INTERP_P2_RTZ_F16_F32_inreg : VINTERP_Real_gfx11<0x005>; +multiclass VINTERP_Real_gfx12 op> { + let AssemblerPredicate = isGFX12Only, DecoderNamespace = "GFX12" in { +def _gfx12 : + VINTERP_Real(NAME), SIEncodingFamily.GFX12>, + VINTERPe_gfx12(NAME).Pfl>; + } +} + +multiclass 
VINTERP_Real_gfx11_gfx12 op> : + VINTERP_Real_gfx11, VINTERP_Real_gfx12; + +defm V_INTERP_P10_F32_inreg : VINTERP_Real_gfx11_gfx12<0x000>; +defm V_INTERP_P2_F32_inreg : VINTERP_Real_gfx11_gfx12<0x001>; +defm V_INTERP_P10_F16_F32_inreg : VINTERP_Real_gfx11_gfx12<0x002>; +defm V_INTERP_P2_F16_F32_inreg : VINTERP_Real_gfx11_gfx12<0x003>; +defm V_INTERP_P10_RTZ_F16_F32_inreg : VINTERP_Real_gfx11_gfx12<0x004>; +defm V_INTERP_P2_RTZ_F16_F32_inreg : VINTERP_Real_gfx11_gfx12<0x005>; diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s b/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s index e2e53776783f3..fdfbf65c0e3cf 100644 --- a/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s +++ b/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s @@ -1,277 +1,278 @@ -// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck -check-p
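The TableGen change above moves the opcode field out of the shared `VINTERPe` class: the GFX11 subclass sets `Inst{22-16} = op`, while the GFX12 subclass sets only `Inst{20-16} = op{4-0}`. A rough C++ sketch of what those two `let` lines mean at the bit level is given here. The function names are illustrative only, and what GFX12 places in bits 22..21 is not visible in this excerpt, so the sketch leaves them untouched.

```cpp
#include <cstdint>

// Writes a 7-bit opcode into bits 22..16, mirroring `let Inst{22-16} = op;`.
uint64_t encodeOpGfx11(uint64_t Inst, uint8_t Op) {
  const uint64_t Mask = 0x7Full << 16;
  return (Inst & ~Mask) | ((static_cast<uint64_t>(Op) & 0x7F) << 16);
}

// Writes only the low 5 opcode bits into bits 20..16, mirroring
// `let Inst{20-16} = op{4-0};`. Bits 22..21 are left as they were
// (assumption: the excerpt does not say what GFX12 stores there).
uint64_t encodeOpGfx12(uint64_t Inst, uint8_t Op) {
  const uint64_t Mask = 0x1Full << 16;
  return (Inst & ~Mask) | ((static_cast<uint64_t>(Op) & 0x1F) << 16);
}
```

For the opcodes defined in the patch (0x000 to 0x005) both functions produce the same field value, which is consistent with the same `defm` lines being reused for both targets.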
[Lldb-commits] [lld] [mlir] [clang-tools-extra] [libcxxabi] [lldb] [flang] [compiler-rt] [openmp] [libcxx] [clang] [llvm] [AMDGPU] Add GFX12 encoding for VINTERP instructions (PR #74616)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/74616
[Lldb-commits] [llvm] [libcxxabi] [clang-tools-extra] [lldb] [clang] [lld] [compiler-rt] [flang] [libcxx] [AMDGPU] GFX12: select @llvm.prefetch intrinsic (PR #74576)
@@ -959,6 +967,32 @@ def : GCNPat <
 }
 } // let OtherPredicates = [HasShaderCyclesRegister]

+def SIMM24bitPtr : ImmLeaf (Imm);}]
+>;
+
+multiclass SMPrefetchPat {
+ def : GCNPat <
+(smrd_prefetch (SMRDImm i64:$sbase, i32:$offset), timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type) $sbase, $offset, (i32 SGPR_NULL), (i8 0))
+ >;
+
+ def : GCNPat <
+(smrd_prefetch (i64 SReg_64:$sbase), timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type) $sbase, 0, (i32 SGPR_NULL), (i8 0))
+ >;
+
+ def : GCNPat <
+(prefetch SIMM24bitPtr:$offset, timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type#"_PC_REL") (as_i32timm $offset), (i32 SGPR_NULL), (i8 0))
+ > {
+let AddedComplexity = 10;
+ }

jayfoad wrote:

But that is how `llvm.prefetch` is defined: "`address` is the address to be prefetched". A different operation should use a different intrinsic.

https://github.com/llvm/llvm-project/pull/74576
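The point of the comment above is that the first operand of `llvm.prefetch` is a real address, not an offset. A small C++ sketch of the ordinary data-prefetch case is included for illustration. It assumes a GCC- or Clang-compatible compiler, where `__builtin_prefetch(addr, rw, locality)` is available and, on Clang, is lowered to `llvm.prefetch`. The function name and the prefetch distance are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sums Data while hinting that elements a fixed distance ahead will be
// read soon. The first argument to __builtin_prefetch is the address to
// prefetch; rw = 0 means "read", locality = 3 means "keep in cache".
long sumWithPrefetch(const std::vector<int> &Data) {
  constexpr std::size_t Ahead = 16; // hypothetical prefetch distance
  long Sum = 0;
  for (std::size_t I = 0; I < Data.size(); ++I) {
    if (I + Ahead < Data.size())
      __builtin_prefetch(&Data[I + Ahead], /*rw=*/0, /*locality=*/3);
    Sum += Data[I];
  }
  return Sum;
}
```

The prefetch is only a hint, so the result of the function is the same with or without it.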
[Lldb-commits] [compiler-rt] [flang] [lldb] [lld] [clang] [llvm] [libcxxabi] [libcxx] [clang-tools-extra] [AMDGPU] GFX12: select @llvm.prefetch intrinsic (PR #74576)
@@ -959,6 +967,32 @@ def : GCNPat <
 }
 } // let OtherPredicates = [HasShaderCyclesRegister]

+def SIMM24bitPtr : ImmLeaf (Imm);}]
+>;
+
+multiclass SMPrefetchPat {
+ def : GCNPat <
+(smrd_prefetch (SMRDImm i64:$sbase, i32:$offset), timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type) $sbase, $offset, (i32 SGPR_NULL), (i8 0))
+ >;
+
+ def : GCNPat <
+(smrd_prefetch (i64 SReg_64:$sbase), timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type) $sbase, 0, (i32 SGPR_NULL), (i8 0))
+ >;
+
+ def : GCNPat <
+(prefetch SIMM24bitPtr:$offset, timm, timm, (i32 cache_type)),
+(!cast("S_PREFETCH_"#type#"_PC_REL") (as_i32timm $offset), (i32 SGPR_NULL), (i8 0))
+ > {
+let AddedComplexity = 10;
+ }

jayfoad wrote:

I really don't know. What would the use cases look like? Maybe it could be a generic intrinsic, if there is consensus that it is useful.

For the existing llvm.prefetch intrinsic, the only useful case I can think of for instruction prefetching is:

```
define @f0() {
  call @llvm.prefetch(@f1, ...)
  ...
  call @f1()
}
define @f1() {
  ...
}
```

to prefetch the code at the start of a function you are going to call. We could codegen that case using the _pc_rel form of the instruction.

https://github.com/llvm/llvm-project/pull/74576
[Lldb-commits] [lld] [clang] [compiler-rt] [lldb] [libcxx] [flang] [libc] [clang-tools-extra] [llvm] [GlobalISel] Add G_PREFETCH (PR #74863)
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/74863 >From e406c734609d3cd1ae436084c42c1c63d8af2795 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Fri, 8 Dec 2023 14:08:09 + Subject: [PATCH 1/2] [GlobalISel] Add G_PREFETCH --- .../CodeGen/GlobalISel/MachineIRBuilder.h | 4 ++ llvm/include/llvm/Support/TargetOpcodes.def | 3 + llvm/include/llvm/Target/GenericOpcodes.td| 9 +++ llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp | 12 .../CodeGen/GlobalISel/MachineIRBuilder.cpp | 10 +++ llvm/lib/CodeGen/MachineVerifier.cpp | 23 +++ llvm/lib/IR/Verifier.cpp | 2 +- llvm/lib/Target/AArch64/AArch64InstrGISel.td | 4 +- .../AArch64/GISel/AArch64LegalizerInfo.cpp| 55 .../AArch64/GISel/AArch64LegalizerInfo.h | 1 + .../GlobalISel/legalizer-info-validation.mir | 3 + llvm/test/MachineVerifier/test_g_prefetch.mir | 40 .../builtins/match-table-replacerreg.td | 20 +++--- .../match-table-imms.td | 28 - .../match-table-patfrag-root.td | 2 +- .../GlobalISelCombinerEmitter/match-table.td | 62 +-- llvm/test/TableGen/GlobalISelEmitter.td | 2 +- 17 files changed, 195 insertions(+), 85 deletions(-) create mode 100644 llvm/test/MachineVerifier/test_g_prefetch.mir diff --git a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h index 3d36d06a7e9da..eb846acde3e04 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h @@ -1529,6 +1529,10 @@ class MachineIRBuilder { /// Build and insert `G_FENCE Ordering, Scope`. 
MachineInstrBuilder buildFence(unsigned Ordering, unsigned Scope); + /// Build and insert G_PREFETCH \p Addr, \p RW, \p Locality, \p CacheType + MachineInstrBuilder buildPrefetch(const SrcOp &Addr, unsigned RW, +unsigned Locality, unsigned CacheType); + /// Build and insert \p Dst = G_FREEZE \p Src MachineInstrBuilder buildFreeze(const DstOp &Dst, const SrcOp &Src) { return buildInstr(TargetOpcode::G_FREEZE, {Dst}, {Src}); diff --git a/llvm/include/llvm/Support/TargetOpcodes.def b/llvm/include/llvm/Support/TargetOpcodes.def index 941c6d5f8cad8..91d9eb745a48f 100644 --- a/llvm/include/llvm/Support/TargetOpcodes.def +++ b/llvm/include/llvm/Support/TargetOpcodes.def @@ -415,6 +415,9 @@ HANDLE_TARGET_OPCODE_MARKER(GENERIC_ATOMICRMW_OP_END, G_ATOMICRMW_UDEC_WRAP) // Generic atomic fence HANDLE_TARGET_OPCODE(G_FENCE) +/// Generic prefetch +HANDLE_TARGET_OPCODE(G_PREFETCH) + /// Generic conditional branch instruction. HANDLE_TARGET_OPCODE(G_BRCOND) diff --git a/llvm/include/llvm/Target/GenericOpcodes.td b/llvm/include/llvm/Target/GenericOpcodes.td index 9a9c09d3c20d6..73e38b15bf671 100644 --- a/llvm/include/llvm/Target/GenericOpcodes.td +++ b/llvm/include/llvm/Target/GenericOpcodes.td @@ -1209,6 +1209,15 @@ def G_FENCE : GenericInstruction { let hasSideEffects = true; } +// Generic opcode equivalent to the llvm.prefetch intrinsic. 
+def G_PREFETCH : GenericInstruction { + let OutOperandList = (outs); + let InOperandList = (ins ptype0:$address, i32imm:$rw, i32imm:$locality, i32imm:$cachetype); + let hasSideEffects = true; + let mayLoad = true; + let mayStore = true; +} + //-- // Variadic ops //-- diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp index 14a4e72152e7c..b2850846bde67 100644 --- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp +++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp @@ -2435,6 +2435,18 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID, MIRBuilder.buildInstr(TargetOpcode::G_RESET_FPMODE, {}, {}); return true; } + case Intrinsic::prefetch: { +Value *Addr = CI.getOperand(0); +ConstantInt *RW = cast(CI.getOperand(1)); +ConstantInt *Locality = cast(CI.getOperand(2)); +ConstantInt *CacheType = cast(CI.getOperand(3)); + +MIRBuilder.buildPrefetch(getOrCreateVReg(*Addr), RW->getZExtValue(), + Locality->getZExtValue(), + CacheType->getZExtValue()); + +return true; + } #define INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC) \ case Intrinsic::INTRINSIC: #include "llvm/IR/ConstrainedOps.def" diff --git a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp index 80e9c08e850b6..f7febc9357c11 100644 --- a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp +++ b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp @@ -1051,6 +1051,16 @@ MachineIRBuilder::buildFence(unsigned Ordering, unsigned Scope) { .addImm(
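`G_PREFETCH` carries the same three immediates as the `llvm.prefetch` intrinsic. The LLVM Language Reference defines `rw` as 0 (read) or 1 (write), `locality` as 0 to 3, and the cache type as 0 (instruction) or 1 (data). The sketch below shows a range check along those lines. It is not the MachineVerifier code from the patch, which is not visible in this excerpt, and the struct, function name and diagnostic strings are all hypothetical.

```cpp
#include <cstdint>
#include <string>

// Hypothetical container for the three immediate operands of a prefetch.
struct PrefetchOperands {
  uint64_t RW;
  uint64_t Locality;
  uint64_t CacheType;
};

// Returns an empty string when the operands are within the ranges that the
// LangRef gives for llvm.prefetch, otherwise a (made-up) diagnostic.
std::string checkPrefetchOperands(const PrefetchOperands &Ops) {
  if (Ops.RW > 1)
    return "rw must be 0 (read) or 1 (write)";
  if (Ops.Locality > 3)
    return "locality must be in the range 0-3";
  if (Ops.CacheType > 1)
    return "cache type must be 0 (instruction) or 1 (data)";
  return "";
}
```

A call such as `checkPrefetchOperands({0, 3, 1})` describes a read prefetch of data with maximum locality and passes the check.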
[Lldb-commits] [llvm] [flang] [clang] [lld] [clang-tools-extra] [libcxx] [lldb] [libc] [compiler-rt] [GlobalISel] Add G_PREFETCH (PR #74863)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/74863
[Lldb-commits] [clang-tools-extra] [llvm] [lldb] [libcxx] [compiler-rt] [libc] [flang] [clang] [lld] [AMDGPU] Use alias info to relax waitcounts for LDS DMA (PR #74537)
@@ -0,0 +1,154 @@
+; RUN: llc -march=amdgcn -mcpu=gfx900 < %s | FileCheck %s --check-prefixeses=GCN,GFX9
+; RUN: llc -march=amdgcn -mcpu=gfx1030 < %s | FileCheck %s --check-prefixeses=GCN,GFX10

jayfoad wrote:

> --check-prefixeses

That's what happens when you enable `M-x gollum-mode` in Emacs.

https://github.com/llvm/llvm-project/pull/74537
[Lldb-commits] [lld] [compiler-rt] [libc] [clang] [libcxx] [lldb] [flang] [mlir] [llvm] [clang-tools-extra] [AMDGPU] GFX12: Add Split Workgroup Barrier (PR #74836)
@@ -684,6 +684,51 @@ s_rndne_f16 s5, 0xfe0b
 s_rndne_f16 s5, 0x3456
 // GFX12: encoding: [0xff,0x6e,0x85,0xbe,0x56,0x34,0x00,0x00]

+s_barrier_signal -2

jayfoad wrote:

Missing `s_get_barrier_state` tests in this file?

https://github.com/llvm/llvm-project/pull/74836
[Lldb-commits] [llvm] [flang] [compiler-rt] [lld] [libcxx] [clang] [libcxxabi] [clang-tools-extra] [lldb] [AMDGPU] GFX12: select @llvm.prefetch intrinsic (PR #74576)
@@ -3164,6 +3164,18 @@ def : GCNPat <
 (as_i1timm $bound_ctrl))
 >;

+class SMPrefetchGetPcPat : GCNPat <

jayfoad wrote:

This pattern also interprets the "address" argument as being an offset from PC, so it should also be removed from this version of the patch.

https://github.com/llvm/llvm-project/pull/74576
[Lldb-commits] [llvm] [libcxx] [lldb] [clang] [lld] [flang] [compiler-rt] [clang-tools-extra] [libc] [AMDGPU] Use alias info to relax waitcounts for LDS DMA (PR #74537)
jayfoad wrote:

How does this work in a case like this?

```
call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) @lds.3, i32 4, i32 0, i32 0, i32 0, i32 0)
call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) %ptr, i32 4, i32 0, i32 0, i32 0, i32 0)
%val.3 = load float, ptr addrspace(3) @lds.3, align 4
```

i.e.
- store to known lds address `@lds.3` (this will use slot 0 and another slot e.g. slot 3?)
- store to unknown lds address (this will use slot 0?)
- load from known lds address `@lds.3` (this will use slot 3?)

https://github.com/llvm/llvm-project/pull/74537
[Lldb-commits] [flang] [clang-tools-extra] [lld] [llvm] [compiler-rt] [lldb] [libc] [libcxx] [clang] [AMDGPU] Use alias info to relax waitcounts for LDS DMA (PR #74537)
jayfoad wrote:

> > How does this work in a case like this?
> > ```
> > call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) @lds.3, i32 4, i32 0, i32 0, i32 0, i32 0)
> > call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) %ptr, i32 4, i32 0, i32 0, i32 0, i32 0)
> > %val.3 = load float, ptr addrspace(3) @lds.3, align 4
> > ```
> > i.e.
> > * store to known lds address `@lds.3` (this will use slot 0 and another slot e.g. slot 3?)
> > * store to unknown lds address (this will use slot 0?)
> > * load from known lds address `@lds.3` (this will use slot 3?)
>
> It does not know the pointer, so it uses default slot 0 and waits till 0.

Test case:

```
@lds.0 = internal addrspace(3) global [64 x float] poison, align 16
@lds.1 = internal addrspace(3) global [64 x float] poison, align 16

declare void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) nocapture, i32 %size, i32 %voffset, i32 %soffset, i32 %offset, i32 %aux)

define amdgpu_kernel void @f(<4 x i32> %rsrc, i32 %i1, i32 %i2, ptr addrspace(1) %out, ptr addrspace(3) %ptr) {
main_body:
  call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) @lds.0, i32 4, i32 0, i32 0, i32 0, i32 0)
  call void @llvm.amdgcn.raw.buffer.load.lds(<4 x i32> %rsrc, ptr addrspace(3) %ptr, i32 4, i32 0, i32 0, i32 0, i32 0)
  %gep.0 = getelementptr float, ptr addrspace(3) @lds.0, i32 %i1
  %gep.1 = getelementptr float, ptr addrspace(3) @lds.1, i32 %i2
  %val.0 = load volatile float, ptr addrspace(3) %gep.0, align 4
  %val.1 = load volatile float, ptr addrspace(3) %gep.1, align 4
  %out.gep.1 = getelementptr float, ptr addrspace(1) %out, i32 1
  store float %val.0, ptr addrspace(1) %out
  store float %val.1, ptr addrspace(1) %out.gep.1
  ret void
}
```

Generates:

```
s_load_dwordx8 s[4:11], s[0:1], 0x24
s_load_dword s2, s[0:1], 0x44
s_mov_b32 m0, 0
v_mov_b32_e32 v2, 0
s_waitcnt lgkmcnt(0)
buffer_load_dword off, s[4:7], 0 lds
s_mov_b32 m0, s2
s_lshl_b32 s0, s8, 2
buffer_load_dword off, s[4:7], 0 lds
s_lshl_b32 s1, s9, 2
v_mov_b32_e32 v0, s0
v_mov_b32_e32 v1, s1
s_waitcnt vmcnt(1)
ds_read_b32 v0, v0
s_waitcnt vmcnt(0)
ds_read_b32 v1, v1 offset:256
s_waitcnt lgkmcnt(0)
global_store_dwordx2 v2, v[0:1], s[10:11]
s_endpgm
```

The `s_waitcnt vmcnt(1)` seems incorrect, because the second buffer-load-to-lds might clobber `@lds.0`.

> I have to tell anyone interested here: before I even wrote this code it didn't know of the dependency and did not wait for anything at all. Everyone was happy.

I am still happy, because buffer/flat/global-load-to-lds was removed in GFX11.

https://github.com/llvm/llvm-project/pull/74537
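The objection to `s_waitcnt vmcnt(1)` can be modelled in a few lines of C++. This is a simplified sketch, not the pass under review. It assumes the outstanding loads complete in issue order, so that waiting for `vmcnt(N)` means all but the N most recently issued ones are done, and it assumes a load-to-LDS with an unknown destination may alias any LDS object. All names are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// One outstanding load-to-LDS. Target is empty when the destination
// pointer is not known at compile time.
struct PendingDma {
  std::optional<std::string> Target;
};

// Returns the vmcnt value a read of Object has to wait for, or nullopt if
// no pending operation can alias it. Pending is in issue order, oldest
// first. The newest possibly-aliasing operation decides the count.
std::optional<unsigned> requiredVmcnt(const std::vector<PendingDma> &Pending,
                                      const std::string &Object) {
  for (std::size_t I = Pending.size(); I-- > 0;) {
    const std::optional<std::string> &T = Pending[I].Target;
    const bool MayAlias = !T.has_value() || *T == Object;
    if (MayAlias)
      return static_cast<unsigned>(Pending.size() - 1 - I);
  }
  return std::nullopt;
}
```

For the test case above, the pending list is a write to `lds.0` followed by a write to an unknown pointer. Under this model the read of `lds.0` needs `vmcnt(0)`, because the second, unknown write is the newest operation that may alias it.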
[Lldb-commits] [lldb] [llvm] [mlir] [openmp] [libc] [flang] [clang] [AMDGPU] GFX12 global_atomic_ordered_add_b64 instruction and intrinsic (PR #76149)
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/76149 >From b14a554a15e4de88c9afc428f9c6898090e6eb23 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Thu, 21 Dec 2023 12:00:26 + Subject: [PATCH] [AMDGPU] GFX12 global_atomic_ordered_add_b64 instruction and intrinsic --- llvm/include/llvm/IR/IntrinsicsAMDGPU.td | 10 ++- llvm/lib/Target/AMDGPU/AMDGPUInstructions.td | 1 + .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp | 1 + .../Target/AMDGPU/AMDGPUSearchableTables.td | 1 + llvm/lib/Target/AMDGPU/FLATInstructions.td| 11 +++- llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 1 + ...vm.amdgcn.global.atomic.ordered.add.b64.ll | 65 +++ llvm/test/MC/AMDGPU/gfx11_unsupported.s | 3 + llvm/test/MC/AMDGPU/gfx12_asm_vflat.s | 24 +++ .../Disassembler/AMDGPU/gfx12_dasm_vflat.txt | 12 10 files changed, 124 insertions(+), 5 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.global.atomic.ordered.add.b64.ll diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 51bd9b63c127ed..3985c8871e1615 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -10,6 +10,8 @@ // //===--===// +def global_ptr_ty : LLVMQualPointerType<1>; + class AMDGPUReadPreloadRegisterIntrinsic : DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>; @@ -2353,10 +2355,10 @@ def int_amdgcn_s_get_waveid_in_workgroup : Intrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrHasSideEffects, IntrWillReturn, IntrNoCallback, IntrNoFree]>; -class AMDGPUGlobalAtomicRtn : Intrinsic < +class AMDGPUGlobalAtomicRtn : Intrinsic < [vt], - [llvm_anyptr_ty,// vaddr - vt], // vdata(VGPR) + [pt, // vaddr + vt], // vdata(VGPR) [IntrArgMemOnly, IntrWillReturn, NoCapture>, IntrNoCallback, IntrNoFree], "", [SDNPMemOperand]>; @@ -2486,6 +2488,8 @@ def int_amdgcn_permlanex16_var : ClangBuiltin<"__builtin_amdgcn_permlanex16_var" [IntrNoMem, IntrConvergent, IntrWillReturn, ImmArg>, ImmArg>, IntrNoCallback, 
IntrNoFree]>; +def int_amdgcn_global_atomic_ordered_add_b64 : AMDGPUGlobalAtomicRtn; + def int_amdgcn_flat_atomic_fmin_num : AMDGPUGlobalAtomicRtn; def int_amdgcn_flat_atomic_fmax_num : AMDGPUGlobalAtomicRtn; def int_amdgcn_global_atomic_fmin_num : AMDGPUGlobalAtomicRtn; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td index eaf72d7157ee2d..36e07d944c942c 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td +++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td @@ -642,6 +642,7 @@ defm int_amdgcn_global_atomic_fmax : noret_op; defm int_amdgcn_global_atomic_csub : noret_op; defm int_amdgcn_flat_atomic_fadd : local_addr_space_atomic_op; defm int_amdgcn_ds_fadd_v2bf16 : noret_op; +defm int_amdgcn_global_atomic_ordered_add_b64 : noret_op; defm int_amdgcn_flat_atomic_fmin_num : noret_op; defm int_amdgcn_flat_atomic_fmax_num : noret_op; defm int_amdgcn_global_atomic_fmin_num : noret_op; diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp index c9412f720c62ec..fba060464a6e74 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp @@ -4690,6 +4690,7 @@ AMDGPURegisterBankInfo::getInstrMapping(const MachineInstr &MI) const { case Intrinsic::amdgcn_flat_atomic_fmax_num: case Intrinsic::amdgcn_global_atomic_fadd_v2bf16: case Intrinsic::amdgcn_flat_atomic_fadd_v2bf16: +case Intrinsic::amdgcn_global_atomic_ordered_add_b64: return getDefaultMappingAllVGPR(MI); case Intrinsic::amdgcn_ds_ordered_add: case Intrinsic::amdgcn_ds_ordered_swap: diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td b/llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td index beb670669581f1..4cc8871a00fe1f 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td +++ b/llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td @@ -243,6 +243,7 @@ def : SourceOfDivergence; def : SourceOfDivergence; def : SourceOfDivergence; def 
: SourceOfDivergence; +def : SourceOfDivergence; def : SourceOfDivergence; def : SourceOfDivergence; def : SourceOfDivergence; diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td b/llvm/lib/Target/AMDGPU/FLATInstructions.td index 0dd2b3f5c2c912..615f8cd54d8f9c 100644 --- a/llvm/lib/Target/AMDGPU/FLATInstructions.td +++ b/llvm/lib/Target/AMDGPU/FLATInstructions.td @@ -926,9 +926,11 @@ defm GLOBAL_LOAD_LDS_USHORT : FLAT_Global_Load_LDS_Pseudo <"global_load_lds_usho defm GLOBAL_LOAD_LDS_SSHORT : FLAT_Global_Load_LDS_Pseudo <"global_load_lds_sshort">; defm GLOBAL_LOAD_LDS_DWORD : FLAT_Global_Load_LDS_Pseudo <"global_load_lds_dword">; -} // End is_flat_global = 1 - +let Subt
[Lldb-commits] [clang] [openmp] [flang] [lldb] [libc] [mlir] [llvm] [AMDGPU] GFX12 global_atomic_ordered_add_b64 instruction and intrinsic (PR #76149)
jayfoad wrote: Ping! https://github.com/llvm/llvm-project/pull/76149 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [openmp] [clang] [libc] [mlir] [lldb] [flang] [llvm] [AMDGPU] GFX12 global_atomic_ordered_add_b64 instruction and intrinsic (PR #76149)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/76149
[Lldb-commits] [clang-tools-extra] [libclc] [lld] [flang] [mlir] [libcxx] [libunwind] [clang] [lldb] [libc] [llvm] [compiler-rt] [AMDGPU] Fix broken sign-extended subword buffer load combine (PR #7747
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/77470 >From ae231d88c5b5e2e0996edefd45389992f8e97d05 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Tue, 9 Jan 2024 13:16:24 + Subject: [PATCH 1/3] [AMDGPU] Precommit tests for broken combine Add tests for sign-extending the result of an unsigned subword buffer load from the wrong width. --- .../llvm.amdgcn.struct.buffer.load.ll | 82 +++ 1 file changed, 82 insertions(+) diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll index 81c0f7557e6417..fcd7821a86897e 100644 --- a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll +++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll @@ -500,6 +500,47 @@ define amdgpu_ps float @struct_buffer_load_i8_sext__sgpr_rsrc__vgpr_vindex__vgpr ret float %cast } +define amdgpu_ps float @struct_buffer_load_i8_sext_wrong_width(<4 x i32> inreg %rsrc, i32 %vindex, i32 %voffset, i32 inreg %soffset) { + ; GFX8-LABEL: name: struct_buffer_load_i8_sext_wrong_width + ; GFX8: bb.1 (%ir-block.0): + ; GFX8-NEXT: liveins: $sgpr2, $sgpr3, $sgpr4, $sgpr5, $sgpr6, $vgpr0, $vgpr1 + ; GFX8-NEXT: {{ $}} + ; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr2 + ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr3 + ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY $sgpr4 + ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY $sgpr5 + ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1, [[COPY2]], %subreg.sub2, [[COPY3]], %subreg.sub3 + ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY $sgpr6 + ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY4]], %subreg.sub0, [[COPY5]], %subreg.sub1 + ; GFX8-NEXT: [[BUFFER_LOAD_SBYTE_BOTHEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_SBYTE_BOTHEN 
[[REG_SEQUENCE1]], [[REG_SEQUENCE]], [[COPY6]], 0, 0, 0, implicit $exec :: (dereferenceable load (s8), addrspace 8) + ; GFX8-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_SBYTE_BOTHEN]] + ; GFX8-NEXT: SI_RETURN_TO_EPILOG implicit $vgpr0 + ; + ; GFX12-LABEL: name: struct_buffer_load_i8_sext_wrong_width + ; GFX12: bb.1 (%ir-block.0): + ; GFX12-NEXT: liveins: $sgpr2, $sgpr3, $sgpr4, $sgpr5, $sgpr6, $vgpr0, $vgpr1 + ; GFX12-NEXT: {{ $}} + ; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr2 + ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr3 + ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY $sgpr4 + ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY $sgpr5 + ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1, [[COPY2]], %subreg.sub2, [[COPY3]], %subreg.sub3 + ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX12-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY $sgpr6 + ; GFX12-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY4]], %subreg.sub0, [[COPY5]], %subreg.sub1 + ; GFX12-NEXT: [[BUFFER_LOAD_SBYTE_VBUFFER_BOTHEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_SBYTE_VBUFFER_BOTHEN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], [[COPY6]], 0, 0, 0, implicit $exec :: (dereferenceable load (s8), addrspace 8) + ; GFX12-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_SBYTE_VBUFFER_BOTHEN]] + ; GFX12-NEXT: SI_RETURN_TO_EPILOG implicit $vgpr0 + %val = call i8 @llvm.amdgcn.struct.buffer.load.i8(<4 x i32> %rsrc, i32 %vindex, i32 %voffset, i32 %soffset, i32 0) + %trunc = trunc i8 %val to i4 + %ext = sext i4 %trunc to i32 + %cast = bitcast i32 %ext to float + ret float %cast +} + define amdgpu_ps float @struct_buffer_load_i16_zext__sgpr_rsrc__vgpr_vindex__vgpr_voffset__sgpr_soffset(<4 x i32> inreg %rsrc, i32 %vindex, i32 %voffset, i32 inreg %soffset) { ; GFX8-LABEL: name: struct_buffer_load_i16_zext__sgpr_rsrc__vgpr_vindex__vgpr_voffset__sgpr_soffset ; GFX8: bb.1 (%ir-block.0): @@ -580,6 +621,47 @@ 
define amdgpu_ps float @struct_buffer_load_i16_sext__sgpr_rsrc__vgpr_vindex__vgp ret float %cast } +define amdgpu_ps float @struct_buffer_load_i16_sext_wrong_width(<4 x i32> inreg %rsrc, i32 %vindex, i32 %voffset, i32 inreg %soffset) { + ; GFX8-LABEL: name: struct_buffer_load_i16_sext_wrong_width + ; GFX8: bb.1 (%ir-block.0): + ; GFX8-NEXT: liveins: $sgpr2, $sgpr3, $sgpr4, $sgpr5, $sgpr6, $vgpr0, $vgpr1 + ; GFX8-NEXT: {{ $}} + ; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr2 + ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr3 + ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY $sgpr4 + ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY $sgpr5 + ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1, [[COPY2]], %subreg.sub2, [[COPY3]], %subreg.sub3 + ; G
[Lldb-commits] [clang-tools-extra] [libc] [mlir] [lld] [libcxx] [libclc] [llvm] [clang] [flang] [libunwind] [lldb] [compiler-rt] [AMDGPU] Fix broken sign-extended subword buffer load combine (PR #7747
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/77470
[Lldb-commits] [clang] [compiler-rt] [libc] [libclc] [libcxxabi] [lld] [lldb] [llvm] [mlir] Add clarifying parenthesis around non-trivial conditions in ternary expressions. (PR #90391)
jayfoad wrote: AMDGPU changes are fine. https://github.com/llvm/llvm-project/pull/90391
[Lldb-commits] [libc] [compiler-rt] [flang] [lld] [llvm] [clang] [lldb] [clang-tools-extra] [libcxx] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)
jayfoad wrote: Can you add a GFX12 RUN line to clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl? That will probably require adding "fp8-conversion-insts" to the GFX12 part of TargetParser.cpp. You can do this in a separate patch if you want. https://github.com/llvm/llvm-project/pull/78414
[Lldb-commits] [lldb] [llvm] [mlir] [openmp] [libcxx] [flang] [clang] [clang-tools-extra] [compiler-rt] [lld] [libc] AMDGPU: Do not generate non-temporal hint when Load_Tr intrinsic did not specify it
@@ -1348,6 +1348,14 @@ bool SITargetLowering::getTgtMemIntrinsic(IntrinsicInfo &Info,
                  MachineMemOperand::MOVolatile;
     return true;
   }
+  case Intrinsic::amdgcn_global_load_tr: {

jayfoad wrote:

This case should also be handled in getAddrModeArguments below.

https://github.com/llvm/llvm-project/pull/79104
[Lldb-commits] [libcxx] [lld] [libc] [lldb] [clang-tools-extra] [clang] [openmp] [compiler-rt] [llvm] [flang] [mlir] AMDGPU: Do not generate non-temporal hint when Load_Tr intrinsic did not specify it
https://github.com/jayfoad approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/79104
[Lldb-commits] [lldb] 291a7fc - [LLDB] Fix build error after D142214
Author: Jay Foad
Date: 2023-01-23T12:28:06Z
New Revision: 291a7fcf70db4d45c24b559fc867d3499b2e1e04

URL: https://github.com/llvm/llvm-project/commit/291a7fcf70db4d45c24b559fc867d3499b2e1e04
DIFF: https://github.com/llvm/llvm-project/commit/291a7fcf70db4d45c24b559fc867d3499b2e1e04.diff

LOG: [LLDB] Fix build error after D142214

Added:

Modified:
    lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp

Removed:

diff --git a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
index 6ae5f2ad9a2c..ed3b3e6da02b 100644
--- a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+++ b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
@@ -1384,7 +1384,7 @@ bool DisassemblerLLVMC::MCDisasmInstance::IsLoad(llvm::MCInst &mc_inst) const {
 bool DisassemblerLLVMC::MCDisasmInstance::IsAuthenticated(
     llvm::MCInst &mc_inst) const {
-  auto InstrDesc = m_instr_info_up->get(mc_inst.getOpcode());
+  const auto &InstrDesc = m_instr_info_up->get(mc_inst.getOpcode());
   // Treat software auth traps (brk 0xc470 + aut key, where 0x70 == 'p', 0xc4
   // == 'a' + 'c') as authenticated instructions for reporting purposes, in
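The one-line change in this commit swaps a by-value `auto` for a `const auto &`. As a minimal sketch of why that can matter, assume (this is not confirmed by the log message) that the build error came from the descriptor type no longer being copyable. The names `Desc`, `DescTable` and `opcodeAt` below are hypothetical stand-ins, not LLVM or LLDB APIs.

```cpp
#include <cassert>

// Hypothetical stand-in for a descriptor type that must not be copied.
struct Desc {
  unsigned Opcode;
  Desc(unsigned Op) : Opcode(Op) {}
  Desc(const Desc &) = delete;
  Desc &operator=(const Desc &) = delete;
};

// Hypothetical table that hands out references to its entries.
class DescTable {
  Desc Entries[3] = {{10u}, {11u}, {12u}};

public:
  const Desc &get(unsigned Index) const { return Entries[Index]; }
};

unsigned opcodeAt(const DescTable &Table, unsigned Index) {
  // auto D = Table.get(Index);     // would not compile: copy constructor is deleted
  const auto &D = Table.get(Index); // binds a reference, no copy is made
  return D.Opcode;
}
```

Plain `auto` deduces a value type and therefore asks for a copy, whereas `const auto &` only binds to the object the getter already returns by reference.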
[Lldb-commits] [lldb] [AMDGPU] Add another SIFoldOperands instance after shrink (PR #67878)
jayfoad wrote: I've just tested this on 1 graphics shaders and it seems to make no difference at all. I tried gfx900 and gfx1100. Can anyone else from the graphics team confirm this? https://github.com/llvm/llvm-project/pull/67878
[Lldb-commits] [lldb] [AMDGPU] Add another SIFoldOperands instance after shrink (PR #67878)
jayfoad wrote:

I've taken another look at this. The patch does not show any benefit from running another `SIFoldOperands` pass _after_ `SIShrinkInstructions` per se; you get exactly the same results (modulo a couple of add instructions that have their operands commuted differently) if you put the second `SIFoldOperands` run _before_ `SIShrinkInstructions` instead.

In other words `SIFoldOperands` is not idempotent, and the reason for that seems to be:

> And the reason it only happens for some SUBREV instructions is even more
> convoluted. It's because SIFoldOperands will sometimes shrink
> V_SUB_CO_U32_e64 to V_SUBREV_CO_U32_e32 even if it does not manage to fold
> anything into it. This does seem wrong and is probably worth a closer look.

This goes back to https://reviews.llvm.org/D51345. Notice how the code that was added to `updateOperand` does the shrinking but does not actually do any folding; it returns before we get to `Old.ChangeToImmediate`/`Old.substVirtReg`. A second run of `SIFoldOperands` will see the shrunk instruction and fold into it.

https://github.com/llvm/llvm-project/pull/67878
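The "shrink, then return before folding" behaviour described above can be illustrated with a toy model. This is only a sketch of the control-flow shape, not the actual `SIFoldOperands` code; `ToyInst` and `runToyFold` are invented names.

```cpp
#include <cassert>

// Invented toy instruction with two independent rewrites pending.
struct ToyInst {
  bool Shrunk = false;
  bool Folded = false;
};

// One run of the toy pass over a single instruction.
// Returns true if the run changed the instruction.
bool runToyFold(ToyInst &I) {
  if (!I.Shrunk) {
    I.Shrunk = true;
    return true; // early return: the fold below is skipped on this run
  }
  if (!I.Folded) {
    I.Folded = true;
    return true;
  }
  return false; // nothing left to do
}
```

In this model the first run only shrinks, the second run folds into the already shrunk instruction, and only the third run is a no-op. A pass with this shape is not idempotent, which is why a second instance can appear to help regardless of where it is scheduled.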
[Lldb-commits] [lldb] [AMDGPU] Fix image intrinsic optimizer on loads from different resources (PR #69355)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/69355
[Lldb-commits] [compiler-rt] [flang] [libc] [libcxx] [llvm] [lldb] [clang-tools-extra] [clang] [AMDGPU] Fix nondeterminism in SIFixSGPRCopies (PR #70644)
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/70644 >From bfc7b2041f5a05105808b0b1ee0427d9c9eb9f4b Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Mon, 30 Oct 2023 15:23:48 + Subject: [PATCH 1/4] Precommit test --- .../AMDGPU/fix-sgpr-copies-nondeterminism.ll | 52 +++ 1 file changed, 52 insertions(+) create mode 100644 llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-nondeterminism.ll diff --git a/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-nondeterminism.ll b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-nondeterminism.ll new file mode 100644 index 000..8b7e691dbddeae5 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-nondeterminism.ll @@ -0,0 +1,52 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3 +; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 < %s | FileCheck %s + +define amdgpu_gs void @f(i32 inreg %arg, i32 %arg1, i32 %arg2) { +; CHECK-LABEL: f: +; CHECK: ; %bb.0: ; %bb +; CHECK-NEXT:s_cmp_eq_u32 s0, 0 +; CHECK-NEXT:s_mov_b32 s0, 0 +; CHECK-NEXT:s_cbranch_scc1 .LBB0_2 +; CHECK-NEXT: ; %bb.1: ; %bb3 +; CHECK-NEXT:v_mov_b32_e32 v4, v1 +; CHECK-NEXT:s_branch .LBB0_3 +; CHECK-NEXT: .LBB0_2: +; CHECK-NEXT:v_mov_b32_e32 v0, 1 +; CHECK-NEXT:v_mov_b32_e32 v4, 0 +; CHECK-NEXT: .LBB0_3: ; %bb4 +; CHECK-NEXT:v_mov_b32_e32 v1, 0 +; CHECK-NEXT:s_mov_b32 s1, s0 +; CHECK-NEXT:s_mov_b32 s2, s0 +; CHECK-NEXT:s_mov_b32 s3, s0 +; CHECK-NEXT:s_delay_alu instid0(VALU_DEP_1) +; CHECK-NEXT:v_mov_b32_e32 v2, v1 +; CHECK-NEXT:v_mov_b32_e32 v3, v1 +; CHECK-NEXT:v_mov_b32_e32 v5, v1 +; CHECK-NEXT:v_mov_b32_e32 v6, v1 +; CHECK-NEXT:v_mov_b32_e32 v7, v1 +; CHECK-NEXT:s_clause 0x1 +; CHECK-NEXT:buffer_store_b128 v[0:3], v1, s[0:3], 0 idxen +; CHECK-NEXT:buffer_store_b128 v[4:7], v1, s[0:3], 0 idxen +; CHECK-NEXT:s_nop 0 +; CHECK-NEXT:s_sendmsg sendmsg(MSG_DEALLOC_VGPRS) +; CHECK-NEXT:s_endpgm +bb: + %i = icmp eq i32 %arg, 0 + br i1 %i, label %bb4, label %bb3 + +bb3: + br label %bb4 + +bb4: + %i5 = phi i32 [ %arg1, 
%bb3 ], [ 1, %bb ] + %i6 = phi i32 [ %arg2, %bb3 ], [ 0, %bb ] + %i7 = insertelement <4 x i32> zeroinitializer, i32 %i5, i64 0 + %i8 = bitcast <4 x i32> %i7 to <4 x float> + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %i8, <4 x i32> zeroinitializer, i32 0, i32 0, i32 0, i32 0) + %i9 = insertelement <4 x i32> zeroinitializer, i32 %i6, i64 0 + %i10 = bitcast <4 x i32> %i9 to <4 x float> + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %i10, <4 x i32> zeroinitializer, i32 0, i32 0, i32 0, i32 0) + ret void +} + +declare void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32 immarg) >From aa050e8d720150b97d7af18d97d1d7f5d010bedc Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Mon, 30 Oct 2023 10:40:22 + Subject: [PATCH 2/4] [AMDGPU] Fix nondeterminism in SIFixSGPRCopies There are a couple of loops that iterate over V2SCopies. The iteration order needs to be deterministic, otherwise we can call moveToVALU in different orders, which causes temporary vregs to be allocated in different orders, which can affect register allocation heuristics. 
--- llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp| 8 +++ .../AMDGPU/fix-sgpr-copies-nondeterminism.ll | 22 +-- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp b/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp index b32ed9fef5dd34e..3e6ed2d793ae563 100644 --- a/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp +++ b/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp @@ -125,7 +125,7 @@ class SIFixSGPRCopies : public MachineFunctionPass { SmallVector PHINodes; SmallVector S2VCopies; unsigned NextVGPRToSGPRCopyID; - DenseMap V2SCopies; + MapVector V2SCopies; DenseMap> SiblingPenalty; public: @@ -988,7 +988,7 @@ bool SIFixSGPRCopies::needToBeConvertedToVALU(V2SCopyInfo *Info) { for (auto J : Info->Siblings) { auto InfoIt = V2SCopies.find(J); if (InfoIt != V2SCopies.end()) { - MachineInstr *SiblingCopy = InfoIt->getSecond().Copy; + MachineInstr *SiblingCopy = InfoIt->second.Copy; if (SiblingCopy->isImplicitDef()) // the COPY has already been MoveToVALUed continue; @@ -1023,12 +1023,12 @@ void SIFixSGPRCopies::lowerVGPR2SGPRCopies(MachineFunction &MF) { unsigned CurID = LoweringWorklist.pop_back_val(); auto CurInfoIt = V2SCopies.find(CurID); if (CurInfoIt != V2SCopies.end()) { - V2SCopyInfo C = CurInfoIt->getSecond(); + V2SCopyInfo C = CurInfoIt->second; LLVM_DEBUG(dbgs() << "Processing ...\n"; C.dump()); for (auto S : C.Siblings) { auto SibInfoIt = V2SCopies.find(S); if (SibInfoIt != V2SCopies.end()) { - V2SCopyInfo &SI = SibInfoIt->getSecond(); + V2SCopyInfo &SI = SibInfoIt->second; L
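The patch above replaces a hash map with `MapVector` so that iteration over `V2SCopies` follows insertion order. The idea can be sketched in standard C++ as follows. `OrderedMap` and `keysInOrder` are hypothetical names for illustration and are not LLVM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: a vector keeps the entries in the order
// they were first inserted, and a hash map indexes into that vector.
template <typename K, typename V> class OrderedMap {
  using Entry = std::pair<K, V>;
  std::vector<Entry> Entries;
  std::unordered_map<K, std::size_t> Index;

public:
  using const_iterator = typename std::vector<Entry>::const_iterator;

  // Returns the value for Key, inserting a default value on first use.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
};

// Collects the keys in iteration order.
std::vector<unsigned> keysInOrder(const OrderedMap<unsigned, int> &M) {
  std::vector<unsigned> Keys;
  for (const auto &KV : M)
    Keys.push_back(KV.first);
  return Keys;
}
```

Lookup still goes through the hash map, but anything that loops over the container sees the same sequence on every run, which is the property the commit message asks for.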
[Lldb-commits] [llvm] [libc] [libcxx] [lldb] [flang] [compiler-rt] [clang-tools-extra] [clang] [AMDGPU] Fix nondeterminism in SIFixSGPRCopies (PR #70644)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/70644
[Lldb-commits] [llvm] [clang] [flang] [clang-tools-extra] [openmp] [mlir] [libcxx] [lldb] [libc] GlobalISel: Guide return in llvm::getIConstantSplatVal (PR #71989)
jayfoad wrote: Typo in subject "**Guard** return ..."? https://github.com/llvm/llvm-project/pull/71989
[Lldb-commits] [clang] [flang] [lldb] [llvm] [mlir] Fix typo "instrinsic" (PR #112899)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/112899
[Lldb-commits] [clang] [flang] [lldb] [llvm] [mlir] Fix typo "instrinsic" (PR #112899)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/112899 None >From 3a3b67f30cde766adaede4cc53bec340fbe5d99f Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Fri, 18 Oct 2024 13:53:51 +0100 Subject: [PATCH] Fix typo "instrinsic" --- clang/utils/TableGen/RISCVVEmitter.cpp | 4 ++-- flang/docs/OptionComparison.md | 2 +- flang/include/flang/Runtime/magic-numbers.h | 2 +- flang/lib/Evaluate/intrinsics.cpp| 2 +- flang/lib/Optimizer/Builder/Runtime/Numeric.cpp | 6 +++--- flang/lib/Optimizer/Builder/Runtime/Reduction.cpp| 2 +- lldb/CMakeLists.txt | 2 +- llvm/include/llvm/IR/IntrinsicsAMDGPU.td | 2 +- llvm/include/llvm/Transforms/Utils/SSAUpdater.h | 2 +- llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp | 2 +- llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td | 2 +- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 2 +- llvm/test/Bitcode/upgrade-aarch64-sve-intrinsics.ll | 2 +- llvm/test/CodeGen/SystemZ/vec-reduce-add-01.ll | 2 +- llvm/test/Transforms/JumpThreading/thread-debug-info.ll | 2 +- llvm/test/Transforms/SROA/fake-use-sroa.ll | 2 +- llvm/unittests/FuzzMutate/RandomIRBuilderTest.cpp| 2 +- mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp | 2 +- mlir/lib/Target/LLVMIR/ModuleImport.cpp | 2 +- 19 files changed, 22 insertions(+), 22 deletions(-) diff --git a/clang/utils/TableGen/RISCVVEmitter.cpp b/clang/utils/TableGen/RISCVVEmitter.cpp index 50f161fd38ce69..aecca0f5df8d93 100644 --- a/clang/utils/TableGen/RISCVVEmitter.cpp +++ b/clang/utils/TableGen/RISCVVEmitter.cpp @@ -169,7 +169,7 @@ static VectorTypeModifier getTupleVTM(unsigned NF) { static unsigned getIndexedLoadStorePtrIdx(const RVVIntrinsic *RVVI) { // We need a special rule for segment load/store since the data width is not - // encoded in the instrinsic name itself. + // encoded in the intrinsic name itself. 
const StringRef IRName = RVVI->getIRName(); constexpr unsigned RVV_VTA = 0x1; constexpr unsigned RVV_VMA = 0x2; @@ -192,7 +192,7 @@ static unsigned getIndexedLoadStorePtrIdx(const RVVIntrinsic *RVVI) { static unsigned getSegInstLog2SEW(StringRef InstName) { // clang-format off // We need a special rule for indexed segment load/store since the data width - // is not encoded in the instrinsic name itself. + // is not encoded in the intrinsic name itself. if (InstName.starts_with("vloxseg") || InstName.starts_with("vluxseg") || InstName.starts_with("vsoxseg") || InstName.starts_with("vsuxseg")) return (unsigned)-1; diff --git a/flang/docs/OptionComparison.md b/flang/docs/OptionComparison.md index 9d6916ef62af2e..fb65498fa1f444 100644 --- a/flang/docs/OptionComparison.md +++ b/flang/docs/OptionComparison.md @@ -53,7 +53,7 @@ eN fdec, -fall-instrinsics +fall-intrinsics https://www-01.ibm.com/support/docview.wss?uid=swg27024803&aid=1#page=297";>qxlf77, diff --git a/flang/include/flang/Runtime/magic-numbers.h b/flang/include/flang/Runtime/magic-numbers.h index bab0e9ae05299a..1d3c5dca0b4bfb 100644 --- a/flang/include/flang/Runtime/magic-numbers.h +++ b/flang/include/flang/Runtime/magic-numbers.h @@ -107,7 +107,7 @@ The denorm value is a nonstandard extension. #if 0 ieee_round_type values -The values are those of the llvm.get.rounding instrinsic, which is assumed by +The values are those of the llvm.get.rounding intrinsic, which is assumed by ieee_arithmetic module rounding procedures. #endif #define _FORTRAN_RUNTIME_IEEE_TO_ZERO 0 diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index 4271faa0db12bf..aa44967817722e 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -1690,7 +1690,7 @@ std::optional IntrinsicInterface::Match( // MAX and MIN (and others that map to them) allow their last argument to // be repeated indefinitely. 
The actualForDummy vector is sized // and null-initialized to the non-repeated dummy argument count - // for other instrinsics. + // for other intrinsics. bool isMaxMin{dummyArgPatterns > 0 && dummy[dummyArgPatterns - 1].optionality == Optionality::repeats}; std::vector actualForDummy( diff --git a/flang/lib/Optimizer/Builder/Runtime/Numeric.cpp b/flang/lib/Optimizer/Builder/Runtime/Numeric.cpp index c13064a284d127..d0092add0118f1 100644 --- a/flang/lib/Optimizer/Builder/Runtime/Numeric.cpp +++ b/flang/lib/Optimizer/Builder/Runtime/Numeric.cpp @@ -284,7 +284,7 @@ struct ForcedSpacing16 { } }; -/// Generate call to Exponent instrinsic runtime routine. +/// Generate call to Exponent intrinsic runtime routine. mlir::Value fir::runtime::genExponent(fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type resultType,
[Lldb-commits] [clang] [lldb] [AMDGPU] Specify width and align for all AMDGPU builtin types. NFC. (PR #109656)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/109656 This will be used in ASTContext::getTypeInfo which needs this information for all builtin types, not just pointers. >From 0ef4ea17a711a1ee95080bc1635ae9aa824df596 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Tue, 17 Sep 2024 15:06:41 +0100 Subject: [PATCH] [AMDGPU] Specify width and align for all AMDGPU builtin types. NFC. This will be used in ASTContext::getTypeInfo which needs this information for all builtin types, not just pointers. --- clang/include/clang/AST/ASTContext.h | 3 ++- clang/include/clang/AST/Type.h | 2 +- clang/include/clang/AST/TypeProperties.td| 2 +- clang/include/clang/Basic/AMDGPUTypes.def| 6 +++--- clang/include/clang/Serialization/ASTBitCodes.h | 2 +- clang/lib/AST/ASTContext.cpp | 8 clang/lib/AST/ASTImporter.cpp| 2 +- clang/lib/AST/ExprConstant.cpp | 2 +- clang/lib/AST/ItaniumMangle.cpp | 2 +- clang/lib/AST/MicrosoftMangle.cpp| 2 +- clang/lib/AST/NSAPI.cpp | 2 +- clang/lib/AST/PrintfFormatString.cpp | 2 +- clang/lib/AST/Type.cpp | 4 ++-- clang/lib/AST/TypeLoc.cpp| 2 +- clang/lib/CodeGen/CGDebugInfo.cpp| 2 +- clang/lib/CodeGen/CGDebugInfo.h | 3 ++- clang/lib/CodeGen/CodeGenTypes.cpp | 2 +- clang/lib/CodeGen/ItaniumCXXABI.cpp | 2 +- clang/lib/Index/USRGeneration.cpp| 2 +- clang/lib/Sema/Sema.cpp | 2 +- clang/lib/Sema/SemaExpr.cpp | 4 ++-- clang/lib/Serialization/ASTCommon.cpp| 2 +- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/tools/libclang/CIndex.cpp | 2 +- lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp | 3 ++- 25 files changed, 35 insertions(+), 32 deletions(-) diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index 1984310df0442e..f46e12a57d4c38 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -1197,7 +1197,8 @@ class ASTContext : public RefCountedBase { #include "clang/Basic/RISCVVTypes.def" #define WASM_TYPE(Name, Id, SingletonId) CanQualType SingletonId; 
#include "clang/Basic/WebAssemblyReferenceTypes.def" -#define AMDGPU_TYPE(Name, Id, SingletonId) CanQualType SingletonId; +#define AMDGPU_TYPE(Name, Id, SingletonId, Width, Align) \ + CanQualType SingletonId; #include "clang/Basic/AMDGPUTypes.def" #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) CanQualType SingletonId; #include "clang/Basic/HLSLIntangibleTypes.def" diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index dc87b84153e74a..cee4e68fe0dc6d 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -3050,7 +3050,7 @@ class BuiltinType : public Type { #define WASM_TYPE(Name, Id, SingletonId) Id, #include "clang/Basic/WebAssemblyReferenceTypes.def" // AMDGPU types -#define AMDGPU_TYPE(Name, Id, SingletonId) Id, +#define AMDGPU_TYPE(Name, Id, SingletonId, Width, Align) Id, #include "clang/Basic/AMDGPUTypes.def" // HLSL intangible Types #define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) Id, diff --git a/clang/include/clang/AST/TypeProperties.td b/clang/include/clang/AST/TypeProperties.td index bb7bfa8cd0b76e..d05072607e949c 100644 --- a/clang/include/clang/AST/TypeProperties.td +++ b/clang/include/clang/AST/TypeProperties.td @@ -893,7 +893,7 @@ let Class = BuiltinType in { case BuiltinType::ID: return ctx.SINGLETON_ID; #include "clang/Basic/WebAssemblyReferenceTypes.def" -#define AMDGPU_TYPE(NAME, ID, SINGLETON_ID) \ +#define AMDGPU_TYPE(NAME, ID, SINGLETON_ID, WIDTH, ALIGN) \ case BuiltinType::ID: return ctx.SINGLETON_ID; #include "clang/Basic/AMDGPUTypes.def" diff --git a/clang/include/clang/Basic/AMDGPUTypes.def b/clang/include/clang/Basic/AMDGPUTypes.def index 7454d61f5dd516..e47e544fdc82c1 100644 --- a/clang/include/clang/Basic/AMDGPUTypes.def +++ b/clang/include/clang/Basic/AMDGPUTypes.def @@ -11,11 +11,11 @@ //===--===// #ifndef AMDGPU_OPAQUE_PTR_TYPE -#define AMDGPU_OPAQUE_PTR_TYPE(Name, AS, Width, Align, Id, SingletonId) \ - AMDGPU_TYPE(Name, Id, SingletonId) +#define 
AMDGPU_OPAQUE_PTR_TYPE(Name, Id, SingletonId, Width, Align, AS) \ + AMDGPU_TYPE(Name, Id, SingletonId, Width, Align) #endif -AMDGPU_OPAQUE_PTR_TYPE("__amdgpu_buffer_rsrc_t", 8, 128, 128, AMDGPUBufferRsrc, AMDGPUBufferRsrcTy) +AMDGPU_OPAQUE_PTR_TYPE("__amdgpu_buffer
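The patch above widens the `AMDGPU_TYPE` macro so that every consumer receives `Width` and `Align`, even those that ignore them. A minimal sketch of that X-macro pattern follows. It uses a list macro instead of a `.def` file to stay self-contained, and `DEMO_TYPES`, `DemoKind`, `DemoTypeInfo` and `getDemoTypeInfo` are all invented for illustration.

```cpp
#include <cassert>

// Hypothetical type list standing in for a .def file. Each entry carries
// (Name, Id, Width, Align); the entries are invented for this example.
#define DEMO_TYPES(X)                                                          \
  X("demo_rsrc_t", DemoRsrc, 128, 128)                                         \
  X("demo_handle_t", DemoHandle, 64, 32)

// Consumer 1: only needs the Id column and ignores Width and Align.
enum class DemoKind {
#define DEMO_ENUM(Name, Id, Width, Align) Id,
  DEMO_TYPES(DEMO_ENUM)
#undef DEMO_ENUM
};

struct DemoTypeInfo {
  unsigned Width;
  unsigned Align;
};

// Consumer 2: uses the Width and Align columns.
DemoTypeInfo getDemoTypeInfo(DemoKind K) {
  switch (K) {
#define DEMO_CASE(Name, Id, Width, Align)                                      \
  case DemoKind::Id:                                                           \
    return {Width, Align};
    DEMO_TYPES(DEMO_CASE)
#undef DEMO_CASE
  }
  return {0, 0};
}
```

Adding columns to the list forces every consumer macro to accept the new parameters, which is why a change like this touches many files while staying NFC.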
[Lldb-commits] [clang] [lldb] [AMDGPU] Specify width and align for all AMDGPU builtin types. NFC. (PR #109656)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/109656
[Lldb-commits] [clang] [clang-tools-extra] [flang] [libc] [lldb] [llvm] [mlir] [llvm-project] Fix typos mutli and mutliple. NFC. (PR #122880)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/122880
[Lldb-commits] [clang] [clang-tools-extra] [flang] [libc] [lldb] [llvm] [mlir] [llvm-project] Fix typos mutli and mutliple. NFC. (PR #122880)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/122880 None >From d9a92edae5d021eed39acbdb22fa195dff78315d Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Tue, 14 Jan 2025 10:00:41 + Subject: [PATCH] [llvm-project] Fix typos mutli and mutliple. NFC. --- .../clang-tidy/modernize/UseAutoCheck.cpp| 4 ++-- clang/lib/Basic/SourceManager.cpp| 2 +- flang/test/HLFIR/associate-codegen.fir | 2 +- libc/test/src/unistd/getopt_test.cpp | 2 +- lldb/source/Commands/CommandObjectMemory.cpp | 2 +- lldb/source/Target/StructuredDataPlugin.cpp | 2 +- lldb/unittests/Target/RegisterFlagsTest.cpp | 2 +- llvm/include/llvm/IR/DebugInfoMetadata.h | 2 +- llvm/lib/Target/X86/X86LowerAMXType.cpp | 2 +- llvm/test/CodeGen/AArch64/eon.ll | 2 +- llvm/test/DebugInfo/X86/multiple-at-const-val.ll | 2 +- llvm/test/Transforms/EarlyCSE/guards.ll | 2 +- .../InstCombine/matrix-multiplication-negation.ll| 12 ++-- .../RISCV/blend-any-of-reduction-cost.ll | 4 ++-- .../one-shot-bufferize-empty-tensor-elimination.mlir | 12 ++-- mlir/test/Transforms/mem2reg.mlir| 2 +- mlir/test/Transforms/sroa.mlir | 2 +- 17 files changed, 29 insertions(+), 29 deletions(-) diff --git a/clang-tools-extra/clang-tidy/modernize/UseAutoCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseAutoCheck.cpp index aec67808846b12..7a2d804e173ce4 100644 --- a/clang-tools-extra/clang-tidy/modernize/UseAutoCheck.cpp +++ b/clang-tools-extra/clang-tidy/modernize/UseAutoCheck.cpp @@ -342,7 +342,7 @@ static void ignoreTypeLocClasses( Loc = Loc.getNextTypeLoc(); } -static bool isMutliLevelPointerToTypeLocClasses( +static bool isMultiLevelPointerToTypeLocClasses( TypeLoc Loc, std::initializer_list const &LocClasses) { ignoreTypeLocClasses(Loc, {TypeLoc::Paren, TypeLoc::Qualified}); @@ -424,7 +424,7 @@ void UseAutoCheck::replaceExpr( auto Diag = diag(Range.getBegin(), Message); - bool ShouldReplenishVariableName = isMutliLevelPointerToTypeLocClasses( + bool ShouldReplenishVariableName = isMultiLevelPointerToTypeLocClasses( 
TSI->getTypeLoc(), {TypeLoc::FunctionProto, TypeLoc::ConstantArray}); // Space after 'auto' to handle cases where the '*' in the pointer type is diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp index 44e982d3ee67fb..b1f2180c1d4627 100644 --- a/clang/lib/Basic/SourceManager.cpp +++ b/clang/lib/Basic/SourceManager.cpp @@ -1222,7 +1222,7 @@ unsigned SourceManager::getPresumedColumnNumber(SourceLocation Loc, return PLoc.getColumn(); } -// Check if mutli-byte word x has bytes between m and n, included. This may also +// Check if multi-byte word x has bytes between m and n, included. This may also // catch bytes equal to n + 1. // The returned value holds a 0x80 at each byte position that holds a match. // see http://graphics.stanford.edu/~seander/bithacks.html#HasBetweenInWord diff --git a/flang/test/HLFIR/associate-codegen.fir b/flang/test/HLFIR/associate-codegen.fir index f5e015c4169f60..ad64959984a14a 100644 --- a/flang/test/HLFIR/associate-codegen.fir +++ b/flang/test/HLFIR/associate-codegen.fir @@ -372,7 +372,7 @@ func.func @_QPtest_multiple_expr_uses_inside_elemental() { // CHECK: return // CHECK: } -// Verify that we properly recognize mutliple consequent hlfir.associate using +// Verify that we properly recognize multiple consequent hlfir.associate using // the same result of hlfir.elemental. 
func.func @_QPtest_multitple_associates_for_same_expr() { %c1 = arith.constant 1 : index diff --git a/libc/test/src/unistd/getopt_test.cpp b/libc/test/src/unistd/getopt_test.cpp index e6e87720cde48d..8217f7bb6e7313 100644 --- a/libc/test/src/unistd/getopt_test.cpp +++ b/libc/test/src/unistd/getopt_test.cpp @@ -155,7 +155,7 @@ TEST_F(LlvmLibcGetoptTest, ParseArgInNext) { EXPECT_EQ(test_globals::optind, 3); } -TEST_F(LlvmLibcGetoptTest, ParseMutliInOne) { +TEST_F(LlvmLibcGetoptTest, ParseMultiInOne) { array argv{"prog"_c, "-abc"_c, nullptr}; EXPECT_EQ(LIBC_NAMESPACE::getopt(2, argv.data(), "abc"), (int)'a'); diff --git a/lldb/source/Commands/CommandObjectMemory.cpp b/lldb/source/Commands/CommandObjectMemory.cpp index b5612f21f11563..164c61d1720171 100644 --- a/lldb/source/Commands/CommandObjectMemory.cpp +++ b/lldb/source/Commands/CommandObjectMemory.cpp @@ -1737,7 +1737,7 @@ class CommandObjectMemoryRegion : public CommandObjectParsed { // It is important that we track the address used to request the region as // this will give the correct section name in the case that regions overlap. -// On Windows we get mutliple regions that start at the same place but are +// On Windows we get multiple regions that start at the same place but are
[Lldb-commits] [clang] [lldb] [llvm] [mlir] [NFC][Support] Add llvm::uninitialized_copy (PR #138174)
@@ -2981,7 +2981,7 @@ ScalarEvolution::getOrCreateAddExpr(ArrayRef Ops,
 static_cast(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
 if (!S) {
 const SCEV **O = SCEVAllocator.Allocate(Ops.size());
-std::uninitialized_copy(Ops.begin(), Ops.end(), O);
+llvm::uninitialized_copy(Ops, O);

jayfoad wrote:

Do you need the `llvm::`? Most .cpp files in LLVM have a `using namespace llvm;`.

https://github.com/llvm/llvm-project/pull/138174
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
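For readers following the hunk above: the change swaps an iterator-pair call for a range-based one. The definition of the new helper is not shown in this message, so the following is only a minimal sketch of what such a wrapper might look like, assuming it simply forwards to `std::uninitialized_copy`. The namespace `sketch` and the names `range_uninitialized_copy` and `copy_three_strings` are hypothetical and are not LLVM's.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

namespace sketch { // hypothetical namespace, not part of LLVM

// Range-based wrapper (assumed shape): copy-constructs every element of Src
// into the uninitialized storage starting at Dst and returns the end of the
// constructed range.
template <typename Range, typename OutputIt>
OutputIt range_uninitialized_copy(Range &&Src, OutputIt Dst) {
  using std::begin;
  using std::end;
  return std::uninitialized_copy(begin(Src), end(Src), Dst);
}

// Small usage helper: copies three strings into raw allocator storage and
// returns how many objects were constructed there.
inline std::size_t copy_three_strings() {
  std::vector<std::string> Ops = {"a", "b", "c"};
  std::allocator<std::string> Alloc;
  std::string *O = Alloc.allocate(Ops.size()); // raw, uninitialized memory
  std::string *End = range_uninitialized_copy(Ops, O);
  std::size_t Count = static_cast<std::size_t>(End - O);
  std::destroy(O, End); // C++17: run destructors before releasing storage
  Alloc.deallocate(O, Ops.size());
  return Count;
}

} // namespace sketch
```

In this sketch, a file that already has `using namespace sketch;` could call `range_uninitialized_copy(Ops, O)` without the qualifier, which is the situation the review comment describes for `using namespace llvm;`.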
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] Fix typos: paramter, parametr, parametere (PR #134092)
jayfoad wrote:

> Please try not to create big PRs that are crossing subproject boundaries,
> you'll have a hard time finding a reviewer willing to sign off on a patch that
> touches other parts of the project. 10 small/trivial PRs are less work for
> the reviewers than one big one.

Fair enough.

https://github.com/llvm/llvm-project/pull/134092
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] Fix typos: paramter, parametr, parametere (PR #134092)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/134092

None

>From 10ab02a61b5e2ec3f58a12c78e7b3b989a349a8f Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Wed, 2 Apr 2025 15:35:32 +0100
Subject: [PATCH] Fix typos: paramter, parametr, parametere

---
 bolt/utils/bughunter.sh | 2 +-
 clang-tools-extra/clangd/SemanticHighlighting.cpp | 2 +-
 clang/docs/ReleaseNotes.rst | 4 ++--
 clang/include/clang/Basic/AttrDocs.td | 2 +-
 clang/include/clang/Basic/DiagnosticASTKinds.td | 2 +-
 clang/include/clang/Basic/arm_cde.td | 2 +-
 clang/lib/AST/ByteCode/Compiler.cpp | 3 ++-
 clang/lib/Analysis/UnsafeBufferUsage.cpp | 2 +-
 clang/lib/CodeGen/CodeGenFunction.h | 2 +-
 clang/lib/CodeGen/TargetBuiltins/ARM.cpp | 2 +-
 .../Checkers/RetainCountChecker/RetainCountDiagnostics.cpp | 2 +-
 .../StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | 2 +-
 .../StaticAnalyzer/Checkers/WebKit/PtrTypesSemantics.cpp | 2 +-
 clang/test/Modules/odr_hash.cpp | 6 +++---
 clang/unittests/Format/FormatTest.cpp | 4 ++--
 flang/lib/Lower/OpenMP/PrivateReductionUtils.cpp | 2 +-
 flang/lib/Optimizer/CodeGen/TypeConverter.cpp | 2 +-
 lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp | 2 +-
 llvm/lib/Target/PowerPC/PPCMachineFunctionInfo.cpp | 6 +++---
 llvm/lib/Target/PowerPC/PPCMachineFunctionInfo.h | 4 ++--
 llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp | 2 +-
 llvm/test/CodeGen/AArch64/sched-movprfx.ll | 2 +-
 .../sve2-intrinsics-fp-int-binary-logarithm-zeroing.ll | 2 +-
 llvm/test/MC/WebAssembly/block-assembly.s | 2 +-
 llvm/unittests/DebugInfo/DWARF/DWARFFormValueTest.cpp | 2 +-
 mlir/include/mlir/Analysis/Presburger/PWMAFunction.h | 2 +-
 mlir/include/mlir/Dialect/Transform/IR/TransformOps.td | 2 +-
 mlir/include/mlir/TableGen/Class.h | 4 ++--
 .../Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.cpp | 2 +-
 mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp | 2 +-
 30 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/bolt/utils/bughunter.sh b/bolt/utils/bughunter.sh
index c5dddc41fb41f..d5ce0592708e2 100755
--- a/bolt/utils/bughunter.sh
+++ b/bolt/utils/bughunter.sh
@@ -28,7 +28,7 @@
 #
 # TIMEOUT_OR_CMD- optional timeout or command on optimized binary command
 # if the value is a number with an optional trailing letter
-# [smhd] it is considered a paramter to "timeout",
+# [smhd] it is considered a parameter to "timeout",
 # otherwise it's a shell command that wraps the optimized
 # binary command.
 #
diff --git a/clang-tools-extra/clangd/SemanticHighlighting.cpp b/clang-tools-extra/clangd/SemanticHighlighting.cpp
index 86ca05644c703..1e9ca0ae7822d 100644
--- a/clang-tools-extra/clangd/SemanticHighlighting.cpp
+++ b/clang-tools-extra/clangd/SemanticHighlighting.cpp
@@ -876,7 +876,7 @@ class CollectExtraHighlightings
 if (auto *ProtoType = FD->getType()->getAs()) {
 // Iterate over the types of the function parameters.
- // If any of them are non-const reference paramteres, add it as a
+ // If any of them are non-const reference parameteres, add it as a
 // highlighting modifier to the corresponding expression
 for (size_t I = 0;
 I < std::min(size_t(ProtoType->getNumParams()), Args.size()); ++I) {
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 7978df0cc71cc..881bcd7415edd 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -381,12 +381,12 @@ Bug Fixes to C++ Support
 now also works if the constraint has non-type or template template parameters. (#GH131798)
 - Fixes matching of nested template template parameters. (#GH130362)
-- Correctly diagnoses template template paramters which have a pack parameter
+- Correctly diagnoses template template parameters which have a pack parameter
 not in the last position.
 - Clang now correctly parses ``if constexpr`` expressions in immediate function context. (#GH123524)
 - Fixed an assertion failure affecting code that uses C++23 "deducing this". (#GH130272)
 - Clang now properly instantiates destructors for initialized members within non-delegating constructors. (#GH93251)
-- Correctly diagnoses if unresolved using declarations shadows template paramters (#GH129411)
+- Correctly diagnoses if unresolved using declarations shadows template parameters (#GH129411)
 - Fixed C++20 aggregate initialization rules being incorrectly applied in certain contexts. (#GH131320)
 - Clang was previously coale
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] Fix typos: paramter, parametr, parametere (PR #134092)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/134092