[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Simplify spliceDebugInfo, fixing splice-to-end edge case (#105670) (PR #106690)
https://github.com/SLTozer approved this pull request.

This is a straightforward bugfix.

https://github.com/llvm/llvm-project/pull/106690
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
     bool SameCol = L1->getColumn() == L2->getColumn();
     unsigned Line = SameLine ? L1->getLine() : 0;
     unsigned Col = SameLine && SameCol ? L1->getColumn() : 0;
-
-    return DILocation::get(C, Line, Col, Scope, InlinedAt);
+    bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode();
+    uint64_t Group = 0;
+    uint64_t Rank = 0;
+    if (SameLine) {
+      if (L1->getAtomGroup() || L2->getAtomGroup()) {
+        // If we're preserving the same matching inlined-at field we can
+        // preserve the atom.
+        if (LocBIA == LocAIA && InlinedAt == LocBIA) {
+          // Deterministically keep the lowest non-zero ranking atom group
+          // number.
+          // FIXME: It would be nice if we could track that an instruction
+          // belongs to two source atoms.
+          bool UseL1Atom = [L1, L2]() {
+            if (L1->getAtomRank() == L2->getAtomRank()) {
+              // Arbitrarily choose the lowest non-zero group number.
+              if (!L1->getAtomGroup() || !L2->getAtomGroup())
+                return !L2->getAtomGroup();
+              return L1->getAtomGroup() < L2->getAtomGroup();
+            }
+            // Choose the lowest non-zero rank.
+            if (!L1->getAtomRank() || !L2->getAtomRank())
+              return !L2->getAtomRank();
+            return L1->getAtomRank() < L2->getAtomRank();
+          }();
+          Group = UseL1Atom ? L1->getAtomGroup() : L2->getAtomGroup();
+          Rank = UseL1Atom ? L1->getAtomRank() : L2->getAtomRank();
+        } else {
+          // If either instruction is part of a source atom, reassign it a new
+          // atom group. This essentially regresses to non-key-instructions
+          // behaviour (now that it's the only instruction in its group it'll
+          // probably get is_stmt applied).
+          Group = C.incNextAtomGroup();
+          Rank = 1;

SLTozer wrote:

Is this necessary? Since we use `inlinedAt` as part of the tuple alongside `atomGroup`, keeping the group the same would still result in the merged instruction becoming part of a distinct "group" (with `is_stmt` likely applying).

Likewise, since we're creating a new group it sounds to me like it would be unnecessary to change the rank?

https://github.com/llvm/llvm-project/pull/133480
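For readers following the tie-break rule in the quoted hunk, here is a minimal standalone sketch of it. The `Atom` struct and the `useFirstAtom`/`mergeAtoms` names are hypothetical stand-ins for the `DILocation` accessors; this is an illustration of the selection logic only, not code from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the (atomGroup, atomRank) pair on a DILocation.
// A value of 0 means "no atom group" / "no rank".
struct Atom {
  uint64_t Group;
  uint64_t Rank;
};

// Mirrors the lambda in the hunk: prefer the lowest non-zero rank; when the
// ranks are equal, prefer the lowest non-zero group number.
inline bool useFirstAtom(const Atom &A, const Atom &B) {
  if (A.Rank == B.Rank) {
    // If either group is zero, keep A only when B is the zero one.
    if (!A.Group || !B.Group)
      return !B.Group;
    return A.Group < B.Group;
  }
  // If either rank is zero, keep A only when B is the zero one.
  if (!A.Rank || !B.Rank)
    return !B.Rank;
  return A.Rank < B.Rank;
}

// Returns whichever atom the rule above selects.
inline Atom mergeAtoms(const Atom &A, const Atom &B) {
  return useFirstAtom(A, B) ? A : B;
}
```

Because the rule depends only on the two (group, rank) pairs, the choice is deterministic for a given pair of locations, which appears to be the property the code comment is after.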
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/133485
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
@@ -2145,6 +2145,13 @@ class DILocation : public MDNode {
     return 0;
   }
+  const DILocation *getOrCloneWithoutAtom() const {

SLTozer wrote:

I think this could just be "getWithoutAtom"; with DIMetadata, "get" already implies "find me an existing metadata node or create a new one".

https://github.com/llvm/llvm-project/pull/133485
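The naming point rests on "get" meaning get-or-create for uniqued nodes. A minimal standalone sketch of that convention follows; `Loc`, `LocContext`, and the map-based uniquing are hypothetical simplifications and are not how LLVM's metadata uniquing is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical uniqued node; stands in for a metadata node such as a location.
struct Loc {
  unsigned Line;
  unsigned Col;
  unsigned AtomGroup;
};

// Hypothetical context that owns and uniques nodes by their field values.
class LocContext {
  using Key = std::tuple<unsigned, unsigned, unsigned>;
  std::map<Key, std::unique_ptr<Loc>> Uniqued;

public:
  // "get" in the uniquing sense: return the existing node for these fields,
  // or create it if none exists yet.
  const Loc *get(unsigned Line, unsigned Col, unsigned AtomGroup) {
    std::unique_ptr<Loc> &Slot = Uniqued[Key{Line, Col, AtomGroup}];
    if (!Slot)
      Slot = std::make_unique<Loc>(Loc{Line, Col, AtomGroup});
    return Slot.get();
  }

  // The same location with the atom dropped. Whether this allocates is an
  // implementation detail of get(), so the caller need not know about cloning.
  const Loc *getWithoutAtom(const Loc *L) {
    if (L->AtomGroup == 0)
      return L;
    return get(L->Line, L->Col, 0);
  }

  std::size_t size() const { return Uniqued.size(); }
};
```

In this sketch a caller cannot tell whether `getWithoutAtom` created a node or found one, which is the sense in which an "OrClone" in the name would be redundant.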
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
https://github.com/SLTozer approved this pull request.

I think conceptually there is some space for copying atom group/rank to the inlined instructions, giving the instruction(s) that produce the return value (if any) the highest precedence. This would be a separate feature however, and this behaviour seems fine to me as a first pass; minor inline comment.

https://github.com/llvm/llvm-project/pull/133485
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
@@ -1182,6 +1187,19 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
         U.set(NewBonusInst);
       }
     }
+
+  // Key Instructions: We may have propagated atom info into the pred. If the
+  // pred's terminator already has atom info do nothing as merging would drop
+  // one atom group anyway. If it doesn't, propagte the remapped atom group

SLTozer wrote:

```suggestion
  // one atom group anyway. If it doesn't, propagate the remapped atom group
```

https://github.com/llvm/llvm-project/pull/133482
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)
@@ -3609,11 +3609,11 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
       N->setName(BBI->getName() + ".c");
       // Update operands due to translation.
-      for (Use &Op : N->operands()) {
-        DenseMap::iterator PI = TranslateMap.find(Op);
-        if (PI != TranslateMap.end())
-          Op = PI->second;
-      }
+      // Key Instructions: Remap all the atom groups.
+      if (const DebugLoc &DL = BBI->getDebugLoc())
+        mapAtomInstance(DL, TranslateMap);
+      RemapInstruction(N, TranslateMap,
+                       RF_IgnoreMissingLocals | RF_NoModuleLevelChanges);

SLTozer wrote:

If I understand right, `RemapInstruction` with these operands will never create a new value mapping, only ever return an existing mapping - is that correct?

https://github.com/llvm/llvm-project/pull/133484
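For comparison, the removed loop is a lookup-only remap: it rewrites an operand when the map already has an entry and never adds one. A minimal standalone sketch of that behaviour, using hypothetical string-named values and a `std::unordered_map` in place of LLVM's types; it illustrates the old loop only and does not settle how `RemapInstruction` behaves with those flags.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical value map: original value name -> translated value name.
using ValueMap = std::unordered_map<std::string, std::string>;

// Lookup-only remap: operands with an entry in the map are replaced, all
// others are left untouched. The map is taken by const reference, so no new
// mapping can be created here.
inline void remapOperandsLookupOnly(std::vector<std::string> &Operands,
                                    const ValueMap &Map) {
  for (std::string &Op : Operands) {
    auto It = Map.find(Op);
    if (It != Map.end())
      Op = It->second;
  }
}
```

Taking the map as `const` makes the "never creates a mapping" property checkable by the compiler in this sketch; the question in the review is whether the replacement call preserves the same property.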
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/133484
[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/133481
[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)
https://github.com/SLTozer approved this pull request.

LGTM, besides a couple of inline comments.

https://github.com/llvm/llvm-project/pull/133481
[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)
@@ -0,0 +1,53 @@
+; RUN: opt %s -passes=inline -S -o - | FileCheck %s
+
+;; Inline `f` into `g`. The inlined assignment store and add should retain
+;; their atom info.
+
+; CHECK: _Z1gi
+; CHECK-NOT: _Z1fi
+; CHECK: %add.i = add nsw i32 %mul.i, 1, !dbg [[G1R2:!.*]]
+; CHECK-NEXT: store i32 %add.i, ptr %x.i, align 4, !dbg [[G1R1:!.*]]
+
+; CHECK: [[G1R2]] = !DILocation({{.*}}, atomGroup: 1, atomRank: 2)
+; CHECK: [[G1R1]] = !DILocation({{.*}}, atomGroup: 1, atomRank: 1)
+
+define hidden void @_Z1fi(i32 noundef %a) !dbg !11 {
+entry:
+  %a.addr = alloca i32, align 4
+  %x = alloca i32, align 4
+  store i32 %a, ptr %a.addr, align 4
+  %0 = load i32, ptr %a.addr, align 4, !dbg !18
+  %mul = mul nsw i32 %0, 2, !dbg !18
+  %add = add nsw i32 %mul, 1, !dbg !19
+  store i32 %add, ptr %x, align 4, !dbg !20
+  ret void, !dbg !22
+}
+
+define hidden void @_Z1gi(i32 noundef %b) !dbg !23 {
+entry:
+  %b.addr = alloca i32, align 4
+  store i32 %b, ptr %b.addr, align 4
+  %0 = load i32, ptr %b.addr, align 4, !dbg !24
+  call void @_Z1fi(i32 noundef %0), !dbg !24
+  ret void, !dbg !25

SLTozer wrote:

Nit, could remove DILocations from the instructions that aren't relevant to the test.

https://github.com/llvm/llvm-project/pull/133481
[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)
@@ -1813,7 +1813,8 @@ static DebugLoc inlineDebugLoc(DebugLoc OrigDL, DILocation *InlinedAt,
                                DenseMap &IANodes) {
   auto IA = DebugLoc::appendInlinedAt(OrigDL, InlinedAt, Ctx, IANodes);
   return DILocation::get(Ctx, OrigDL.getLine(), OrigDL.getCol(),
-                         OrigDL.getScope(), IA);
+                         OrigDL.getScope(), IA, OrigDL.isImplicitCode(),

SLTozer wrote:

Similar to my comment on a different review, propagating `OrigDL`'s `IsImplicitCode` field is a change in behaviour.

https://github.com/llvm/llvm-project/pull/133481
[llvm-branch-commits] [llvm] [KeyInstr][debugify] Add --debugify-atoms to add key instructions metadata (PR #133483)
https://github.com/SLTozer approved this pull request.

https://github.com/llvm/llvm-project/pull/133483
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/SLTozer approved this pull request.

Some minor nits, but this update looks correct.

https://github.com/llvm/llvm-project/pull/133482
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/133482
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
@@ -1182,6 +1187,19 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
         U.set(NewBonusInst);
       }
     }
+
+  // Key Instructions: We may have propagated atom info into the pred. If the
+  // pred's terminator already has atom info do nothing as merging would drop
+  // one atom group anyway. If it doesn't, propagte the remapped atom group
+  // from BB's terminator.
+  if (auto &PredDL = PredBlock->getTerminator()->getDebugLoc()) {

SLTozer wrote:

```suggestion
  if (auto &PredDL = PTI->getDebugLoc()) {
```

If I understand it, `PTI` is still `PredBlock`'s terminator?

https://github.com/llvm/llvm-project/pull/133482
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
     bool SameCol = L1->getColumn() == L2->getColumn();
     unsigned Line = SameLine ? L1->getLine() : 0;
     unsigned Col = SameLine && SameCol ? L1->getColumn() : 0;
-
-    return DILocation::get(C, Line, Col, Scope, InlinedAt);
+    bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode();
+    uint64_t Group = 0;
+    uint64_t Rank = 0;
+    if (SameLine) {
+      if (L1->getAtomGroup() || L2->getAtomGroup()) {
+        // If we're preserving the same matching inlined-at field we can
+        // preserve the atom.
+        if (LocBIA == LocAIA && InlinedAt == LocBIA) {
+          // Deterministically keep the lowest non-zero ranking atom group
+          // number.
+          // FIXME: It would be nice if we could track that an instruction
+          // belongs to two source atoms.
+          bool UseL1Atom = [L1, L2]() {
+            if (L1->getAtomRank() == L2->getAtomRank()) {
+              // Arbitrarily choose the lowest non-zero group number.
+              if (!L1->getAtomGroup() || !L2->getAtomGroup())
+                return !L2->getAtomGroup();
+              return L1->getAtomGroup() < L2->getAtomGroup();
+            }
+            // Choose the lowest non-zero rank.
+            if (!L1->getAtomRank() || !L2->getAtomRank())
+              return !L2->getAtomRank();
+            return L1->getAtomRank() < L2->getAtomRank();
+          }();
+          Group = UseL1Atom ? L1->getAtomGroup() : L2->getAtomGroup();
+          Rank = UseL1Atom ? L1->getAtomRank() : L2->getAtomRank();
+        } else {
+          // If either instruction is part of a source atom, reassign it a new
+          // atom group. This essentially regresses to non-key-instructions
+          // behaviour (now that it's the only instruction in its group it'll
+          // probably get is_stmt applied).
+          Group = C.incNextAtomGroup();
+          Rank = 1;

SLTozer wrote:

This makes sense - although I still suspect there's some form of optimization we could do here (isolating the cases where atomGroups need to change), better to go with what definitely works here!

https://github.com/llvm/llvm-project/pull/133480
[llvm-branch-commits] [llvm] Propagate DebugLocs on phis in BreakCriticalEdges (PR #133492)
SLTozer wrote:

> I'm also not 100% sure if there's a good "policy" in place for PHI debug locs
> (paging @SLTozer)

In most cases we do not set debug locs on PHI nodes or expect them to have debug locs, but there are some cases where we explicitly set/check them - most often in loop optimizations, where PHIs may have source locations relevant to the loop induction variable, and in InstCombine where we perform transformations between ordinary instructions and PHIs. I don't _think_ we have a well-defined policy in place, but since they're sometimes useful it's a good rule-of-thumb to propagate them if doing so makes sense.

https://github.com/llvm/llvm-project/pull/133492
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -189,11 +189,15 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
   // Merge the two locations if possible, using the supplied
   // inlined-at location for the created location.
-  auto MergeLocPair = [&C](const DILocation *L1, const DILocation *L2,
-                           DILocation *InlinedAt) -> DILocation * {
+  auto *LocAIA = LocA->getInlinedAt();
+  auto *LocBIA = LocB->getInlinedAt();
+  auto MergeLocPair = [&C, LocAIA,
+                       LocBIA](const DILocation *L1, const DILocation *L2,
+                               DILocation *InlinedAt) -> DILocation * {
     if (L1 == L2)
       return DILocation::get(C, L1->getLine(), L1->getColumn(), L1->getScope(),
-                             InlinedAt);
+                             InlinedAt, L1->isImplicitCode(),

SLTozer wrote:

Technically copying `L1->isImplicitCode()` here is a change in behaviour - normally that flag would be effectively dropped here.

https://github.com/llvm/llvm-project/pull/133480
[llvm-branch-commits] [clang] [KeyInstr][Clang] Assign matrix element atom (PR #134650)
https://github.com/SLTozer approved this pull request.

https://github.com/llvm/llvm-project/pull/134650
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From afeb26be5f099d384115a55b19707bbb2a730245 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 83 ++--- llvm/utils/llvm-original-di-preservation.py | 24 +++--- 2 files changed, 88 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index c2dbdc57eb3b5..460b5e50e42d7 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,52 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap> SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +unsigned VirtualFrameNo = 0; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) { +OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedFrame << '\n'; + } +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -373,6 +427,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -448,14 +504,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ + {"metadata", "DILocation"}, + {"fn-name", FnName.str()}, + {"bb-name", BBName.str()}, + {"instr", InstName}, + {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + {"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << N
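The patch above collects raw stack addresses as instructions are visited and symbolizes them in one batch only when a bug entry is actually rendered. A minimal standalone sketch of that deferral pattern follows; the `LazySymbolizer` class, the fake `"frame@<addr>"` strings, and the call counter are all hypothetical and stand in for the real symbolizer, which is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

using Addr = std::uintptr_t;

// Hypothetical illustration of "collect now, symbolize lazily in one batch".
class LazySymbolizer {
  std::map<Addr, std::string> Symbolized; // Cache reused across all traces.
  std::set<Addr> Pending;                 // Addresses not yet symbolized.
  unsigned BatchCalls = 0;

  // Stand-in for an expensive batch symbolization call; it just fabricates a
  // name per address so the sketch stays self-contained.
  void symbolizeBatch() {
    ++BatchCalls;
    for (Addr A : Pending)
      Symbolized[A] = "frame@" + std::to_string(A);
    Pending.clear();
  }

public:
  // Cheap step, done for every trace: remember addresses we have not seen.
  void collect(const std::vector<Addr> &Trace) {
    for (Addr A : Trace)
      if (!Symbolized.count(A))
        Pending.insert(A);
  }

  // Expensive step, done only when output is needed: flush pending addresses
  // once, then format the trace. Assumes Trace was passed to collect() first;
  // otherwise at() throws.
  std::string render(const std::vector<Addr> &Trace) {
    if (!Pending.empty())
      symbolizeBatch();
    std::string Out;
    unsigned FrameNo = 0;
    for (Addr A : Trace)
      Out += "#" + std::to_string(FrameNo++) + " " + Symbolized.at(A) + "\n";
    return Out;
  }

  unsigned batchCalls() const { return BatchCalls; }
};
```

The point of the design, as far as the patch comments indicate, is that addresses shared between many traces are symbolized once and the cost is paid only if a report is produced.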
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 4410b5f351cad4cd611cbc773337197d5fa367b8 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index 999e03b6374a5..6d79aa6b2aa01 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -187,6 +209,19 @@ namespace llvm { #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 8e1ef24226789..ef382a9168f24 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. 
- void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index ff9f0ff5d5bc3..3b3e7a418feb5 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfoM
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 4410b5f351cad4cd611cbc773337197d5fa367b8 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index 999e03b6374a5..6d79aa6b2aa01 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -187,6 +209,19 @@ namespace llvm { #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 8e1ef24226789..ef382a9168f24 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. 
- void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index ff9f0ff5d5bc3..3b3e7a418feb5 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfoM
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From afeb26be5f099d384115a55b19707bbb2a730245 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 83 ++--- llvm/utils/llvm-original-di-preservation.py | 24 +++--- 2 files changed, 88 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index c2dbdc57eb3b5..460b5e50e42d7 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,52 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap> SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +unsigned VirtualFrameNo = 0; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) { +OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedFrame << '\n'; + } +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -373,6 +427,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -448,14 +504,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ + {"metadata", "DILocation"}, + {"fn-name", FnName.str()}, + {"bb-name", BBName.str()}, + {"instr", InstName}, + {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + {"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << N
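For readers following the Debugify change above: the patch collects raw addresses cheaply while scanning instructions, and only symbolizes them, in one batch, when a JSON bug entry actually needs the text. Below is a minimal sketch of that deferral pattern using standard containers. `AddressCache`, `fakeSymbolize` and `NumSymbolizerCalls` are hypothetical names used for illustration only; the patch itself uses `DenseMap`/`DenseSet` and `sys::symbolizeAddresses`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for an expensive symbolizer call. It only counts how
// often it runs and produces placeholder text for each pending address.
static int NumSymbolizerCalls = 0;
static void fakeSymbolize(const std::unordered_set<const void *> &Pending,
                          std::unordered_map<const void *, std::string> &Out) {
  ++NumSymbolizerCalls;
  for (const void *Addr : Pending)
    Out[Addr] =
        "frame@" + std::to_string(reinterpret_cast<std::uintptr_t>(Addr));
}

class AddressCache {
  std::unordered_map<const void *, std::string> Symbolized;
  std::unordered_set<const void *> Pending;

public:
  // Cheap step: remember the address only if it has not been resolved yet.
  void collect(const void *Addr) {
    if (!Symbolized.count(Addr))
      Pending.insert(Addr);
  }

  // Expensive step, deferred until a result is needed and done in one batch.
  const std::string &lookup(const void *Addr) {
    if (!Pending.empty()) {
      fakeSymbolize(Pending, Symbolized);
      Pending.clear();
    }
    auto It = Symbolized.find(Addr);
    assert(It != Symbolized.end() && "address was never collected");
    return It->second;
  }
};
```

With this shape, any number of `collect` calls followed by several `lookup` calls triggers a single symbolizer run, which matters when the symbolizer is an external process.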
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/135440
[llvm-branch-commits] [clang] [KeyInstr][Clang] Coerced store atoms (PR #134653)
https://github.com/SLTozer approved this pull request. https://github.com/llvm/llvm-project/pull/134653
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect symbolized stack traces (PR #143591)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)
https://github.com/SLTozer created https://github.com/llvm/llvm-project/pull/143593

None

>From c6f681d4eb307ca5f8859b3e4e7605fc2fa8441c Mon Sep 17 00:00:00 2001
From: Stephen Tozer
Date: Tue, 10 Jun 2025 20:02:21 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM

---
 llvm/include/llvm/IR/Instruction.h | 2 +-
 llvm/lib/CodeGen/BranchFolding.cpp | 7 +++
 llvm/lib/IR/Instruction.cpp        | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;

   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }

   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,

   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif

   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
       setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-    setDebugLoc(SrcInst.getDebugLoc());
+    setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }

 Instruction *Instruction::clone() const {
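The BranchFolding hunk above swaps `array_pod_sort` for `std::sort` once the element type stops being a POD. A rough sketch of why that matters follows. `PodElt` and `TrackedElt` are hypothetical stand-ins, not the real `MergePotentialElt`; the point is only that a type with an owning member is not trivially copyable, so a sort that shuffles raw bytes (as a `qsort`-style routine does) is no longer appropriate, while `std::sort` goes through the element's own move operations.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical element without any tracking payload: plain data only.
struct PodElt {
  unsigned Hash;
  int BlockNo;
};

// Hypothetical element carrying an owning payload, standing in for a type
// that gained origin-tracking state.
struct TrackedElt {
  unsigned Hash;
  int BlockNo;
  std::vector<std::string> Origin;
};

static_assert(std::is_trivially_copyable<PodElt>::value,
              "plain data can be moved bytewise");
static_assert(!std::is_trivially_copyable<TrackedElt>::value,
              "owning members require real move/copy operations");

// Order by hash first so equal hashes end up adjacent, then by block number.
inline bool operator<(const TrackedElt &L, const TrackedElt &R) {
  if (L.Hash != R.Hash)
    return L.Hash < R.Hash;
  return L.BlockNo < R.BlockNo;
}

// std::sort relocates elements through their move operations, so the owned
// payload travels with each element.
inline void sortTracked(std::vector<TrackedElt> &Elts) {
  std::sort(Elts.begin(), Elts.end());
}
```

The `static_assert` lines make the distinction a compile-time fact rather than a comment, which is one way to catch the same problem if a payload is added to such a type later.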
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer created https://github.com/llvm/llvm-project/pull/143594 None >From 4786afd40d73ade22952ca43af1164c6f9545679 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 77 ++--- llvm/utils/llvm-original-di-preservation.py | 22 +++--- 2 files changed, 80 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 729813a92f516..a9a66baf5571f 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,49 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedAddrs[StackTrace[Frame]]; +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -454,14 +507,20 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ +{"metadata", "DILocation"}, {"fn-name", FnName.str()}, +{"bb-name", BBName.str()}, {"instr", InstName}, {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +{"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << NameOfWrappedPass << " did not generate DILocation for " << *Instr @@ -474,11 +533,7 @@ static bool checkInstructions(const DebugInstMap &D
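The trace printer in this revision right-justifies each frame label (`#0`, `#1`, ...) so that the symbolized text lines up. The helper below is a small sketch of that idea only. `frameLabel` is a hypothetical name, and it derives the width by counting the digits of the largest frame index instead of the patch's `std::log10(Depth) + 2` expression, so the padding can differ by one column for some depths.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns "#<Frame>" left-padded with spaces so that every label in a trace of
// Depth frames has the same width as the widest one, "#<Depth - 1>".
inline std::string frameLabel(int Frame, int Depth) {
  std::string Label = "#" + std::to_string(Frame);
  std::size_t Width = 1 + std::to_string(Depth > 0 ? Depth - 1 : 0).size();
  if (Label.size() < Width)
    Label.insert(0, Width - Label.size(), ' ');
  return Label;
}
```

For a 16-frame trace this gives three-character labels, so single-digit frames get one leading space and two-digit frames get none.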
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: SymbolizeAddresses (PR #143591)
https://github.com/SLTozer created https://github.com/llvm/llvm-project/pull/143591 None >From d10a102637f2dcb215039df2cb248131c6a715ce Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 19:58:09 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses --- llvm/include/llvm/Support/Signals.h | 40 + llvm/lib/Support/Signals.cpp | 116 +++ llvm/lib/Support/Unix/Signals.inc| 15 llvm/lib/Support/Windows/Signals.inc | 5 ++ 4 files changed, 176 insertions(+) diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h index 6ce26acdd458e..a6f99d8bbdc95 100644 --- a/llvm/include/llvm/Support/Signals.h +++ b/llvm/include/llvm/Support/Signals.h @@ -14,7 +14,9 @@ #ifndef LLVM_SUPPORT_SIGNALS_H #define LLVM_SUPPORT_SIGNALS_H +#include "llvm/Config/llvm-config.h" #include "llvm/Support/Compiler.h" +#include #include #include @@ -22,6 +24,22 @@ namespace llvm { class StringRef; class raw_ostream; +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// Typedefs that are convenient but only used by the stack-trace-collection code +// added if DebugLoc origin-tracking is enabled. +template struct DenseMapInfo; +template class DenseSet; +namespace detail { +template struct DenseMapPair; +} +template +class DenseMap; +using AddressSet = DenseSet>; +using SymbolizedAddressMap = +DenseMap, + detail::DenseMapPair>; +#endif + namespace sys { /// This function runs all the registered interrupt handlers, including the @@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash(); ///specified, the entire frame is printed. LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0); +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#ifdef NDEBUG +#error DebugLoc origin-tracking should not be enabled in Release builds. +#endif +/// Populates the given array with a stack trace of the current program, up to +/// MaxDepth frames. Returns the number of frames returned, which will be +/// inserted into \p StackTrace from index 0. 
All entries after the returned +/// depth will be unmodified. NB: This is only intended to be used for +/// introspection of LLVM by Debugify, will not be enabled in release builds, +/// and should not be relied on for other purposes. +template +int getStackTrace(std::array &StackTrace); + +/// Takes a set of \p Addresses, symbolizes them and stores the result in the +/// provided \p SymbolizedAddresses map. +/// NB: This is only intended to be used for introspection of LLVM by +/// Debugify, will not be enabled in release builds, and should not be relied +/// on for other purposes. +void symbolizeAddresses(AddressSet &Addresses, +SymbolizedAddressMap &SymbolizedAddresses); +#endif + // Run all registered signal handlers. LLVM_ABI void RunSignalHandlers(); diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp index 9f9030e79d104..50b0d6e78ddd1 100644 --- a/llvm/lib/Support/Signals.cpp +++ b/llvm/lib/Support/Signals.cpp @@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace, return true; } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +void sys::symbolizeAddresses(AddressSet &Addresses, + SymbolizedAddressMap &SymbolizedAddresses) { + assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) && + "Debugify origin stacktraces require symbolization to be enabled."); + + // Convert Set of Addresses to ordered list. + SmallVector AddressList(Addresses.begin(), Addresses.end()); + if (AddressList.empty()) +return; + int NumAddresses = AddressList.size(); + llvm::sort(AddressList); + + // Use llvm-symbolizer tool to symbolize the stack traces. First look for it + // alongside our binary, then in $PATH. 
+ ErrorOr LLVMSymbolizerPathOrErr = std::error_code(); + if (const char *Path = getenv(LLVMSymbolizerPathEnv)) { +LLVMSymbolizerPathOrErr = sys::findProgramByName(Path); + } + if (!LLVMSymbolizerPathOrErr) +LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer"); + assert(!!LLVMSymbolizerPathOrErr && + "Debugify origin stacktraces require llvm-symbolizer."); + const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr; + + // Try to guess the main executable name, since we don't have argv0 available + // here. + std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr); + + BumpPtrAllocator Allocator; + StringSaver StrPool(Allocator); + std::vector Modules(NumAddresses, nullptr); + std::vector Offsets(NumAddresses, 0); + if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(), + Offsets.data(), MainExecutableName.c_str(), + StrPool)) +return; + int InputFD; + SmallString<32> InputFile, OutputFile; + sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFil
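The doc comment for `getStackTrace` in the header hunk above states a small contract: fill the array from index 0, return the number of frames written, and leave the remaining entries untouched. A sketch of that contract is below. `fillStackTrace` is a hypothetical helper that copies from a caller-supplied list rather than walking a real stack, so it only illustrates the array and return-value behaviour, not the platform-specific unwinding.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies at most MaxDepth entries from Available into StackTrace, starting at
// index 0, and returns how many were written. Entries past the returned depth
// are deliberately left as they were.
template <std::size_t MaxDepth>
int fillStackTrace(const std::vector<void *> &Available,
                   std::array<void *, MaxDepth> &StackTrace) {
  std::size_t Depth = std::min(Available.size(), MaxDepth);
  std::copy_n(Available.begin(), Depth, StackTrace.begin());
  return static_cast<int>(Depth);
}
```

Because the tail of the array is unspecified from the caller's point of view, consumers should iterate only up to the returned depth, which is what the `for (int Frame = 0; Frame < Depth; ++Frame)` loops in the Debugify patch do.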
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Core implementation (PR #143592)
https://github.com/SLTozer created https://github.com/llvm/llvm-project/pull/143592 None >From 8ff21d6e7630b0407931712eb652e0416ce661d8 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h | 62 + llvm/lib/IR/DebugLoc.cpp| 22 +++- 2 files changed, 76 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index c3d0fb80354a4..bc890dd671a81 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. 
+ class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -142,6 +164,32 @@ namespace llvm { static inline DebugLoc getDropped() { return DebugLoc(); } #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +DebugLoc(DebugLocKind Kind) : Loc(Kind) {} +DebugLocKind getKind() const { return Loc.Kind; } +#endif + +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#if !LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#error Cannot enable DebugLoc origin-tracking without coverage-tracking! +#endif + +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + +static DebugLoc getTemporary(); +static DebugLoc getUnknown(); +static DebugLoc getLineZero(); + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). 
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp index 0e65ddcec8934..05aad5d393547 100644 --- a/llvm/lib/IR/DebugLoc.cpp +++ b/llvm/lib/IR/DebugLoc.cpp @@ -9,11 +9,31 @@ #include "llvm/IR/DebugLoc.h" #include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfo.h" + +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#include "llvm/Support/Signals.h" + +namespace llvm { +DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) { + if (ShouldCollectTrace) { +auto &[Depth, StackTrace] = StackTraces.emplace_back(); +Depth = sys::getStackTrace(StackTrace); + } +} +void DbgLocOrigin::addTrace() { + if (StackTraces.empty()) +return; + auto &[Depth, StackTrace] = StackTraces.emplace_back(); + Dept
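To summarise the behaviour described in the comments of this patch: an origin trace is recorded only when a location is created empty, and `getCopied()` appends a further trace only if one already exists. The sketch below models just that rule. `Origin`, `Loc` and `fakeGetStackTrace` are hypothetical simplifications (no `DebugLocKind`, no metadata tracking, and fake addresses in place of a real stack walk), so this is an illustration of the bookkeeping, not the patch's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t MaxDepth = 16;
using Trace = std::pair<int, std::array<void *, MaxDepth>>;

// Hypothetical stand-in for a stack walk: writes three fake addresses.
static int fakeGetStackTrace(std::array<void *, MaxDepth> &Out) {
  static int Dummy[3];
  for (int I = 0; I < 3; ++I)
    Out[I] = &Dummy[I];
  return 3;
}

struct Origin {
  std::vector<Trace> Traces;

  // Record a trace at creation time only when asked to.
  explicit Origin(bool ShouldCollect) {
    if (ShouldCollect)
      record();
  }

  // Append a trace for a copy, but only if this origin is already tracked.
  void addTrace() {
    if (Traces.empty())
      return;
    record();
  }

private:
  void record() {
    Traces.emplace_back();
    Traces.back().first = fakeGetStackTrace(Traces.back().second);
  }
};

struct Loc {
  const void *Node;
  Origin Org;

  // A null node is the "unintentionally empty" case worth tracking.
  explicit Loc(const void *N) : Node(N), Org(N == nullptr) {}

  Loc getCopied() const {
    Loc New = *this;
    New.Org.addTrace();
    return New;
  }
};
```

Valid locations therefore pay nothing beyond an empty vector, while an empty location accumulates one trace per copy, giving a history of where the empty value travelled.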
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: SymbolizeAddresses (PR #143591)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143591 >From b2ecf5ed0da6fd3e03192ae921680b7576c12365 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 19:58:09 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses --- llvm/include/llvm/Support/Signals.h | 40 + llvm/lib/Support/Signals.cpp | 116 +++ llvm/lib/Support/Unix/Signals.inc| 15 llvm/lib/Support/Windows/Signals.inc | 5 ++ 4 files changed, 176 insertions(+) diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h index 6ce26acdd458e..a6f99d8bbdc95 100644 --- a/llvm/include/llvm/Support/Signals.h +++ b/llvm/include/llvm/Support/Signals.h @@ -14,7 +14,9 @@ #ifndef LLVM_SUPPORT_SIGNALS_H #define LLVM_SUPPORT_SIGNALS_H +#include "llvm/Config/llvm-config.h" #include "llvm/Support/Compiler.h" +#include #include #include @@ -22,6 +24,22 @@ namespace llvm { class StringRef; class raw_ostream; +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// Typedefs that are convenient but only used by the stack-trace-collection code +// added if DebugLoc origin-tracking is enabled. +template struct DenseMapInfo; +template class DenseSet; +namespace detail { +template struct DenseMapPair; +} +template +class DenseMap; +using AddressSet = DenseSet>; +using SymbolizedAddressMap = +DenseMap, + detail::DenseMapPair>; +#endif + namespace sys { /// This function runs all the registered interrupt handlers, including the @@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash(); ///specified, the entire frame is printed. LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0); +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#ifdef NDEBUG +#error DebugLoc origin-tracking should not be enabled in Release builds. +#endif +/// Populates the given array with a stack trace of the current program, up to +/// MaxDepth frames. Returns the number of frames returned, which will be +/// inserted into \p StackTrace from index 0. 
All entries after the returned +/// depth will be unmodified. NB: This is only intended to be used for +/// introspection of LLVM by Debugify, will not be enabled in release builds, +/// and should not be relied on for other purposes. +template +int getStackTrace(std::array &StackTrace); + +/// Takes a set of \p Addresses, symbolizes them and stores the result in the +/// provided \p SymbolizedAddresses map. +/// NB: This is only intended to be used for introspection of LLVM by +/// Debugify, will not be enabled in release builds, and should not be relied +/// on for other purposes. +void symbolizeAddresses(AddressSet &Addresses, +SymbolizedAddressMap &SymbolizedAddresses); +#endif + // Run all registered signal handlers. LLVM_ABI void RunSignalHandlers(); diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp index 9f9030e79d104..50b0d6e78ddd1 100644 --- a/llvm/lib/Support/Signals.cpp +++ b/llvm/lib/Support/Signals.cpp @@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace, return true; } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +void sys::symbolizeAddresses(AddressSet &Addresses, + SymbolizedAddressMap &SymbolizedAddresses) { + assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) && + "Debugify origin stacktraces require symbolization to be enabled."); + + // Convert Set of Addresses to ordered list. + SmallVector AddressList(Addresses.begin(), Addresses.end()); + if (AddressList.empty()) +return; + int NumAddresses = AddressList.size(); + llvm::sort(AddressList); + + // Use llvm-symbolizer tool to symbolize the stack traces. First look for it + // alongside our binary, then in $PATH. 
+ ErrorOr LLVMSymbolizerPathOrErr = std::error_code(); + if (const char *Path = getenv(LLVMSymbolizerPathEnv)) { +LLVMSymbolizerPathOrErr = sys::findProgramByName(Path); + } + if (!LLVMSymbolizerPathOrErr) +LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer"); + assert(!!LLVMSymbolizerPathOrErr && + "Debugify origin stacktraces require llvm-symbolizer."); + const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr; + + // Try to guess the main executable name, since we don't have argv0 available + // here. + std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr); + + BumpPtrAllocator Allocator; + StringSaver StrPool(Allocator); + std::vector Modules(NumAddresses, nullptr); + std::vector Offsets(NumAddresses, 0); + if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(), + Offsets.data(), MainExecutableName.c_str(), + StrPool)) +return; + int InputFD; + SmallString<32> InputFile, OutputFile; + sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile); +
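One step in `symbolizeAddresses` above is converting the unordered address set into a sorted list before handing it to the symbolizer, which gives a deterministic order to match results against. A minimal sketch of that step with standard containers follows. `toSortedList` is a hypothetical name, and `std::less` is used because it provides a total order over pointers even when they do not point into the same object.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <unordered_set>
#include <vector>

// Copies the set into a vector and sorts it by address, so the same set of
// addresses always yields the same sequence within a run.
inline std::vector<const void *>
toSortedList(const std::unordered_set<const void *> &Addresses) {
  std::vector<const void *> List(Addresses.begin(), Addresses.end());
  std::sort(List.begin(), List.end(), std::less<const void *>());
  return List;
}
```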
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From c973e73b792cc1440af7c9001a0ddcfef94a9e21 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 77 ++--- llvm/utils/llvm-original-di-preservation.py | 22 +++--- 2 files changed, 80 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 729813a92f516..a9a66baf5571f 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,49 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedAddrs[StackTrace[Frame]]; +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -454,14 +507,20 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ +{"metadata", "DILocation"}, {"fn-name", FnName.str()}, +{"bb-name", BBName.str()}, {"instr", InstName}, {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +{"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << NameOfWrappedPass << " did not generate DILocation for " << *Instr @@ -474,11 +533,7 @@ static bool checkInstructions(const DebugInstMap &DILocsB
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From c973e73b792cc1440af7c9001a0ddcfef94a9e21 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 77 ++--- llvm/utils/llvm-original-di-preservation.py | 22 +++--- 2 files changed, 80 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 729813a92f516..a9a66baf5571f 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,49 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedAddrs[StackTrace[Frame]]; +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -454,14 +507,20 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ +{"metadata", "DILocation"}, {"fn-name", FnName.str()}, +{"bb-name", BBName.str()}, {"instr", InstName}, {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +{"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << NameOfWrappedPass << " did not generate DILocation for " << *Instr @@ -474,11 +533,7 @@ static bool checkInstructions(const DebugInstMap &DILocsB
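The patch above records addresses cheaply while collecting metadata and only symbolizes them, in one batch, when a bug entry is actually written. A minimal standalone sketch of that deferral pattern follows. It uses only the standard library: the names `collectAddress`, `describe`, `symbolizeBatch`, and `batchCalls` are hypothetical, and the fake "symbolizer" just formats the pointer value, whereas the patch hands the batch to `sys::symbolizeAddresses`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-ins for the file-scope caches in the patch.
static std::unordered_map<const void *, std::string> Symbolized;
static std::unordered_set<const void *> Pending;
static int BatchCalls = 0; // Counts how often the expensive step runs.

// Assumption: a fake batch symbolizer that only formats the pointer value.
static void symbolizeBatch() {
  if (Pending.empty())
    return;
  ++BatchCalls;
  for (const void *Addr : Pending)
    Symbolized[Addr] =
        "frame@" + std::to_string(reinterpret_cast<std::uintptr_t>(Addr));
  Pending.clear();
}

// Cheap step: remember an address for later unless it is already resolved.
void collectAddress(const void *Addr) {
  if (!Symbolized.count(Addr))
    Pending.insert(Addr);
}

// Flush pending addresses at the last possible moment, then look one up.
std::string describe(const void *Addr) {
  symbolizeBatch();
  auto It = Symbolized.find(Addr);
  return It == Symbolized.end() ? "<unknown>" : It->second;
}

int batchCalls() { return BatchCalls; }
```

In this sketch, collecting any number of addresses costs only set insertions, and the expensive step runs once per flush rather than once per address.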
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143593

>From eff0813afb187a5bba4f59d63120d9dd131a3a67 Mon Sep 17 00:00:00 2001
From: Stephen Tozer
Date: Tue, 10 Jun 2025 20:02:21 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM

---
 llvm/include/llvm/IR/Instruction.h | 2 +-
 llvm/lib/CodeGen/BranchFolding.cpp | 7 +++
 llvm/lib/IR/Instruction.cpp        | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,
 
   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif
 
   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
       setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-    setDebugLoc(SrcInst.getDebugLoc());
+    setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }
 
 Instruction *Instruction::clone() const {
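The BranchFolding change above swaps `array_pod_sort` for `std::sort` because the sorted element stops being a POD type once origin-tracking is enabled. As a rough illustration of why that matters, the standalone sketch below uses a hypothetical `Elt` type (not the real `MergePotentialElt`) with an owning member. Such a type is not trivially copyable, so it should go through `std::sort`, which moves elements via their own move operations, rather than through a `qsort`-style routine that shuffles raw bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an element that carries origin data.
struct Elt {
  unsigned Hash;
  std::vector<int> Origin; // Owning member: Elt is not trivially copyable.
  bool operator<(const Elt &RHS) const { return Hash < RHS.Hash; }
};

static_assert(!std::is_trivially_copyable<Elt>::value,
              "a byte-wise (qsort-style) sort is not appropriate for Elt");

// Sorts by hash using std::sort and returns the hashes in sorted order.
std::vector<unsigned> sortedHashes(std::vector<Elt> Elts) {
  std::sort(Elts.begin(), Elts.end());
  std::vector<unsigned> Out;
  for (const Elt &E : Elts)
    Out.push_back(E.Hash);
  return Out;
}
```

The `static_assert` documents the property that motivates the switch; the sort itself behaves the same as it would for a plain struct.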
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Core implementation (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 6ade803aa6c7e0137e4e572d379238a9d1fc202e Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h | 49 - llvm/lib/IR/DebugLoc.cpp| 22 ++- 2 files changed, 63 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index c3d0fb80354a4..1930199607204 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. 
+ class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -142,6 +164,19 @@ namespace llvm { static inline DebugLoc getDropped() { return DebugLoc(); } #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). 
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp index 0e65ddcec8934..05aad5d393547 100644 --- a/llvm/lib/IR/DebugLoc.cpp +++ b/llvm/lib/IR/DebugLoc.cpp @@ -9,11 +9,31 @@ #include "llvm/IR/DebugLoc.h" #include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfo.h" + +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#include "llvm/Support/Signals.h" + +namespace llvm { +DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) { + if (ShouldCollectTrace) { +auto &[Depth, StackTrace] = StackTraces.emplace_back(); +Depth = sys::getStackTrace(StackTrace); + } +} +void DbgLocOrigin::addTrace() { + if (StackTraces.empty()) +return; + auto &[Depth, StackTrace] = StackTraces.emplace_back(); + Depth = sys::getStackTrace(StackTrace); +} +} // namespace llvm +#endif + using namespace llvm; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L) -: TrackingMDNodeRef(const_cast(L)), +: TrackingMDNodeRef(const_cast(L)), DbgLocOrigin(!L), Kind(DebugLocKind::Normal) {} #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
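The core rule in this patch is that a trace is collected only when a location starts out empty, and that later copies append a further trace only if one is already being tracked. The standalone sketch below mimics that rule with standard-library types. `Origin`, `Loc`, and `captureTrace` are hypothetical names, the trace element type is an assumption about the garbled `StackTracesTy`, and `captureTrace` records a single dummy frame instead of calling `sys::getStackTrace`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t MaxDepth = 16; // Mirrors DbgLocOrigin::MaxDepth.
// Assumption: each trace is a (depth, fixed-size frame array) pair.
using Trace = std::pair<int, std::array<void *, MaxDepth>>;

// Hypothetical capture routine: records one dummy frame and returns depth 1.
static int captureTrace(std::array<void *, MaxDepth> &Frames) {
  static int Dummy;
  Frames[0] = &Dummy;
  return 1;
}

struct Origin {
  std::vector<Trace> Traces;
  explicit Origin(bool ShouldCollect) {
    if (ShouldCollect)
      append();
  }
  // Only extend the history of locations that are already being tracked.
  void addTrace() {
    if (!Traces.empty())
      append();
  }

private:
  void append() {
    Traces.emplace_back();
    Traces.back().first = captureTrace(Traces.back().second);
  }
};

struct Loc {
  const void *Node; // Null models an empty DebugLoc.
  Origin O;
  explicit Loc(const void *N) : Node(N), O(N == nullptr) {}
  // Copying an empty location records where the copy happened.
  Loc getCopied() const {
    Loc New = *this;
    New.O.addTrace();
    return New;
  }
};
```

Under these assumptions a valid location never pays for trace collection, while an empty one accumulates one trace per `getCopied()` call, leaving the source object unchanged.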
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)
https://github.com/SLTozer closed https://github.com/llvm/llvm-project/pull/143593
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143594
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer ready_for_review https://github.com/llvm/llvm-project/pull/143594
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer ready_for_review https://github.com/llvm/llvm-project/pull/143592
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143592
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143594
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From 4bbd28b23847c069445d9babe9aa8a8aac5036c1 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 80 ++--- llvm/utils/llvm-original-di-preservation.py | 24 --- 2 files changed, 85 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 729813a92f516..01ed9de51c0b2 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,49 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedAddrs[StackTrace[Frame]]; +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ + {"metadata", "DILocation"}, + {"fn-name", FnName.str()}, + {"bb-name", BBName.str()}, + {"instr", InstName}, + {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + {"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << NameOfWrappedPass << " did not generate DILocation for " << *Instr @@ -474,11 +536,7 @@ static bool checkInstructi
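In `symbolizeStackTrace` above, each frame label is right-justified to a width of `std::log10(Depth) + 2`, which works out to the digit count of `Depth` plus one character for the `#`. The standalone sketch below reproduces that arithmetic without LLVM's `right_justify` and `formatv`. The helper name `frameLabel` is hypothetical, and the explicit truncation to an integer width is an assumption about how the floating-point result is converted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Builds a "#N" label padded on the left so that all labels in a trace of
// the given depth line up. Assumes Depth > 0 and 0 <= Frame < Depth.
std::string frameLabel(int Frame, int Depth) {
  assert(Depth > 0 && Frame >= 0 && Frame < Depth);
  // Truncated log10(Depth) is (digits in Depth) - 1; add 2 for '#' and the
  // remaining digit.
  std::size_t Width = static_cast<std::size_t>(std::log10(Depth)) + 2;
  std::string Label = "#" + std::to_string(Frame);
  if (Label.size() < Width)
    Label.insert(0, Width - Label.size(), ' ');
  return Label;
}
```

For a 16-frame trace this gives a width of 3, so frame 0 is padded by one space and frame 15 fits exactly.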
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143591 >From 12f5a10c1dc2ae6943947c85a5bd05a295ae1c7c Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 19:58:09 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses --- llvm/include/llvm/Support/Signals.h | 40 + llvm/lib/Support/Signals.cpp | 116 +++ llvm/lib/Support/Unix/Signals.inc| 15 llvm/lib/Support/Windows/Signals.inc | 5 ++ 4 files changed, 176 insertions(+) diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h index 6ce26acdd458e..a6f99d8bbdc95 100644 --- a/llvm/include/llvm/Support/Signals.h +++ b/llvm/include/llvm/Support/Signals.h @@ -14,7 +14,9 @@ #ifndef LLVM_SUPPORT_SIGNALS_H #define LLVM_SUPPORT_SIGNALS_H +#include "llvm/Config/llvm-config.h" #include "llvm/Support/Compiler.h" +#include #include #include @@ -22,6 +24,22 @@ namespace llvm { class StringRef; class raw_ostream; +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// Typedefs that are convenient but only used by the stack-trace-collection code +// added if DebugLoc origin-tracking is enabled. +template struct DenseMapInfo; +template class DenseSet; +namespace detail { +template struct DenseMapPair; +} +template +class DenseMap; +using AddressSet = DenseSet>; +using SymbolizedAddressMap = +DenseMap, + detail::DenseMapPair>; +#endif + namespace sys { /// This function runs all the registered interrupt handlers, including the @@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash(); ///specified, the entire frame is printed. LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0); +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#ifdef NDEBUG +#error DebugLoc origin-tracking should not be enabled in Release builds. +#endif +/// Populates the given array with a stack trace of the current program, up to +/// MaxDepth frames. Returns the number of frames returned, which will be +/// inserted into \p StackTrace from index 0. 
All entries after the returned +/// depth will be unmodified. NB: This is only intended to be used for +/// introspection of LLVM by Debugify, will not be enabled in release builds, +/// and should not be relied on for other purposes. +template +int getStackTrace(std::array &StackTrace); + +/// Takes a set of \p Addresses, symbolizes them and stores the result in the +/// provided \p SymbolizedAddresses map. +/// NB: This is only intended to be used for introspection of LLVM by +/// Debugify, will not be enabled in release builds, and should not be relied +/// on for other purposes. +void symbolizeAddresses(AddressSet &Addresses, +SymbolizedAddressMap &SymbolizedAddresses); +#endif + // Run all registered signal handlers. LLVM_ABI void RunSignalHandlers(); diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp index 9f9030e79d104..50b0d6e78ddd1 100644 --- a/llvm/lib/Support/Signals.cpp +++ b/llvm/lib/Support/Signals.cpp @@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace, return true; } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +void sys::symbolizeAddresses(AddressSet &Addresses, + SymbolizedAddressMap &SymbolizedAddresses) { + assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) && + "Debugify origin stacktraces require symbolization to be enabled."); + + // Convert Set of Addresses to ordered list. + SmallVector AddressList(Addresses.begin(), Addresses.end()); + if (AddressList.empty()) +return; + int NumAddresses = AddressList.size(); + llvm::sort(AddressList); + + // Use llvm-symbolizer tool to symbolize the stack traces. First look for it + // alongside our binary, then in $PATH. 
+ ErrorOr LLVMSymbolizerPathOrErr = std::error_code(); + if (const char *Path = getenv(LLVMSymbolizerPathEnv)) { +LLVMSymbolizerPathOrErr = sys::findProgramByName(Path); + } + if (!LLVMSymbolizerPathOrErr) +LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer"); + assert(!!LLVMSymbolizerPathOrErr && + "Debugify origin stacktraces require llvm-symbolizer."); + const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr; + + // Try to guess the main executable name, since we don't have argv0 available + // here. + std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr); + + BumpPtrAllocator Allocator; + StringSaver StrPool(Allocator); + std::vector Modules(NumAddresses, nullptr); + std::vector Offsets(NumAddresses, 0); + if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(), + Offsets.data(), MainExecutableName.c_str(), + StrPool)) +return; + int InputFD; + SmallString<32> InputFile, OutputFile; + sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile); +
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 5e5629149de6f5929a4a1a1986281a201046fd01 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index c3d0fb80354a4..1930199607204 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -142,6 +164,19 @@ namespace llvm { static inline DebugLoc getDropped() { return DebugLoc(); } #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). 
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 10fc9c1298607..1d22bdb0c3f43 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. - void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index e0f7466ceacff..47fc0ec7549e0 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143591

>From f47991b0264f1fbf14e93941e7e9398d4e8e0ae3 Mon Sep 17 00:00:00 2001
From: Stephen Tozer
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp         | 116 +++
 llvm/lib/Support/Unix/Signals.inc    |  15
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H

+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include
 #include
 #include

@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template struct DenseMapInfo;
+template class DenseSet;
+namespace detail {
+template struct DenseMapPair;
+}
+template
+class DenseMap;
+using AddressSet = DenseSet>;
+using SymbolizedAddressMap =
+    DenseMap,
+             detail::DenseMapPair>;
+#endif
+
 namespace sys {

 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///        specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template
+int getStackTrace(std::array &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+                        SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();

diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+                             SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+         "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+    return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+    LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+    LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+         "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+                             Offsets.data(), MainExecutableName.c_str(),
+                             StrPool))
+    return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594

>From e46273bec027d0accfbe6d3de9880c29977c6858 Mon Sep 17 00:00:00 2001
From: Stephen Tozer
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp      | 80 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 ---
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..01ed9de51c0b2 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif

 #define DEBUG_TYPE "debugify"

@@ -59,6 +67,49 @@ cl::opt DebugifyLevel(

 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap SymbolizedAddrs;
+static DenseSet UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+    sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+    UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+    if (TraceIdx != 0)
+      OS << "\n";
+    auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+             "Expected each address to have been symbolized.");
+      OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+         << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+    }
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      void *Addr = StackTrace[Frame];
+      if (!SymbolizedAddrs.contains(Addr))
+        UnsymbolizedAddrs.insert(Addr);
+    }
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
         LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
         DebugInfoBeforePass.InstToDelete.insert({&I, &I});

+        // Track the addresses to symbolize, if the feature is enabled.
+        collectStackAddresses(I);
         DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
       }
     }
@@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
     auto BBName = BB->hasName() ? BB->getName() : "no-name";
     auto InstName = Instruction::getOpcodeName(Instr->getOpcode());

+    auto CreateJSONBugEntry = [&](const char *Action) {
+      Bugs.push_back(llvm::json::Object({
+          {"metadata", "DILocation"},
+          {"fn-name", FnName.str()},
+          {"bb-name", BBName.str()},
+          {"instr", InstName},
+          {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+          {"origin", symbolizeStackTrace(Instr)},
+#endif
+      }));
+    };
+
     auto InstrIt = DILocsBefore.find(Instr);
     if (InstrIt == DILocsBefore.end()) {
       if (ShouldWriteIntoJSON)
-        Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-                                           {"fn-name", FnName.str()},
-                                           {"bb-name", BBName.str()},
-                                           {"instr", InstName},
-                                           {"action", "not-generate"}}));
+        CreateJSONBugEntry("not-generate");
       else
         dbg() << "WARNING: " << NameOfWrappedPass
               << " did not generate DILocation for " << *Instr
@@ -474,11 +536,7 @@ static bool checkInstructi
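The Debugify change above collects stack addresses eagerly but defers symbolization until a trace is first printed, so the external symbolizer is invoked once per batch rather than once per address. A minimal sketch of that pattern in standard C++ follows; `FakeSymbolizer` and `AddressCache` are hypothetical names used only for illustration, and integers stand in for the `void *` addresses and `DenseMap`/`DenseSet` containers used in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for an expensive batch symbolizer (for example an
// external llvm-symbolizer process). It counts how often it is invoked.
struct FakeSymbolizer {
  int Calls = 0;
  void run(const std::set<std::uintptr_t> &Addrs,
           std::map<std::uintptr_t, std::string> &Out) {
    ++Calls;
    for (std::uintptr_t A : Addrs)
      Out[A] = "fn_" + std::to_string(A);
  }
};

// Collect addresses eagerly; symbolize all pending ones in a single batch
// the first time any result is requested.
class AddressCache {
  std::map<std::uintptr_t, std::string> Symbolized;
  std::set<std::uintptr_t> Unsymbolized;
  FakeSymbolizer &Sym;

public:
  explicit AddressCache(FakeSymbolizer &S) : Sym(S) {}

  // Remember an address unless its symbol is already cached.
  void collect(std::uintptr_t Addr) {
    if (!Symbolized.count(Addr))
      Unsymbolized.insert(Addr);
  }

  // Flush the pending set at the latest possible moment, then look up.
  const std::string &lookup(std::uintptr_t Addr) {
    if (!Unsymbolized.empty()) {
      Sym.run(Unsymbolized, Symbolized);
      Unsymbolized.clear();
    }
    assert(Symbolized.count(Addr) && "address was never collected");
    return Symbolized.at(Addr);
  }
};
```

With this shape, collecting the same address repeatedly costs one set insertion at most, and any number of lookups between two collection phases triggers only one symbolizer run.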
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592

>From 5e5629149de6f5929a4a1a1986281a201046fd01 Mon Sep 17 00:00:00 2001
From: Stephen Tozer
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h    | 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp           | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;

 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+    static constexpr unsigned long MaxDepth = 16;
+    using StackTracesTy =
+        SmallVector>, 0>;
+    StackTracesTy StackTraces;
+    DbgLocOrigin(bool ShouldCollectTrace);
+    void addTrace();
+    const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+    DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
     Temporary
   };

-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+                                   public DbgLocOrigin {
   public:
     DebugLocKind Kind;
     // Default constructor for empty DebugLocs.
     DILocAndCoverageTracking()
-        : TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-    // Valid or nullptr MDNode*, normal DebugLocKind.
+        : TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+          Kind(DebugLocKind::Normal) {}
+    // Valid or nullptr MDNode*, no annotative DebugLocKind.
     DILocAndCoverageTracking(const MDNode *Loc)
-        : TrackingMDNodeRef(const_cast(Loc)),
+        : TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc),
           Kind(DebugLocKind::Normal) {}
     LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
     // Explicit DebugLocKind, which always means a nullptr MDNode*.
     DILocAndCoverageTracking(DebugLocKind Kind)
-        : TrackingMDNodeRef(nullptr), Kind(Kind) {}
+        : TrackingMDNodeRef(nullptr),
+          DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type {
     using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
     static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+    const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+      return Loc.getOriginStackTraces();
+    }
+    DebugLoc getCopied() const {
+      DebugLoc NewDL = *this;
+      NewDL.Loc.addTrace();
+      return NewDL;
+    }
+#else
+    DebugLoc getCopied() const { return *this; }
+#endif
+
     /// Get the underlying \a DILocation.
     ///
     /// \pre !*this or \c isa(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;

   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }

   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,2348 @@
+//===-- LVIRReader.cpp ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===--===//
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type      = 71
+//   DW_TAG_common_block     = 26
+//   DW_TAG_file_type        = 41
+//   DW_TAG_friend           = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module           = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+    Element->setTag(Tag);
+    addMD(DN, Element);
+
+    StringRef Name = getMDName(DN);
+    if (!Name.empty())
+      Element->setName(Name);
+
+    // Record any file information.
+    if (const DIFile *File = getMDFile(DN))
+      getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+    LanguageIsFortran = true;
+    break;
+  default:
+    LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+    return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+    dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+    File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+    dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+    dbgs() << "Filename: '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+    FileIndex = getFileIndex(CompileUnit);
+    std::string Directory(File->getDirectory());
+    if (Directory.empty())
+      Directory = std::string(CompileUnit->getCompilationDirectory());
+
+    std::string FullName;
+    raw_string_ostream Out(FullName);
+    Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+    CompileUnit->addFilename(transformPath(FullName));
+    CompileUnitFiles.emplace(File, ++FileIndex);
+    updateFileIndex(CompileUnit, FileIndex);
+  } else {
+    FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+                               const DIFile *File) {
+  if (Line == 0)
+    return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the commmand line o
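The `getOrCreateSourceID` function quoted above hands out stable 1-based indexes, with 0 reserved for "no file". A minimal sketch of that get-or-create scheme in standard C++ is shown below; `SourceTable` is a hypothetical name, and plain path strings stand in for the `DIFile` pointers and per-compile-unit bookkeeping of the real reader.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical table that assigns each distinct source path a stable
// 1-based index; index 0 is reserved to mean "no file".
class SourceTable {
  std::unordered_map<std::string, std::size_t> IndexOf;
  std::vector<std::string> Names; // Names[I] holds the path with index I + 1.

public:
  std::size_t getOrCreate(const std::string &Path) {
    if (Path.empty())
      return 0; // Mirrors the early return for a null file.
    auto It = IndexOf.find(Path);
    if (It != IndexOf.end())
      return It->second; // Already known: reuse its index.
    Names.push_back(Path);
    std::size_t Index = Names.size(); // First path gets 1, second gets 2, ...
    IndexOf.emplace(Path, Index);
    return Index;
  }

  std::size_t size() const { return Names.size(); }
};
```

Looking up the same path twice returns the same index, so callers can treat the result as an identifier for the file within one table.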
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }

+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+                             SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+         "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+    return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+    LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+    LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+         "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+                             Offsets.data(), MainExecutableName.c_str(),
+                             StrPool))
+    return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+  sys::fs::createTemporaryFile("symbolizer-output", "", OutputFile);
+  FileRemover InputRemover(InputFile.c_str());
+  FileRemover OutputRemover(OutputFile.c_str());
+
+  {
+    raw_fd_ostream Input(InputFD, true);
+    for (int i = 0; i < NumAddresses; i++) {
+      if (Modules[i])
+        Input << Modules[i] << " " << (void *)Offsets[i] << "\n";
+    }
+  }
+
+  std::optional Redirects[] = {InputFile.str(), OutputFile.str(),
+                               StringRef("")};
+  StringRef Args[] = {"llvm-symbolizer", "--functions=linkage", "--inlining",
+#ifdef _WIN32
+                      // Pass --relative-address on Windows so that we don't
+                      // have to add ImageBase from PE file.
+                      // FIXME: Make this the default for llvm-symbolizer.
+                      "--relative-address",
+#endif

SLTozer wrote:

We don't need it for now, but since this is copied from the existing invocation of the symbolizer (and I'm currently looking at extracting this and a few other parts out), it's more effort to remove support for Windows than to keep it; I do intend to add Windows support later on, it's just trickier than using `backtrace()` and not an urgent feature.

https://github.com/llvm/llvm-project/pull/143591
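For context on the `backtrace()` remark above, here is a minimal sketch of capturing a bounded stack trace into a fixed-size `std::array`, in the spirit of the `getStackTrace` declaration in this PR. It assumes a platform that provides `<execinfo.h>` (glibc, macOS, the BSDs); `captureStackTrace` is a hypothetical name and this is not the patch's actual implementation, which lives in `Unix/Signals.inc` and is not shown here.

```cpp
#include <array>
#include <cstddef>
#include <execinfo.h> // backtrace(); not available on Windows.

// Fills Trace from index 0 with return addresses of the current call stack
// and returns the number of frames written (at most N). Entries past the
// returned depth are left untouched.
template <std::size_t N>
int captureStackTrace(std::array<void *, N> &Trace) {
  return backtrace(Trace.data(), static_cast<int>(N));
}
```

A caller would typically store the returned depth next to the array, since only the first `Depth` entries are meaningful.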
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
SLTozer wrote:

> A bigger question though is whether this can be tested.

That's a good question. In theory yes, but it sounds brittle: assuming we're testing the output of backtrace/symbolize, we'd have a test that could fail if any of a variety of function names changed. Another argument against testing is that this code exists to generate formatted, human-readable output; most of the complicated parts are in `llvm-symbolizer` and `libbacktrace`, while these functions are essentially just invoking them and formatting the output. All the same, testing is generally good; I'll consider how best to test this, and I will happily take suggestions if you or others have any.

https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
https://github.com/SLTozer ready_for_review https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143593 >From eff0813afb187a5bba4f59d63120d9dd131a3a67 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:21 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM --- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 +++ llvm/lib/IR/Instruction.cpp| 2 +- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 10fc9c1298607..1d22bdb0c3f43 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. - void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index e0f7466ceacff..47fc0ec7549e0 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfoMetadata.h" #include "llvm/IR/DebugLoc.h" #include "llvm/IR/Function.h" @@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB, // Sort by hash value so that blocks with identical end sequences sort // together. +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + // If origin-tracking is enabled then MergePotentialElt is no longer a POD + // type, so we need std::sort instead. 
+ std::sort(MergePotentials.begin(), MergePotentials.end()); +#else array_pod_sort(MergePotentials.begin(), MergePotentials.end()); +#endif // Walk through equivalence sets looking for actual exact matches. while (MergePotentials.size() > 1) { diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp index 109d516c61b7c..123bc7ecce01a 100644 --- a/llvm/lib/IR/Instruction.cpp +++ b/llvm/lib/IR/Instruction.cpp @@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst, setMetadata(MD.first, MD.second); } if (WL.empty() || WLS.count(LLVMContext::MD_dbg)) -setDebugLoc(SrcInst.getDebugLoc()); +setDebugLoc(SrcInst.getDebugLoc().getCopied()); } Instruction *Instruction::clone() const { ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
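The switch from `array_pod_sort` to `std::sort` in the BranchFolding.cpp hunk above follows from the element type: `array_pod_sort` is built on `qsort`, which relocates elements bytewise and is therefore only safe for POD-like types. The following is a minimal sketch of that constraint, not the patch's code; `Candidate` and `sortedHashes` are hypothetical names standing in for `MergePotentialElt` once it carries a `DebugLoc` that owns origin stack traces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an element that owns heap memory, as a DebugLoc
// carrying origin stack traces would. This is not LLVM's MergePotentialElt.
struct Candidate {
  uint32_t Hash;
  std::vector<const void *> OriginTrace; // makes the type non-trivially-copyable

  bool operator<(const Candidate &Other) const { return Hash < Other.Hash; }
};

// qsort-style sorts move elements bytewise, which is only valid for trivially
// copyable types; this type does not qualify.
static_assert(!std::is_trivially_copyable<Candidate>::value,
              "Candidate owns memory, so it must be sorted with std::sort");

// Sorts the candidates by hash using proper move construction/assignment and
// returns the hashes in their sorted order.
inline std::vector<uint32_t> sortedHashes(std::vector<Candidate> Candidates) {
  std::sort(Candidates.begin(), Candidates.end());
  std::vector<uint32_t> Hashes;
  for (const Candidate &C : Candidates)
    Hashes.push_back(C.Hash);
  assert(std::is_sorted(Hashes.begin(), Hashes.end()));
  return Hashes;
}
```

The `static_assert` documents why the `#if` in the patch is needed: with origin-tracking disabled the element stays POD-like and the cheaper `array_pod_sort` remains valid.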
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143592
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.

SLTozer wrote:

The purpose of this is simply to prevent people from accidentally shipping any release compiler with this code compiled in.

https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
@@ -507,6 +507,21 @@ static int dl_iterate_phdr_cb(dl_phdr_info *info, size_t size, void *arg) {
 return 0;
 }
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#if !defined(HAVE_BACKTRACE)
+#error DebugLoc origin-tracking currently requires `backtrace()`.
+#endif

SLTozer wrote:

Possibly; I'll look into it. If possible, though, I'd prefer to replace the use of `backtrace()` with in-program stack traversal, which would require `-fno-omit-frame-pointer` (I'm not sure how you'd check that in CMake either) but would be significantly faster than making the library call. So if the check turns out to be non-trivial (e.g. if there's some ordering issue in the CMake config), it may not be worth it.

https://github.com/llvm/llvm-project/pull/143591
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace, return true; } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +void sys::symbolizeAddresses(AddressSet &Addresses, + SymbolizedAddressMap &SymbolizedAddresses) { + assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) && + "Debugify origin stacktraces require symbolization to be enabled."); + + // Convert Set of Addresses to ordered list. + SmallVector AddressList(Addresses.begin(), Addresses.end()); + if (AddressList.empty()) +return; + int NumAddresses = AddressList.size(); + llvm::sort(AddressList); + + // Use llvm-symbolizer tool to symbolize the stack traces. First look for it + // alongside our binary, then in $PATH. + ErrorOr LLVMSymbolizerPathOrErr = std::error_code(); + if (const char *Path = getenv(LLVMSymbolizerPathEnv)) { +LLVMSymbolizerPathOrErr = sys::findProgramByName(Path); + } + if (!LLVMSymbolizerPathOrErr) +LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer"); + assert(!!LLVMSymbolizerPathOrErr && + "Debugify origin stacktraces require llvm-symbolizer."); + const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr; + + // Try to guess the main executable name, since we don't have argv0 available + // here. 
+ std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr); + + BumpPtrAllocator Allocator; + StringSaver StrPool(Allocator); + std::vector Modules(NumAddresses, nullptr); + std::vector Offsets(NumAddresses, 0); + if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(), + Offsets.data(), MainExecutableName.c_str(), + StrPool)) +return; + int InputFD; + SmallString<32> InputFile, OutputFile; + sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile); + sys::fs::createTemporaryFile("symbolizer-output", "", OutputFile); + FileRemover InputRemover(InputFile.c_str()); + FileRemover OutputRemover(OutputFile.c_str()); + + { +raw_fd_ostream Input(InputFD, true); +for (int i = 0; i < NumAddresses; i++) { + if (Modules[i]) +Input << Modules[i] << " " << (void *)Offsets[i] << "\n"; +} + } + + std::optional Redirects[] = {InputFile.str(), OutputFile.str(), + StringRef("")}; + StringRef Args[] = {"llvm-symbolizer", "--functions=linkage", "--inlining", +#ifdef _WIN32 + // Pass --relative-address on Windows so that we don't + // have to add ImageBase from PE file. + // FIXME: Make this the default for llvm-symbolizer. + "--relative-address", +#endif + "--demangle"}; + int RunResult = + sys::ExecuteAndWait(LLVMSymbolizerPath, Args, std::nullopt, Redirects); + if (RunResult != 0) +return; + + // This report format is based on the sanitizer stack trace printer. See + // sanitizer_stacktrace_printer.cc in compiler-rt. 
+ auto OutputBuf = MemoryBuffer::getFile(OutputFile.c_str()); + if (!OutputBuf) +return; + StringRef Output = OutputBuf.get()->getBuffer(); + SmallVector Lines; + Output.split(Lines, "\n"); + auto CurLine = Lines.begin(); + for (int i = 0; i < NumAddresses; i++) { +assert(!SymbolizedAddresses.contains(AddressList[i])); +std::string &SymbolizedAddr = SymbolizedAddresses[AddressList[i]]; +raw_string_ostream OS(SymbolizedAddr); +if (!Modules[i]) { + OS << format_ptr(AddressList[i]) << '\n'; + continue; +} +// Read pairs of lines (function name and file/line info) until we +// encounter empty line. +for (bool IsFirst = true;; IsFirst = false) { SLTozer wrote: I wouldn't exactly say so - we're iterating over the call stack, which means iterating over the real stack frames, and for each real stack frame iterating over all the inlined calls at that frame's PC, so I would say that we're iterating linearly over the set of real+inlined frames. https://github.com/llvm/llvm-project/pull/143591 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
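The loop discussed above reads the symbolizer's output as one record per queried address: pairs of lines (function name, then file/line info), repeated for each real or inlined frame at that address, and terminated by an empty line. A minimal sketch of that parsing, assuming well-formed output and using hypothetical names (`Frame`, `parseSymbolizedRecords`) that do not appear in the patch:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One symbolized frame: a function name and its "file:line:col" location.
struct Frame {
  std::string Function;
  std::string Location;
};

// Parses symbolizer-style output in which each queried address yields one
// (function, location) line pair per real-or-inlined frame, and an empty line
// ends the record for that address. Returns one frame list per address.
inline std::vector<std::vector<Frame>>
parseSymbolizedRecords(const std::string &Output) {
  std::vector<std::vector<Frame>> Records;
  std::vector<Frame> Current;
  std::istringstream In(Output);
  std::string FunctionLine;
  while (std::getline(In, FunctionLine)) {
    if (FunctionLine.empty()) {
      // An empty line closes the record for the current address.
      Records.push_back(std::move(Current));
      Current.clear();
      continue;
    }
    std::string LocationLine;
    bool HasLocation = static_cast<bool>(std::getline(In, LocationLine));
    assert(HasLocation && "function line without a matching location line");
    if (!HasLocation)
      break;
    Current.push_back(Frame{FunctionLine, LocationLine});
  }
  // Keep a trailing record that was not terminated by an empty line.
  if (!Current.empty())
    Records.push_back(std::move(Current));
  return Records;
}
```

Each inner vector then corresponds to one address, so the traversal is linear over the combined set of real and inlined frames, which is the point made in the reply above.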
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From ef6ccda96703764bbed694f910d56d8a3af27730 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 80 ++--- llvm/utils/llvm-original-di-preservation.py | 24 --- 2 files changed, 85 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 729813a92f516..01ed9de51c0b2 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,49 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedAddrs[StackTrace[Frame]]; +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ + {"metadata", "DILocation"}, + {"fn-name", FnName.str()}, + {"bb-name", BBName.str()}, + {"instr", InstName}, + {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + {"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << NameOfWrappedPass << " did not generate DILocation for " << *Instr @@ -474,11 +536,7 @@ static bool checkInstructi
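The Debugify change above defers symbolization: `collectStackAddresses` only records addresses that are not yet in `SymbolizedAddrs`, and `symbolizeStackTrace` flushes the pending set through a single `sys::symbolizeAddresses` call the first time a result is needed. A minimal sketch of that batching pattern follows; it uses standard containers and a hypothetical `AddressCache` class with a caller-supplied batch callback, in place of LLVM's `DenseMap`/`DenseSet` and the real `llvm-symbolizer` invocation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical cache illustrating "collect now, symbolize in one batch later".
class AddressCache {
public:
  using AddressSet = std::unordered_set<const void *>;
  using SymbolMap = std::unordered_map<const void *, std::string>;
  // Callback that symbolizes every pending address in one go.
  using BatchSymbolizer = std::function<void(const AddressSet &, SymbolMap &)>;

  explicit AddressCache(BatchSymbolizer S) : Symbolize(std::move(S)) {}

  // Records an address for later symbolization unless it is already known.
  void collect(const void *Addr) {
    if (!Symbolized.count(Addr))
      Pending.insert(Addr);
  }

  // Flushes pending addresses at the latest possible moment, then returns the
  // cached result for Addr.
  const std::string &lookup(const void *Addr) {
    if (!Pending.empty()) {
      Symbolize(Pending, Symbolized);
      Pending.clear();
    }
    auto It = Symbolized.find(Addr);
    assert(It != Symbolized.end() && "expected address to have been collected");
    return It->second;
  }

private:
  BatchSymbolizer Symbolize;
  SymbolMap Symbolized;
  AddressSet Pending;
};
```

Batching matters here because each flush in the patch launches an external `llvm-symbolizer` process; collecting every address first keeps that to one launch per batch rather than one per instruction.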
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 2ff6e13069844c443ce8ff5677b3930e970665cf Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index c3d0fb80354a4..1930199607204 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -142,6 +164,19 @@ namespace llvm { static inline DebugLoc getDropped() { return DebugLoc(); } #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). 
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 10fc9c1298607..1d22bdb0c3f43 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. - void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index e0f7466ceacff..47fc0ec7549e0 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143591 >From 622d1fb6df403dc9457b42c9d8f70b8004eb06a5 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 19:58:09 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses --- llvm/include/llvm/Support/Signals.h | 40 + llvm/lib/Support/Signals.cpp | 116 +++ llvm/lib/Support/Unix/Signals.inc| 15 llvm/lib/Support/Windows/Signals.inc | 5 ++ 4 files changed, 176 insertions(+) diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h index 6ce26acdd458e..a6f99d8bbdc95 100644 --- a/llvm/include/llvm/Support/Signals.h +++ b/llvm/include/llvm/Support/Signals.h @@ -14,7 +14,9 @@ #ifndef LLVM_SUPPORT_SIGNALS_H #define LLVM_SUPPORT_SIGNALS_H +#include "llvm/Config/llvm-config.h" #include "llvm/Support/Compiler.h" +#include #include #include @@ -22,6 +24,22 @@ namespace llvm { class StringRef; class raw_ostream; +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +// Typedefs that are convenient but only used by the stack-trace-collection code +// added if DebugLoc origin-tracking is enabled. +template struct DenseMapInfo; +template class DenseSet; +namespace detail { +template struct DenseMapPair; +} +template +class DenseMap; +using AddressSet = DenseSet>; +using SymbolizedAddressMap = +DenseMap, + detail::DenseMapPair>; +#endif + namespace sys { /// This function runs all the registered interrupt handlers, including the @@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash(); ///specified, the entire frame is printed. LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0); +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +#ifdef NDEBUG +#error DebugLoc origin-tracking should not be enabled in Release builds. +#endif +/// Populates the given array with a stack trace of the current program, up to +/// MaxDepth frames. Returns the number of frames returned, which will be +/// inserted into \p StackTrace from index 0. 
All entries after the returned +/// depth will be unmodified. NB: This is only intended to be used for +/// introspection of LLVM by Debugify, will not be enabled in release builds, +/// and should not be relied on for other purposes. +template +int getStackTrace(std::array &StackTrace); + +/// Takes a set of \p Addresses, symbolizes them and stores the result in the +/// provided \p SymbolizedAddresses map. +/// NB: This is only intended to be used for introspection of LLVM by +/// Debugify, will not be enabled in release builds, and should not be relied +/// on for other purposes. +void symbolizeAddresses(AddressSet &Addresses, +SymbolizedAddressMap &SymbolizedAddresses); +#endif + // Run all registered signal handlers. LLVM_ABI void RunSignalHandlers(); diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp index 9f9030e79d104..50b0d6e78ddd1 100644 --- a/llvm/lib/Support/Signals.cpp +++ b/llvm/lib/Support/Signals.cpp @@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace, return true; } +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +void sys::symbolizeAddresses(AddressSet &Addresses, + SymbolizedAddressMap &SymbolizedAddresses) { + assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) && + "Debugify origin stacktraces require symbolization to be enabled."); + + // Convert Set of Addresses to ordered list. + SmallVector AddressList(Addresses.begin(), Addresses.end()); + if (AddressList.empty()) +return; + int NumAddresses = AddressList.size(); + llvm::sort(AddressList); + + // Use llvm-symbolizer tool to symbolize the stack traces. First look for it + // alongside our binary, then in $PATH. 
+ ErrorOr LLVMSymbolizerPathOrErr = std::error_code(); + if (const char *Path = getenv(LLVMSymbolizerPathEnv)) { +LLVMSymbolizerPathOrErr = sys::findProgramByName(Path); + } + if (!LLVMSymbolizerPathOrErr) +LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer"); + assert(!!LLVMSymbolizerPathOrErr && + "Debugify origin stacktraces require llvm-symbolizer."); + const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr; + + // Try to guess the main executable name, since we don't have argv0 available + // here. + std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr); + + BumpPtrAllocator Allocator; + StringSaver StrPool(Allocator); + std::vector Modules(NumAddresses, nullptr); + std::vector Offsets(NumAddresses, 0); + if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(), + Offsets.data(), MainExecutableName.c_str(), + StrPool)) +return; + int InputFD; + SmallString<32> InputFile, OutputFile; + sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile); +
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From 2ff6e13069844c443ce8ff5677b3930e970665cf Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index c3d0fb80354a4..1930199607204 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -142,6 +164,19 @@ namespace llvm { static inline DebugLoc getDropped() { return DebugLoc(); } #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). 
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 10fc9c1298607..1d22bdb0c3f43 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. - void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index e0f7466ceacff..47fc0ec7549e0 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
SLTozer wrote: New PR: https://github.com/llvm/llvm-project/pull/146678 https://github.com/llvm/llvm-project/pull/143592
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143594
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
SLTozer wrote: > Are you planning to extend documentation Yes, and it is probably best if the documentation lands in this patch! https://github.com/llvm/llvm-project/pull/143594
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594 >From e2ff01bc95a78c4372bdf538f0433dc882c070f8 Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:02:36 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support --- llvm/lib/Transforms/Utils/Debugify.cpp | 83 ++--- llvm/utils/llvm-original-di-preservation.py | 24 +++--- 2 files changed, 88 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp index 5f70bc442d2f0..e8ed55a99546e 100644 --- a/llvm/lib/Transforms/Utils/Debugify.cpp +++ b/llvm/lib/Transforms/Utils/Debugify.cpp @@ -15,7 +15,10 @@ #include "llvm/Transforms/Utils/Debugify.h" #include "llvm/ADT/BitVector.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/Config/config.h" #include "llvm/IR/DIBuilder.h" #include "llvm/IR/DebugInfo.h" #include "llvm/IR/InstIterator.h" @@ -28,6 +31,11 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/JSON.h" #include +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// We need the Signals header to operate on stacktraces if we're using DebugLoc +// origin-tracking. +#include "llvm/Support/Signals.h" +#endif #define DEBUG_TYPE "debugify" @@ -59,6 +67,52 @@ cl::opt DebugifyLevel( raw_ostream &dbg() { return Quiet ? nulls() : errs(); } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +// These maps refer to addresses in this instance of LLVM, so we can reuse them +// everywhere - therefore, we store them at file scope. +static DenseMap> SymbolizedAddrs; +static DenseSet UnsymbolizedAddrs; + +std::string symbolizeStackTrace(const Instruction *I) { + // We flush the set of unsymbolized addresses at the latest possible moment, + // i.e. now. 
+ if (!UnsymbolizedAddrs.empty()) { +sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs); +UnsymbolizedAddrs.clear(); + } + auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces(); + std::string Result; + raw_string_ostream OS(Result); + for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) { +if (TraceIdx != 0) + OS << "\n"; +auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx]; +unsigned VirtualFrameNo = 0; +for (int Frame = 0; Frame < Depth; ++Frame) { + assert(SymbolizedAddrs.contains(StackTrace[Frame]) && + "Expected each address to have been symbolized."); + for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) { +OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2) + << ' ' << SymbolizedFrame << '\n'; + } +} + } + return Result; +} +void collectStackAddresses(Instruction &I) { + auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces(); + for (auto &[Depth, StackTrace] : OriginStackTraces) { +for (int Frame = 0; Frame < Depth; ++Frame) { + void *Addr = StackTrace[Frame]; + if (!SymbolizedAddrs.contains(Addr)) +UnsymbolizedAddrs.insert(Addr); +} + } +} +#else +void collectStackAddresses(Instruction &I) {} +#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + uint64_t getAllocSizeInBits(Module &M, Type *Ty) { return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0; } @@ -375,6 +429,8 @@ bool llvm::collectDebugInfoMetadata(Module &M, LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n'); DebugInfoBeforePass.InstToDelete.insert({&I, &I}); +// Track the addresses to symbolize, if the feature is enabled. +collectStackAddresses(I); DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)}); } } @@ -450,14 +506,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore, auto BBName = BB->hasName() ? 
BB->getName() : "no-name"; auto InstName = Instruction::getOpcodeName(Instr->getOpcode()); +auto CreateJSONBugEntry = [&](const char *Action) { + Bugs.push_back(llvm::json::Object({ + {"metadata", "DILocation"}, + {"fn-name", FnName.str()}, + {"bb-name", BBName.str()}, + {"instr", InstName}, + {"action", Action}, +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + {"origin", symbolizeStackTrace(Instr)}, +#endif + })); +}; + auto InstrIt = DILocsBefore.find(Instr); if (InstrIt == DILocsBefore.end()) { if (ShouldWriteIntoJSON) -Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"}, - {"fn-name", FnName.str()}, - {"bb-name", BBName.str()}, - {"instr", InstName}, - {"action", "not-generate"}})); +CreateJSONBugEntry("not-generate"); else dbg() << "WARNING: " << N
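The Debugify change above defers symbolization: addresses are gathered into a set while instructions are visited, and the whole set is symbolized in one batch the first time a symbolized trace is actually needed. A minimal sketch of that pattern follows, assuming nothing about the real `sys::symbolizeAddresses`; `symbolizeBatch`, `collect`, `lookup`, and `BatchCalls` are hypothetical names, and the "symbolizer" just produces placeholder strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Addr = const void *;
using SymbolMap = std::map<Addr, std::vector<std::string>>;

// File-scope caches, mirroring the idea of SymbolizedAddrs/UnsymbolizedAddrs.
static SymbolMap Symbolized;
static std::set<Addr> Pending;
static int BatchCalls = 0; // Counts batch runs so the deferral is observable.

// Hypothetical stand-in for an expensive batch symbolizer: it only writes
// placeholder frame names, it does not inspect the addresses.
void symbolizeBatch(const std::set<Addr> &In, SymbolMap &Out) {
  ++BatchCalls;
  int Index = 0;
  for (Addr A : In)
    Out[A] = {"frame_" + std::to_string(Index++)};
}

// Cheap step run for every instruction: remember addresses not yet symbolized.
void collect(Addr A) {
  if (!Symbolized.count(A))
    Pending.insert(A);
}

// Expensive step run only when a result is needed: flush the pending set
// once, then answer from the cache. Throws std::out_of_range if the address
// was never collected.
const std::vector<std::string> &lookup(Addr A) {
  if (!Pending.empty()) {
    symbolizeBatch(Pending, Symbolized);
    Pending.clear();
  }
  return Symbolized.at(A);
}
```

The point of the split is that the cheap `collect` step can run for every instruction, while the costly batch runs at most once per set of new addresses.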
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592 >From fb65cb7f043586eb6808b229fd1ad018ffd7571d Mon Sep 17 00:00:00 2001 From: Stephen Tozer Date: Tue, 10 Jun 2025 20:00:51 +0100 Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation --- llvm/include/llvm/IR/DebugLoc.h| 49 +- llvm/include/llvm/IR/Instruction.h | 2 +- llvm/lib/CodeGen/BranchFolding.cpp | 7 + llvm/lib/IR/DebugLoc.cpp | 22 +- 4 files changed, 71 insertions(+), 9 deletions(-) diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h index 999e03b6374a5..6d79aa6b2aa01 100644 --- a/llvm/include/llvm/IR/DebugLoc.h +++ b/llvm/include/llvm/IR/DebugLoc.h @@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector>, 0>; +StackTracesTy StackTraces; +DbgLocOrigin(bool ShouldCollectTrace); +void addTrace(); +const StackTracesTy &getOriginStackTraces() const { return StackTraces; }; + }; +#else + struct DbgLocOrigin { +DbgLocOrigin(bool) {} + }; +#endif // Used to represent different "kinds" of DebugLoc, expressing that the // instruction it is part of is either normal and should contain a valid // DILocation, or otherwise describing the reason why the instruction does @@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. 
has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { public: DebugLocKind Kind; // Default constructor for empty DebugLocs. DILocAndCoverageTracking() -: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {} -// Valid or nullptr MDNode*, normal DebugLocKind. +: TrackingMDNodeRef(nullptr), DbgLocOrigin(true), + Kind(DebugLocKind::Normal) {} +// Valid or nullptr MDNode*, no annotative DebugLocKind. DILocAndCoverageTracking(const MDNode *Loc) -: TrackingMDNodeRef(const_cast(Loc)), +: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc), Kind(DebugLocKind::Normal) {} LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc); // Explicit DebugLocKind, which always means a nullptr MDNode*. DILocAndCoverageTracking(DebugLocKind Kind) -: TrackingMDNodeRef(nullptr), Kind(Kind) {} +: TrackingMDNodeRef(nullptr), + DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {} }; template <> struct simplify_type { using SimpleType = MDNode *; @@ -187,6 +209,19 @@ namespace llvm { #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE } +#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN +const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const { + return Loc.getOriginStackTraces(); +} +DebugLoc getCopied() const { + DebugLoc NewDL = *this; + NewDL.Loc.addTrace(); + return NewDL; +} +#else +DebugLoc getCopied() const { return *this; } +#endif + /// Get the underlying \a DILocation. /// /// \pre !*this or \c isa(getAsMDNode()). diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h index 8e1ef24226789..ef382a9168f24 100644 --- a/llvm/include/llvm/IR/Instruction.h +++ b/llvm/include/llvm/IR/Instruction.h @@ -507,7 +507,7 @@ class Instruction : public User, LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const; /// Set the debug location information for this instruction. 
- void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); } + void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); } /// Return the debug location for this node as a DebugLoc. const DebugLoc &getDebugLoc() const { return DbgLoc; } diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp index ff9f0ff5d5bc3..3b3e7a418feb5 100644 --- a/llvm/lib/CodeGen/BranchFolding.cpp +++ b/llvm/lib/CodeGen/BranchFolding.cpp @@ -42,6 +42,7 @@ #include "llvm/CodeGen/TargetPassConfig.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/CodeGen/TargetSubtargetInfo.h" +#include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfoM
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
SLTozer wrote: I clicked the wrong button and accidentally merged the wrong branch (fortunately, it only merged into another PR branch, not main). I will open a new PR shortly, as GitHub apparently won't allow me to reopen this one in place! https://github.com/llvm/llvm-project/pull/143592
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
@@ -27,6 +27,21 @@ namespace llvm { class Function; #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING +#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING + struct DbgLocOrigin { +static constexpr unsigned long MaxDepth = 16; +using StackTracesTy = +SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>; SLTozer wrote: Most of the time we store 0 stacktraces, but when we do store a stacktrace there may be any number of them. The `addTrace` function adds a new stacktrace to the vector, and is used whenever the DebugLoc is "transferred", so that we can track the motion of a missing debug location through the program. https://github.com/llvm/llvm-project/pull/143592
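To illustrate the "zero traces usually, any number once tracking starts" behaviour described in this comment, here is a small sketch. It is not the patch's implementation: `OriginSketch`, `LocSketch`, and `fakeTrace` are hypothetical, the trace capture is faked with a zeroed record so the example runs anywhere, and the early return in `addTrace` for locations that never collected a trace is an assumption rather than something shown in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for DbgLocOrigin.
struct OriginSketch {
  static constexpr std::size_t MaxDepth = 16;
  // One entry per trace: (number of valid frames, frame addresses).
  using Trace = std::pair<int, std::array<void *, MaxDepth>>;
  std::vector<Trace> StackTraces;

  explicit OriginSketch(bool ShouldCollectTrace) {
    if (ShouldCollectTrace)
      StackTraces.push_back(fakeTrace());
  }

  // Record one more trace each time the location is transferred.
  // Assumption: a location that never collected a trace stays trace-free.
  void addTrace() {
    if (StackTraces.empty())
      return;
    StackTraces.push_back(fakeTrace());
  }

  // A real implementation would capture the call stack here; this returns a
  // zero-depth, zero-filled record instead.
  static Trace fakeTrace() { return Trace{}; }
};

// Hypothetical stand-in for a DebugLoc: only a null node collects an origin.
struct LocSketch {
  const void *Node;
  OriginSketch Origin;

  explicit LocSketch(const void *N) : Node(N), Origin(N == nullptr) {}

  // Mirrors the idea of getCopied(): the copy carries one extra trace that
  // marks the site where the location was handed on.
  LocSketch getCopied() const {
    LocSketch Copy = *this;
    Copy.Origin.addTrace();
    return Copy;
  }
};
```

Under these assumptions, a location built from a null node starts with one trace and gains one per `getCopied()` call, while a location with a valid node never stores any.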
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
@@ -55,22 +70,29 @@ namespace llvm { Temporary }; - // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify - // to ignore intentionally-empty DebugLocs. - class DILocAndCoverageTracking : public TrackingMDNodeRef { + // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin, + // allowing Debugify to ignore intentionally-empty DebugLocs and display the + // code responsible for generating unintentionally-empty DebugLocs. + // Currently we only need to track the Origin of this DILoc when using a + // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a + // null DILocation, so only collect the origin stacktrace in those cases. + class DILocAndCoverageTracking : public TrackingMDNodeRef, + public DbgLocOrigin { SLTozer wrote: We could manage without it, but the reason I use inheritance here is that this allows `DILocAndCoverageTracking` to automatically have the same public functions as `DbgLocOrigin`, meaning we conditionally have the origin-tracking functions enabled without having to repeat ourselves here. https://github.com/llvm/llvm-project/pull/143592
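The inheritance argument in this comment can be shown with a reduced example: the base class has two definitions selected by a preprocessor switch, and the derived class picks up whichever public interface is active without its own `#if` blocks. Everything here is illustrative; `SKETCH_ORIGIN_TRACKING`, `OriginBase`, `NodeRef`, and `TrackedLoc` are hypothetical names, and traces are reduced to plain integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical feature switch standing in for a build-configuration macro.
#ifndef SKETCH_ORIGIN_TRACKING
#define SKETCH_ORIGIN_TRACKING 1
#endif

#if SKETCH_ORIGIN_TRACKING
// Full version: stores data and exposes an accessor.
struct OriginBase {
  std::vector<int> Traces;
  explicit OriginBase(bool ShouldCollect) {
    if (ShouldCollect)
      Traces.push_back(0); // Placeholder for a captured trace.
  }
  const std::vector<int> &getOriginStackTraces() const { return Traces; }
};
#else
// Disabled version: same constructor shape, no data, no accessor.
struct OriginBase {
  explicit OriginBase(bool) {}
};
#endif

// Hypothetical stand-in for the tracking reference base class.
struct NodeRef {
  const void *Node;
  explicit NodeRef(const void *N) : Node(N) {}
};

// The derived class is written once. With the switch on it inherits
// getOriginStackTraces(); with the switch off the base contributes no
// members and the accessor simply does not exist.
class TrackedLoc : public NodeRef, public OriginBase {
public:
  explicit TrackedLoc(const void *N) : NodeRef(N), OriginBase(N == nullptr) {}
};
```

With composition instead, `TrackedLoc` would need its own conditionally compiled forwarding functions, which is the repetition the comment says inheritance avoids.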
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)
https://github.com/SLTozer closed https://github.com/llvm/llvm-project/pull/143592