[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Simplify spliceDebugInfo, fixing splice-to-end edge case (#105670) (PR #106690)

2024-08-30 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.

This is a straightforward bugfix.

https://github.com/llvm/llvm-project/pull/106690
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-08 Thread Stephen Tozer via llvm-branch-commits


@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
     bool SameCol = L1->getColumn() == L2->getColumn();
     unsigned Line = SameLine ? L1->getLine() : 0;
     unsigned Col = SameLine && SameCol ? L1->getColumn() : 0;
-
-    return DILocation::get(C, Line, Col, Scope, InlinedAt);
+    bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode();
+    uint64_t Group = 0;
+    uint64_t Rank = 0;
+    if (SameLine) {
+      if (L1->getAtomGroup() || L2->getAtomGroup()) {
+        // If we're preserving the same matching inlined-at field we can
+        // preserve the atom.
+        if (LocBIA == LocAIA && InlinedAt == LocBIA) {
+          // Deterministically keep the lowest non-zero ranking atom group
+          // number.
+          // FIXME: It would be nice if we could track that an instruction
+          // belongs to two source atoms.
+          bool UseL1Atom = [L1, L2]() {
+            if (L1->getAtomRank() == L2->getAtomRank()) {
+              // Arbitrarily choose the lowest non-zero group number.
+              if (!L1->getAtomGroup() || !L2->getAtomGroup())
+                return !L2->getAtomGroup();
+              return L1->getAtomGroup() < L2->getAtomGroup();
+            }
+            // Choose the lowest non-zero rank.
+            if (!L1->getAtomRank() || !L2->getAtomRank())
+              return !L2->getAtomRank();
+            return L1->getAtomRank() < L2->getAtomRank();
+          }();
+          Group = UseL1Atom ? L1->getAtomGroup() : L2->getAtomGroup();
+          Rank = UseL1Atom ? L1->getAtomRank() : L2->getAtomRank();
+        } else {
+          // If either instruction is part of a source atom, reassign it a new
+          // atom group. This essentially regresses to non-key-instructions
+          // behaviour (now that it's the only instruction in its group it'll
+          // probably get is_stmt applied).
+          Group = C.incNextAtomGroup();
+          Rank = 1;

SLTozer wrote:

Is this necessary? Since we use `inlinedAt` as part of the tuple alongside 
`atomGroup`, keeping the group the same would still result in the merged 
instruction becoming part of a distinct "group" (with `is_stmt` likely 
applying). Likewise, since we're creating a new group it sounds to me like it 
would be unnecessary to change the rank?
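
For readers following the selection rule in the quoted hunk, here is a minimal standalone sketch of it: keep the atom with the lowest non-zero rank, and break ties with the lowest non-zero group. The `Atom` struct and the `preferFirst`/`mergeAtoms` helpers are hypothetical stand-ins for illustration, not LLVM API; the comparison itself mirrors the lambda in the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the (atomGroup, atomRank) pair carried by a
// DILocation. Zero means "no atom group" / "no rank".
struct Atom {
  uint64_t Group;
  uint64_t Rank;
};

// Returns true if A should be kept over B. Mirrors the lambda in the quoted
// hunk: lowest non-zero rank wins; equal ranks fall back to lowest non-zero
// group.
bool preferFirst(const Atom &A, const Atom &B) {
  if (A.Rank == B.Rank) {
    if (!A.Group || !B.Group)
      return !B.Group; // Keep A only if B has no group.
    return A.Group < B.Group;
  }
  if (!A.Rank || !B.Rank)
    return !B.Rank; // Keep A only if B has no rank.
  return A.Rank < B.Rank;
}

// Hypothetical helper: the atom that survives a merge of two locations that
// share the same line and inlined-at.
Atom mergeAtoms(const Atom &A, const Atom &B) {
  return preferFirst(A, B) ? A : B;
}
```

Under this sketch, merging `{3, 2}` with `{5, 1}` keeps group 5 (rank 1 beats rank 2), and merging an atom-less location with `{4, 1}` keeps group 4.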

https://github.com/llvm/llvm-project/pull/133480


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/133485


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits


@@ -2145,6 +2145,13 @@ class DILocation : public MDNode {
 return 0;
   }
 
+  const DILocation *getOrCloneWithoutAtom() const {

SLTozer wrote:

I think this could just be "getWithoutAtom", it's already implied with 
DIMetadata that "get" means "find me an existing metadata or create a new one".
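
The naming point rests on the "get means find-or-create" convention for uniqued metadata. A minimal sketch of that convention, under loud assumptions: `Loc`, `Context`, and this `getWithoutAtom` are hypothetical types written for illustration and are not the LLVM classes or their signatures.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical, heavily simplified location node.
struct Loc {
  unsigned Line;
  unsigned Col;
  uint64_t AtomGroup;
};

// Hypothetical owner of uniqued nodes: at most one node exists per key.
class Context {
  using Key = std::tuple<unsigned, unsigned, uint64_t>;
  std::map<Key, std::unique_ptr<Loc>> Uniqued;

public:
  // "get" finds an existing node with these fields or creates a new one.
  const Loc *get(unsigned Line, unsigned Col, uint64_t AtomGroup) {
    Key K{Line, Col, AtomGroup};
    auto It = Uniqued.find(K);
    if (It != Uniqued.end())
      return It->second.get();
    auto Inserted =
        Uniqued.emplace(K, std::make_unique<Loc>(Loc{Line, Col, AtomGroup}));
    return Inserted.first->second.get();
  }

  // Returns L itself when it has no atom, otherwise the uniqued atom-less
  // equivalent. No separate "clone" step is visible to the caller.
  const Loc *getWithoutAtom(const Loc *L) {
    if (!L->AtomGroup)
      return L;
    return get(L->Line, L->Col, 0);
  }

  std::size_t size() const { return Uniqued.size(); }
};
```

Because the find-or-create step is already folded into `get`, a caller of `getWithoutAtom` cannot tell whether a node was reused or created, which is why "OrClone" adds no information to the name.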

https://github.com/llvm/llvm-project/pull/133485


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.

I think conceptually there is some space for copying atom group/rank to the 
inlined instructions, giving the instruction(s) that produce the return value 
(if any) the highest precedence. This would be a separate feature however, and 
this behaviour seems fine to me as a first pass; minor inline comment.

https://github.com/llvm/llvm-project/pull/133485


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits


@@ -1182,6 +1187,19 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
   U.set(NewBonusInst);
 }
   }
+
+  // Key Instructions: We may have propagated atom info into the pred. If the
+  // pred's terminator already has atom info do nothing as merging would drop
+  // one atom group anyway. If it doesn't, propagte the remapped atom group

SLTozer wrote:

```suggestion
  // one atom group anyway. If it doesn't, propagate the remapped atom group
```

https://github.com/llvm/llvm-project/pull/133482


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits


@@ -3609,11 +3609,11 @@ foldCondBranchOnValueKnownInPredecessorImpl(BranchInst *BI, DomTreeUpdater *DTU,
         N->setName(BBI->getName() + ".c");
 
       // Update operands due to translation.
-      for (Use &Op : N->operands()) {
-        DenseMap<Value *, Value *>::iterator PI = TranslateMap.find(Op);
-        if (PI != TranslateMap.end())
-          Op = PI->second;
-      }
+      // Key Instructions: Remap all the atom groups.
+      if (const DebugLoc &DL = BBI->getDebugLoc())
+        mapAtomInstance(DL, TranslateMap);
+      RemapInstruction(N, TranslateMap,
+                       RF_IgnoreMissingLocals | RF_NoModuleLevelChanges);

SLTozer wrote:

If I understand right, `RemapInstruction` with these operands will never create 
a new value mapping, only ever return an existing mapping - is that correct?
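
The property being asked about is "lookup-only" remapping: operands with an entry in the map are replaced, operands without one are left alone, and the map itself never grows. The sketch below shows that property in isolation, mirroring the removed `for` loop. It uses plain integers as hypothetical stand-ins for values and does not confirm how `RemapInstruction` behaves with these flags.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical stand-in: values are plain ints instead of llvm::Value*.
using TranslateMapTy = std::unordered_map<int, int>;

// Replaces every operand that has an existing mapping and leaves the rest
// untouched. The map is taken by const reference and queried with find(), so
// no new mapping can be created as a side effect.
void remapOperands(std::vector<int> &Operands,
                   const TranslateMapTy &TranslateMap) {
  for (int &Op : Operands) {
    auto It = TranslateMap.find(Op);
    if (It != TranslateMap.end())
      Op = It->second;
  }
}
```

Using `operator[]` instead of `find()` would silently insert a default entry for every missing operand, which is exactly the kind of side effect the question is checking for.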

https://github.com/llvm/llvm-project/pull/133484


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/133484


[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)

2025-04-08 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/133481


[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)

2025-04-08 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.

LGTM, besides a couple inline comments.

https://github.com/llvm/llvm-project/pull/133481


[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)

2025-04-08 Thread Stephen Tozer via llvm-branch-commits


@@ -0,0 +1,53 @@
+; RUN: opt %s -passes=inline -S -o - | FileCheck %s
+
+;; Inline `f` into `g`. The inlined assignment store and add should retain
+;; their atom info.
+
+; CHECK: _Z1gi
+; CHECK-NOT: _Z1fi
+; CHECK: %add.i = add nsw i32 %mul.i, 1, !dbg [[G1R2:!.*]]
+; CHECK-NEXT: store i32 %add.i, ptr %x.i, align 4, !dbg [[G1R1:!.*]]
+
+; CHECK: [[G1R2]] = !DILocation({{.*}}, atomGroup: 1, atomRank: 2)
+; CHECK: [[G1R1]] = !DILocation({{.*}}, atomGroup: 1, atomRank: 1)
+
+define hidden void @_Z1fi(i32 noundef %a) !dbg !11 {
+entry:
+  %a.addr = alloca i32, align 4
+  %x = alloca i32, align 4
+  store i32 %a, ptr %a.addr, align 4
+  %0 = load i32, ptr %a.addr, align 4, !dbg !18
+  %mul = mul nsw i32 %0, 2, !dbg !18
+  %add = add nsw i32 %mul, 1, !dbg !19
+  store i32 %add, ptr %x, align 4, !dbg !20
+  ret void, !dbg !22
+}
+
+define hidden void @_Z1gi(i32 noundef %b) !dbg !23 {
+entry:
+  %b.addr = alloca i32, align 4
+  store i32 %b, ptr %b.addr, align 4
+  %0 = load i32, ptr %b.addr, align 4, !dbg !24
+  call void @_Z1fi(i32 noundef %0), !dbg !24
+  ret void, !dbg !25

SLTozer wrote:

Nit, could remove DILocations from the instructions that aren't relevant to the 
test.

https://github.com/llvm/llvm-project/pull/133481


[llvm-branch-commits] [llvm] [KeyInstr] Inline atom info (PR #133481)

2025-04-08 Thread Stephen Tozer via llvm-branch-commits


@@ -1813,7 +1813,8 @@ static DebugLoc inlineDebugLoc(DebugLoc OrigDL, DILocation *InlinedAt,
                                DenseMap<const MDNode *, MDNode *> &IANodes) {
   auto IA = DebugLoc::appendInlinedAt(OrigDL, InlinedAt, Ctx, IANodes);
   return DILocation::get(Ctx, OrigDL.getLine(), OrigDL.getCol(),
-                         OrigDL.getScope(), IA);
+                         OrigDL.getScope(), IA, OrigDL.isImplicitCode(),

SLTozer wrote:

Similar to my comment on a different review, propagating `OrigDL`'s 
`IsImplicitCode` field is a change in behaviour.

https://github.com/llvm/llvm-project/pull/133481


[llvm-branch-commits] [llvm] [KeyInstr][debugify] Add --debugify-atoms to add key instructions metadata (PR #133483)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.


https://github.com/llvm/llvm-project/pull/133483


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.

Some minor nits, but this update looks correct.

https://github.com/llvm/llvm-project/pull/133482


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/133482


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-09 Thread Stephen Tozer via llvm-branch-commits


@@ -1182,6 +1187,19 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
   U.set(NewBonusInst);
 }
   }
+
+  // Key Instructions: We may have propagated atom info into the pred. If the
+  // pred's terminator already has atom info do nothing as merging would drop
+  // one atom group anyway. If it doesn't, propagte the remapped atom group
+  // from BB's terminator.
+  if (auto &PredDL = PredBlock->getTerminator()->getDebugLoc()) {

SLTozer wrote:

```suggestion
  if (auto &PredDL = PTI->getDebugLoc()) {
```
If I understand it, `PTI` is still `PredBlock`'s terminator?

https://github.com/llvm/llvm-project/pull/133482


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-05-06 Thread Stephen Tozer via llvm-branch-commits


@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
     bool SameCol = L1->getColumn() == L2->getColumn();
     unsigned Line = SameLine ? L1->getLine() : 0;
     unsigned Col = SameLine && SameCol ? L1->getColumn() : 0;
-
-    return DILocation::get(C, Line, Col, Scope, InlinedAt);
+    bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode();
+    uint64_t Group = 0;
+    uint64_t Rank = 0;
+    if (SameLine) {
+      if (L1->getAtomGroup() || L2->getAtomGroup()) {
+        // If we're preserving the same matching inlined-at field we can
+        // preserve the atom.
+        if (LocBIA == LocAIA && InlinedAt == LocBIA) {
+          // Deterministically keep the lowest non-zero ranking atom group
+          // number.
+          // FIXME: It would be nice if we could track that an instruction
+          // belongs to two source atoms.
+          bool UseL1Atom = [L1, L2]() {
+            if (L1->getAtomRank() == L2->getAtomRank()) {
+              // Arbitrarily choose the lowest non-zero group number.
+              if (!L1->getAtomGroup() || !L2->getAtomGroup())
+                return !L2->getAtomGroup();
+              return L1->getAtomGroup() < L2->getAtomGroup();
+            }
+            // Choose the lowest non-zero rank.
+            if (!L1->getAtomRank() || !L2->getAtomRank())
+              return !L2->getAtomRank();
+            return L1->getAtomRank() < L2->getAtomRank();
+          }();
+          Group = UseL1Atom ? L1->getAtomGroup() : L2->getAtomGroup();
+          Rank = UseL1Atom ? L1->getAtomRank() : L2->getAtomRank();
+        } else {
+          // If either instruction is part of a source atom, reassign it a new
+          // atom group. This essentially regresses to non-key-instructions
+          // behaviour (now that it's the only instruction in its group it'll
+          // probably get is_stmt applied).
+          Group = C.incNextAtomGroup();
+          Rank = 1;

SLTozer wrote:

This makes sense - although I still suspect there's some form of optimization 
we could do here (isolating the cases where atomGroups need to change), better 
to go with what definitely works here!

https://github.com/llvm/llvm-project/pull/133480


[llvm-branch-commits] [llvm] Propagate DebugLocs on phis in BreakCriticalEdges (PR #133492)

2025-05-06 Thread Stephen Tozer via llvm-branch-commits

SLTozer wrote:

> I'm also not 100% sure if there's a good "policy" in place for PHI debug locs 
> (paging @SLTozer)

In most cases we do not set debug locs on PHI nodes or expect them to have 
debug locs, but there are some cases where we explicitly set/check them - most 
often in loop optimizations, where PHIs may have source locations relevant to 
the loop induction variable, and in InstCombine where we perform 
transformations between ordinary instructions and PHIs. I don't _think_ we have 
a well-defined policy in place, but since they're sometimes useful it's a good 
rule-of-thumb to propagate them if doing so makes sense.

https://github.com/llvm/llvm-project/pull/133492


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-10 Thread Stephen Tozer via llvm-branch-commits


@@ -189,11 +189,15 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
 
   // Merge the two locations if possible, using the supplied
   // inlined-at location for the created location.
-  auto MergeLocPair = [&C](const DILocation *L1, const DILocation *L2,
-                           DILocation *InlinedAt) -> DILocation * {
+  auto *LocAIA = LocA->getInlinedAt();
+  auto *LocBIA = LocB->getInlinedAt();
+  auto MergeLocPair = [&C, LocAIA,
+                       LocBIA](const DILocation *L1, const DILocation *L2,
+                               DILocation *InlinedAt) -> DILocation * {
     if (L1 == L2)
       return DILocation::get(C, L1->getLine(), L1->getColumn(), L1->getScope(),
-                             InlinedAt);
+                             InlinedAt, L1->isImplicitCode(),

SLTozer wrote:

Technically copying `L1->isImplicitCode()` here is a change in behaviour - 
normally that flag would be effectively dropped here.

https://github.com/llvm/llvm-project/pull/133480


[llvm-branch-commits] [clang] [KeyInstr][Clang] Assign matrix element atom (PR #134650)

2025-05-27 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.


https://github.com/llvm/llvm-project/pull/134650


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-20 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

From afeb26be5f099d384115a55b19707bbb2a730245 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 83 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 +++---
 2 files changed, 88 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index c2dbdc57eb3b5..460b5e50e42d7 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include <optional>
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,52 @@ cl::opt<Level> DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, SmallVector<std::string, 0>> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+unsigned VirtualFrameNo = 0;
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) {
+OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2)
+  << ' ' << SymbolizedFrame << '\n';
+  }
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -373,6 +427,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -448,14 +504,23 @@ static bool checkInstructions(const DebugInstMap 
&DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << N

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-20 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143592

From 4410b5f351cad4cd611cbc773337197d5fa367b8 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h| 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp   | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index 999e03b6374a5..6d79aa6b2aa01 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -187,6 +209,19 @@ namespace llvm {
 #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
 }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 8e1ef24226789..ef382a9168f24 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index ff9f0ff5d5bc3..3b3e7a418feb5 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoM

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-20 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From afeb26be5f099d384115a55b19707bbb2a730245 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 83 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 +++---
 2 files changed, 88 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index c2dbdc57eb3b5..460b5e50e42d7 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,52 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, SmallVector<std::string, 0>> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+    sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+    UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+    if (TraceIdx != 0)
+      OS << "\n";
+    auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+    unsigned VirtualFrameNo = 0;
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+             "Expected each address to have been symbolized.");
+      for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) {
+        OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2)
+          << ' ' << SymbolizedFrame << '\n';
+      }
+    }
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      void *Addr = StackTrace[Frame];
+      if (!SymbolizedAddrs.contains(Addr))
+        UnsymbolizedAddrs.insert(Addr);
+    }
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -373,6 +427,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -448,14 +504,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << N
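The Debugify hunks above record raw stack addresses while instructions are scanned and postpone symbolization until a report entry is actually written, so the expensive step runs once per batch. A minimal sketch of that collect-then-flush pattern in standard C++ follows. Everything in it is an assumption for illustration: `fakeSymbolize` is a hypothetical stand-in for `sys::symbolizeAddresses`, and the standard containers replace LLVM's `DenseMap` and `DenseSet`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical address type and caches; not LLVM's actual declarations.
using Addr = std::uintptr_t;
static int BatchCalls = 0; // counts how often the expensive step runs
static std::unordered_map<Addr, std::string> Symbolized;
static std::unordered_set<Addr> Pending;

// Hypothetical batch symbolizer: every address simply becomes a label.
void fakeSymbolize(const std::unordered_set<Addr> &In,
                   std::unordered_map<Addr, std::string> &Out) {
  ++BatchCalls;
  for (Addr A : In)
    Out[A] = "frame@" + std::to_string(A);
}

// Cheap step: remember addresses that have not been resolved yet.
void collect(const std::vector<Addr> &Trace) {
  for (Addr A : Trace)
    if (!Symbolized.count(A))
      Pending.insert(A);
}

// Expensive step, deferred until a trace is actually rendered.
std::string render(const std::vector<Addr> &Trace) {
  if (!Pending.empty()) {
    fakeSymbolize(Pending, Symbolized);
    Pending.clear();
  }
  std::string Result;
  for (std::size_t I = 0; I < Trace.size(); ++I) {
    auto It = Symbolized.find(Trace[I]);
    assert(It != Symbolized.end() && "address was never collected");
    Result += "#" + std::to_string(I) + " " + It->second + "\n";
  }
  return Result;
}
```

In this sketch, calling `collect` for many traces and then `render` for any of them triggers a single `fakeSymbolize` call; later `render` calls reuse the cache.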

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-06-09 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [KeyInstr][Clang] Coerced store atoms (PR #134653)

2025-05-30 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer approved this pull request.


https://github.com/llvm/llvm-project/pull/134653
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect symbolized stack traces (PR #143591)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143591
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer created 
https://github.com/llvm/llvm-project/pull/143593

None

>From c6f681d4eb307ca5f8859b3e4e7605fc2fa8441c Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:21 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in
 LLVM

---
 llvm/include/llvm/IR/Instruction.h | 2 +-
 llvm/lib/CodeGen/BranchFolding.cpp | 7 +++
 llvm/lib/IR/Instruction.cpp| 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,
 
   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif
 
   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
   setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-setDebugLoc(SrcInst.getDebugLoc());
+setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }
 
 Instruction *Instruction::clone() const {

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
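
The BranchFolding.cpp hunk in the patch above replaces `array_pod_sort` with `std::sort` because, per its comment, the element type is no longer POD once origin-tracking is enabled. `array_pod_sort` is built on `qsort`, which relocates elements as raw bytes, so it is only appropriate for POD-like types. A minimal sketch of the distinction follows; `PlainElt`, `TrackedElt`, and `sortByHash` are hypothetical names for illustration, not LLVM's types.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical element that stays trivially copyable: safe to move bytewise.
struct PlainElt {
  unsigned Hash;
  bool operator<(const PlainElt &O) const { return Hash < O.Hash; }
};

// Hypothetical element that owns heap storage, standing in for a type that
// carries collected stack traces. It is no longer trivially copyable.
struct TrackedElt {
  unsigned Hash;
  std::vector<void *> Trace;
  bool operator<(const TrackedElt &O) const { return Hash < O.Hash; }
};

static_assert(std::is_trivially_copyable<PlainElt>::value,
              "bytewise sorts are fine for this type");
static_assert(!std::is_trivially_copyable<TrackedElt>::value,
              "this type needs a sort that uses move/copy operations");

// std::sort relocates elements through their own move operations, so it is
// valid for both kinds of element.
template <typename T> void sortByHash(std::vector<T> &Elts) {
  std::sort(Elts.begin(), Elts.end());
  assert(std::is_sorted(Elts.begin(), Elts.end()));
}
```

Guarding the choice of sort with the same preprocessor flag, as the patch does, keeps the `qsort`-based path for builds where the element type is still POD.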


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer created 
https://github.com/llvm/llvm-project/pull/143594

None

>From 4786afd40d73ade22952ca43af1164c6f9545679 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 77 ++---
 llvm/utils/llvm-original-di-preservation.py | 22 +++---
 2 files changed, 80 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..a9a66baf5571f 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, std::string> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+    sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+    UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+    if (TraceIdx != 0)
+      OS << "\n";
+    auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+             "Expected each address to have been symbolized.");
+      OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+         << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+    }
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      void *Addr = StackTrace[Frame];
+      if (!SymbolizedAddrs.contains(Addr))
+        UnsymbolizedAddrs.insert(Addr);
+    }
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,20 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+{"metadata", "DILocation"}, {"fn-name", FnName.str()},
+{"bb-name", BBName.str()}, {"instr", InstName}, {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+{"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +533,7 @@ static bool checkInstructions(const DebugInstMap 
&D

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: SymbolizeAddresses (PR #143591)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer created 
https://github.com/llvm/llvm-project/pull/143591

None

>From d10a102637f2dcb215039df2cb248131c6a715ce Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include 
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
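+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template <typename T, typename Enable> struct DenseMapInfo;
+template <typename ValueT, typename ValueInfoT> class DenseSet;
+namespace detail {
+template <typename KeyT, typename ValueT> struct DenseMapPair;
+}
+template <typename KeyT, typename ValueT, typename KeyInfoT, typename BucketT>
+class DenseMap;
+using AddressSet = DenseSet<void *, DenseMapInfo<void *, void>>;
+using SymbolizedAddressMap =
+    DenseMap<void *, std::string, DenseMapInfo<void *, void>,
+             detail::DenseMapPair<void *, std::string>>;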
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template <unsigned long MaxDepth>
+int getStackTrace(std::array<void *, MaxDepth> &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+                             SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+         "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector<void *, 0> AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+    return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr<std::string> LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+    LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+    LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+         "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector<const char *> Modules(NumAddresses, nullptr);
+  std::vector<intptr_t> Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+                             Offsets.data(), MainExecutableName.c_str(),
+                             StrPool))
+    return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFil

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Core implementation (PR #143592)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer created 
https://github.com/llvm/llvm-project/pull/143592

None

>From 8ff21d6e7630b0407931712eb652e0416ce661d8 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h | 62 +
 llvm/lib/IR/DebugLoc.cpp| 22 +++-
 2 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..bc890dd671a81 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -142,6 +164,32 @@ namespace llvm {
 static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+DebugLoc(DebugLocKind Kind) : Loc(Kind) {}
+DebugLocKind getKind() const { return Loc.Kind; }
+#endif
+
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#if !LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#error Cannot enable DebugLoc origin-tracking without coverage-tracking!
+#endif
+
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
+static DebugLoc getTemporary();
+static DebugLoc getUnknown();
+static DebugLoc getLineZero();
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp
index 0e65ddcec8934..05aad5d393547 100644
--- a/llvm/lib/IR/DebugLoc.cpp
+++ b/llvm/lib/IR/DebugLoc.cpp
@@ -9,11 +9,31 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfo.h"
+
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#include "llvm/Support/Signals.h"
+
+namespace llvm {
+DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) {
+  if (ShouldCollectTrace) {
+auto &[Depth, StackTrace] = StackTraces.emplace_back();
+Depth = sys::getStackTrace(StackTrace);
+  }
+}
+void DbgLocOrigin::addTrace() {
+  if (StackTraces.empty())
+return;
+  auto &[Depth, StackTrace] = StackTraces.emplace_back();
+  Dept
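
The `DbgLocOrigin` code above captures a stack trace at construction only when asked to, and `addTrace` appends a further trace only if one was already recorded; `getCopied` relies on this so that a tracked location gains one trace per copy while untracked locations stay empty. A minimal sketch of that behaviour follows. The `Origin` class is hypothetical, and a string label stands in for a real captured stack trace.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a location with origin tracking. A label plays
// the role of a captured stack trace.
class Origin {
  std::vector<std::string> Traces;

public:
  // Capture an initial trace only when requested (in the patch, only for
  // unannotated locations that have a null DILocation).
  explicit Origin(bool ShouldCollect, std::string Where = "ctor") {
    if (ShouldCollect)
      Traces.push_back(std::move(Where));
  }

  // Append a trace only if tracking was started at construction.
  void addTrace(std::string Where) {
    if (Traces.empty())
      return;
    Traces.push_back(std::move(Where));
  }

  // Return a copy that also remembers where the copy was made.
  Origin getCopied(std::string Where) const {
    Origin Copy = *this;
    Copy.addTrace(std::move(Where));
    return Copy;
  }

  const std::vector<std::string> &traces() const { return Traces; }
};
```

With this shape, routing every assignment through `getCopied`, as `setDebugLoc` does in the related patch, leaves a trail of the places an empty location passed through without adding cost to locations that were never tracked.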

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: SymbolizeAddresses (PR #143591)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143591

>From b2ecf5ed0da6fd3e03192ae921680b7576c12365 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include 
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
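+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template <typename T, typename Enable> struct DenseMapInfo;
+template <typename ValueT, typename ValueInfoT> class DenseSet;
+namespace detail {
+template <typename KeyT, typename ValueT> struct DenseMapPair;
+}
+template <typename KeyT, typename ValueT, typename KeyInfoT, typename BucketT>
+class DenseMap;
+using AddressSet = DenseSet<void *, DenseMapInfo<void *, void>>;
+using SymbolizedAddressMap =
+    DenseMap<void *, std::string, DenseMapInfo<void *, void>,
+             detail::DenseMapPair<void *, std::string>>;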
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template <unsigned long MaxDepth>
+int getStackTrace(std::array<void *, MaxDepth> &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+                             SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+         "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector<void *, 0> AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+    return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr<std::string> LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+    LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+    LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+         "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector<const char *> Modules(NumAddresses, nullptr);
+  std::vector<intptr_t> Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+                             Offsets.data(), MainExecutableName.c_str(),
+                             StrPool))
+    return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: SymbolizeAddresses (PR #143591)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143591

>From b2ecf5ed0da6fd3e03192ae921680b7576c12365 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h 
b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include 
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// Typedefs that are convenient but only used by the stack-trace-collection 
code
+// added if DebugLoc origin-tracking is enabled.
+template  struct DenseMapInfo;
+template  class DenseSet;
+namespace detail {
+template  struct DenseMapPair;
+}
+template 
+class DenseMap;
+using AddressSet = DenseSet>;
+using SymbolizedAddressMap =
+DenseMap,
+ detail::DenseMapPair>;
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template 
+int getStackTrace(std::array &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, 
void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector<const char *> Modules(NumAddresses, nullptr);
+  std::vector<intptr_t> Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From c973e73b792cc1440af7c9001a0ddcfef94a9e21 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 77 ++---
 llvm/utils/llvm-original-di-preservation.py | 22 +++---
 2 files changed, 80 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..a9a66baf5571f 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include <optional>
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt<Level> DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, std::string> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+ << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,20 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+{"metadata", "DILocation"}, {"fn-name", FnName.str()},
+{"bb-name", BBName.str()}, {"instr", InstName}, {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+{"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +533,7 @@ static bool checkInstructions(const DebugInstMap 
&DILocsB
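
The Debugify change above collects unseen addresses while gathering metadata and only symbolizes them, in one batch, the first time a report needs them. The following is a minimal standalone sketch of that collect-then-flush pattern, not the patch itself: `FakeSymbolizer` and `AddressCache` are hypothetical names, standard containers stand in for DenseMap/DenseSet, and the fake symbolizer merely formats the address where the patch invokes llvm-symbolizer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an expensive batch symbolizer. It only formats
// the address and counts how many batches were requested.
struct FakeSymbolizer {
  int BatchCalls = 0;
  void symbolize(const std::unordered_set<std::uintptr_t> &Addrs,
                 std::unordered_map<std::uintptr_t, std::string> &Out) {
    ++BatchCalls;
    for (std::uintptr_t A : Addrs)
      Out[A] = "sym@" + std::to_string(A);
  }
};

class AddressCache {
  std::unordered_map<std::uintptr_t, std::string> Symbolized;
  std::unordered_set<std::uintptr_t> Pending;
  FakeSymbolizer &Sym;

public:
  explicit AddressCache(FakeSymbolizer &S) : Sym(S) {}

  // Collection phase: remember only addresses that are not resolved yet.
  void collect(const std::vector<std::uintptr_t> &Trace) {
    for (std::uintptr_t A : Trace)
      if (!Symbolized.count(A))
        Pending.insert(A);
  }

  // Reporting phase: flush all pending addresses in one batch, then look up
  // each frame. Throws std::out_of_range if a frame was never collected.
  std::string render(const std::vector<std::uintptr_t> &Trace) {
    if (!Pending.empty()) {
      Sym.symbolize(Pending, Symbolized);
      Pending.clear();
    }
    std::string Result;
    for (std::size_t I = 0; I < Trace.size(); ++I)
      Result += "#" + std::to_string(I) + " " + Symbolized.at(Trace[I]) + "\n";
    return Result;
  }
};
```

Deferring the flush means that however many instructions are inspected, the costly symbolizer step runs once per batch of new addresses rather than once per instruction.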

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143593

>From eff0813afb187a5bba4f59d63120d9dd131a3a67 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:21 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in
 LLVM

---
 llvm/include/llvm/IR/Instruction.h | 2 +-
 llvm/lib/CodeGen/BranchFolding.cpp | 7 +++
 llvm/lib/IR/Instruction.cpp| 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,
 
   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif
 
   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
   setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-setDebugLoc(SrcInst.getDebugLoc());
+setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }
 
 Instruction *Instruction::clone() const {
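
The BranchFolding hunk above switches to std::sort because, per its comment, MergePotentialElt stops being a POD type once origin-tracking is enabled. The sketch below illustrates the general reason under stated assumptions: a qsort-style sort moves elements as raw bytes, which is only valid for trivially copyable types. `PlainElt`, `TrackedElt` and `sortByHash` are hypothetical names for illustration, not LLVM APIs, and the code assumes C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <type_traits>
#include <vector>

// Hypothetical element types: one plain, one owning heap storage.
struct PlainElt {
  unsigned Hash;
  int Block;
};
struct TrackedElt {
  unsigned Hash;
  std::vector<int> Traces; // owning member, so not trivially copyable
};

static_assert(std::is_trivially_copyable_v<PlainElt>);
static_assert(!std::is_trivially_copyable_v<TrackedElt>);

// Sort by Hash, using the byte-wise qsort path only when it is safe.
template <typename T> void sortByHash(std::vector<T> &V) {
  if constexpr (std::is_trivially_copyable_v<T>) {
    std::qsort(V.data(), V.size(), sizeof(T),
               [](const void *A, const void *B) -> int {
                 unsigned L = static_cast<const T *>(A)->Hash;
                 unsigned R = static_cast<const T *>(B)->Hash;
                 return (L > R) - (L < R);
               });
  } else {
    // Non-trivial types must be moved via their own move operations.
    std::sort(V.begin(), V.end(),
              [](const T &A, const T &B) { return A.Hash < B.Hash; });
  }
}
```

The patch makes the same choice with a preprocessor switch rather than a type trait, which keeps the existing sort in builds without origin-tracking.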

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Core implementation (PR #143592)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143592

>From 6ade803aa6c7e0137e4e572d379238a9d1fc202e Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h | 49 -
 llvm/lib/IR/DebugLoc.cpp| 22 ++-
 2 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
 static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp
index 0e65ddcec8934..05aad5d393547 100644
--- a/llvm/lib/IR/DebugLoc.cpp
+++ b/llvm/lib/IR/DebugLoc.cpp
@@ -9,11 +9,31 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfo.h"
+
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#include "llvm/Support/Signals.h"
+
+namespace llvm {
+DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) {
+  if (ShouldCollectTrace) {
+auto &[Depth, StackTrace] = StackTraces.emplace_back();
+Depth = sys::getStackTrace(StackTrace);
+  }
+}
+void DbgLocOrigin::addTrace() {
+  if (StackTraces.empty())
+return;
+  auto &[Depth, StackTrace] = StackTraces.emplace_back();
+  Depth = sys::getStackTrace(StackTrace);
+}
+} // namespace llvm
+#endif
+
 using namespace llvm;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L)
-: TrackingMDNodeRef(const_cast<DILocation *>(L)),
+: TrackingMDNodeRef(const_cast<DILocation *>(L)), DbgLocOrigin(!L),
   Kind(DebugLocKind::Normal) {}
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
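
The core patch above records a stack trace when an unannotated DebugLoc is created without a DILocation, and appends a further trace each time such a location is copied via getCopied(). The following standalone sketch mimics that bookkeeping under simplifying assumptions: `captureTrace`, `Origin` and `Loc` are hypothetical stand-ins, and an integer id replaces the fixed-size address array filled by sys::getStackTrace.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a stack-trace capture: returns a fresh id so the
// sketch runs anywhere.
inline int captureTrace() {
  static int Next = 0;
  return Next++;
}

struct Origin {
  std::vector<int> Traces;
  explicit Origin(bool ShouldCollect) {
    if (ShouldCollect)
      Traces.push_back(captureTrace());
  }
  // Only extend the history of locations that are already being tracked.
  void addTrace() {
    if (Traces.empty())
      return;
    Traces.push_back(captureTrace());
  }
};

struct Loc {
  const void *Node;
  Origin Org;
  // Track the origin only when the location is empty (null node).
  explicit Loc(const void *N) : Node(N), Org(/*ShouldCollect=*/N == nullptr) {}
  // Copy this location and record where the copy happened.
  Loc getCopied() const {
    Loc Copy = *this;
    Copy.Org.addTrace();
    return Copy;
  }
};
```

Because valid locations never start a trace list, addTrace() is a cheap no-op for them, and only empty locations accumulate the chain of sites through which they were propagated.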

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer closed 
https://github.com/llvm/llvm-project/pull/143593
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143594
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer ready_for_review 
https://github.com/llvm/llvm-project/pull/143594
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer ready_for_review 
https://github.com/llvm/llvm-project/pull/143592
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143592
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143594
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From 4bbd28b23847c069445d9babe9aa8a8aac5036c1 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 80 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 ---
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..01ed9de51c0b2 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include <optional>
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt<Level> DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, std::string> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+ << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +536,7 @@ static bool checkInstructi

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143591

>From 12f5a10c1dc2ae6943947c85a5bd05a295ae1c7c Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include <array>
 #include <cstdint>
 #include <string>
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
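+template <typename T, typename Enable> struct DenseMapInfo;
+template <typename ValueT, typename ValueInfoT> class DenseSet;
+namespace detail {
+template <typename KeyT, typename ValueT> struct DenseMapPair;
+}
+template <typename KeyT, typename ValueT, typename KeyInfoT, typename BucketT>
+class DenseMap;
+using AddressSet = DenseSet<void *, DenseMapInfo<void *, void>>;
+using SymbolizedAddressMap =
+DenseMap<void *, std::string, DenseMapInfo<void *, void>,
+ detail::DenseMapPair<void *, std::string>>;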
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template 
+int getStackTrace(std::array &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592

>From 5e5629149de6f5929a4a1a1986281a201046fd01 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h| 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp   | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast(Loc)),
+: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type {
 using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
 static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include 
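
For readers following the constructor changes in DebugLoc.h above: each DILocAndCoverageTracking constructor passes DbgLocOrigin a flag, and the flags appear to reduce to one rule, namely that an origin stack trace is collected only for a Normal-kind DebugLoc with a null location. A minimal sketch of that rule follows. The names DebugLocKindSketch and shouldCollectOrigin are hypothetical and do not exist in LLVM; only the Normal and Temporary enumerators are taken from the patch, and the sketch does not model addTrace(), whose body is not shown in this message.

```cpp
#include <cassert>

// Hypothetical stand-in for DebugLocKind; the real enum has more
// enumerators than the two named here.
enum class DebugLocKindSketch { Normal, Temporary };

// Hypothetical helper summarising the three constructor flags:
//   default ctor      -> DbgLocOrigin(true)           (Normal, null location)
//   MDNode* ctor      -> DbgLocOrigin(!Loc)           (Normal, maybe null)
//   DebugLocKind ctor -> DbgLocOrigin(Kind == Normal) (always null location)
constexpr bool shouldCollectOrigin(DebugLocKindSketch Kind, bool HasLocation) {
  return Kind == DebugLocKindSketch::Normal && !HasLocation;
}
```

Under this reading, an annotated (non-Normal) empty DebugLoc is treated as intentionally empty and costs no stack-trace capture, while a valid location never needs one.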

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143591

>From f47991b0264f1fbf14e93941e7e9398d4e8e0ae3 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include 
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template  struct DenseMapInfo;
+template  class DenseSet;
+namespace detail {
+template  struct DenseMapPair;
+}
+template 
+class DenseMap;
+using AddressSet = DenseSet>;
+using SymbolizedAddressMap =
+DenseMap,
+ detail::DenseMapPair>;
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template 
+int getStackTrace(std::array &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 
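
The doc comment on getStackTrace in the Signals.h hunk above describes a small contract: frames are written into the array from index 0, the return value is the number of frames written (at most MaxDepth), and entries past that count are left unmodified. The sketch below illustrates only that contract. copyFrames is a hypothetical helper, not part of LLVM, and it copies from a caller-supplied list rather than walking the real call stack.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of the documented contract: write at most
// MaxDepth frames starting at index 0, return how many were written, and
// leave every later entry untouched.
template <std::size_t MaxDepth>
int copyFrames(const std::vector<void *> &Frames,
               std::array<void *, MaxDepth> &StackTrace) {
  const std::size_t Count = std::min(Frames.size(), MaxDepth);
  std::copy_n(Frames.begin(), Count, StackTrace.begin());
  return static_cast<int>(Count);
}
```

A caller therefore has to keep the returned depth next to the array, which is presumably why the origin-tracking patches store each trace as a depth paired with a fixed-size array.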

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143594

>From e46273bec027d0accfbe6d3de9880c29977c6858 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 80 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 ---
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..01ed9de51c0b2 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap SymbolizedAddrs;
+static DenseSet UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+    sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+    UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+    if (TraceIdx != 0)
+      OS << "\n";
+    auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+             "Expected each address to have been symbolized.");
+      OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+         << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+    }
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+    for (int Frame = 0; Frame < Depth; ++Frame) {
+      void *Addr = StackTrace[Frame];
+      if (!SymbolizedAddrs.contains(Addr))
+        UnsymbolizedAddrs.insert(Addr);
+    }
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +536,7 @@ static bool checkInstructi
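
The Debugify hunk above queues addresses as instructions are visited and symbolizes them in one batch only when a trace first needs to be printed. A minimal standalone sketch of that collect-then-flush pattern follows. SymbolCacheSketch and its BatchSymbolizer callback are hypothetical names standing in for the file-scope SymbolizedAddrs/UnsymbolizedAddrs pair and sys::symbolizeAddresses; the sketch uses standard containers rather than DenseMap/DenseSet and does not invoke llvm-symbolizer.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical sketch of the lazy, batched symbolization cache.
class SymbolCacheSketch {
public:
  using AddrSet = std::set<const void *>;
  using AddrMap = std::map<const void *, std::string>;
  // Assumed callback: symbolizes a whole batch of addresses at once.
  using BatchSymbolizer = std::function<AddrMap(const AddrSet &)>;

  explicit SymbolCacheSketch(BatchSymbolizer S) : Symbolize(std::move(S)) {}

  // Queue an address unless it has already been symbolized.
  void collect(const void *Addr) {
    if (Symbolized.count(Addr) == 0)
      Pending.insert(Addr);
  }

  // Flush any pending batch at the latest possible moment, then look up.
  // Throws std::out_of_range if Addr was never collected.
  const std::string &lookup(const void *Addr) {
    flush();
    return Symbolized.at(Addr);
  }

  // Number of batches sent to the symbolizer so far.
  int batches() const { return NumBatches; }

private:
  void flush() {
    if (Pending.empty())
      return;
    AddrMap Result = Symbolize(Pending);
    for (auto &Entry : Result)
      Symbolized.emplace(Entry.first, std::move(Entry.second));
    Pending.clear();
    ++NumBatches;
  }

  BatchSymbolizer Symbolize;
  AddrMap Symbolized;
  AddrSet Pending;
  int NumBatches = 0;
};
```

Batching matters here because each symbolization round trip in the patch involves an external tool and temporary files, so deferring the flush until output is actually requested keeps the number of round trips low.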

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-06-09 Thread Stephen Tozer via llvm-branch-commits


@@ -0,0 +1,2348 @@
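+//===-- LVIRReader.cpp ----------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===----------------------------------------------------------------------===//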
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type  = 71
+//   DW_TAG_common_block = 26
+//   DW_TAG_file_type= 41
+//   DW_TAG_friend   = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module   = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+Element->setTag(Tag);
+addMD(DN, Element);
+
+StringRef Name = getMDName(DN);
+if (!Name.empty())
+  Element->setName(Name);
+
+// Record any file information.
+if (const DIFile *File = getMDFile(DN))
+  getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+LanguageIsFortran = true;
+break;
+  default:
+LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+dbgs() << "Filename:  '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+FileIndex = getFileIndex(CompileUnit);
+std::string Directory(File->getDirectory());
+if (Directory.empty())
+  Directory = std::string(CompileUnit->getCompilationDirectory());
+
+std::string FullName;
+raw_string_ostream Out(FullName);
+Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+CompileUnit->addFilename(transformPath(FullName));
+CompileUnitFiles.emplace(File, ++FileIndex);
+updateFileIndex(CompileUnit, FileIndex);
+  } else {
+FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+   const DIFile *File) {
+  if (Line == 0)
+return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the command line o

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits


@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+  sys::fs::createTemporaryFile("symbolizer-output", "", OutputFile);
+  FileRemover InputRemover(InputFile.c_str());
+  FileRemover OutputRemover(OutputFile.c_str());
+
+  {
+raw_fd_ostream Input(InputFD, true);
+for (int i = 0; i < NumAddresses; i++) {
+  if (Modules[i])
+Input << Modules[i] << " " << (void *)Offsets[i] << "\n";
+}
+  }
+
+  std::optional Redirects[] = {InputFile.str(), OutputFile.str(),
+  StringRef("")};
+  StringRef Args[] = {"llvm-symbolizer", "--functions=linkage", "--inlining",
+#ifdef _WIN32
+  // Pass --relative-address on Windows so that we don't
+  // have to add ImageBase from PE file.
+  // FIXME: Make this the default for llvm-symbolizer.
+  "--relative-address",
+#endif

SLTozer wrote:

We don't need it for now. However, this is copied from the existing symbolizer 
invocation (and I'm currently looking at extracting this and a few other parts 
out), so removing Windows support would take more effort than keeping it. I do 
intend to add Windows support later on; it's just trickier than using 
`backtrace()` and not an urgent feature.

https://github.com/llvm/llvm-project/pull/143591
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

SLTozer wrote:

> A bigger question though is whether this can be tested.

That's a good question - in theory yes, but it sounds brittle. Assuming we're 
testing the output of backtrace/symbolize, we'd have a test that could fail if 
any of a variety of function names changed. Another argument against testing is 
that this code exists to generate formatted, human-readable output: most of the 
complicated parts are in `llvm-symbolizer` and `libbacktrace`, while these 
functions essentially just invoke them and format the output.

All the same, testing is generally good; I'll consider how best to test this, 
and I will happily take suggestions if you/others have any.

https://github.com/llvm/llvm-project/pull/143591


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143591


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-11 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer ready_for_review 
https://github.com/llvm/llvm-project/pull/143591



[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in LLVM (PR #143593)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143593

>From eff0813afb187a5bba4f59d63120d9dd131a3a67 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:21 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Handle origin-tracking elsewhere in
 LLVM

---
 llvm/include/llvm/IR/Instruction.h | 2 +-
 llvm/lib/CodeGen/BranchFolding.cpp | 7 +++
 llvm/lib/IR/Instruction.cpp| 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,
 
   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif
 
   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
   setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-setDebugLoc(SrcInst.getDebugLoc());
+setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }
 
 Instruction *Instruction::clone() const {
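
The BranchFolding change above can be illustrated with a small standalone sketch. `Elt` and `sortedHashes` are hypothetical names: `Elt` stands in for the merge-candidate element type and is given an owning member to mimic a `DebugLoc` that carries origin stack traces. The assumption is that a `qsort`-style POD sort is only appropriate for trivially copyable types, so `std::sort` is used for the non-trivial type instead.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the element type being sorted. The owning
// vector member mimics a DebugLoc that stores origin stack traces.
struct Elt {
  unsigned Hash;
  std::vector<void *> OriginTrace;
  bool operator<(const Elt &Other) const { return Hash < Other.Hash; }
};

// A qsort-based sort moves raw bytes around, which is only safe for
// trivially copyable types; this type no longer qualifies.
static_assert(!std::is_trivially_copyable<Elt>::value,
              "Elt owns memory, so it must be moved via its own operations");

// Sorts a copy of the elements with std::sort and returns the hash order.
std::vector<unsigned> sortedHashes(std::vector<Elt> Elts) {
  std::sort(Elts.begin(), Elts.end());
  std::vector<unsigned> Hashes;
  for (const Elt &E : Elts)
    Hashes.push_back(E.Hash);
  return Hashes;
}
```

Keeping both branches behind the preprocessor check, as the patch does, means builds without origin-tracking continue to use the existing POD sort.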



[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-10 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143592


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits


@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.

SLTozer wrote:

The purpose of this is simply to prevent people from accidentally shipping any 
release compiler with this code compiled in.
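
As an illustration of the guard described above, here is a minimal sketch of the same pattern outside of LLVM. The macro `MYPROJ_ENABLE_ORIGIN_TRACKING` and the helper `originTrackingEnabled()` are hypothetical names used only for this example; the patch itself keys the check on `LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING`.

```cpp
#include <cassert>

// Hypothetical feature switch for illustration only; it defaults to off
// unless the build defines it.
#ifndef MYPROJ_ENABLE_ORIGIN_TRACKING
#define MYPROJ_ENABLE_ORIGIN_TRACKING 0
#endif

#if MYPROJ_ENABLE_ORIGIN_TRACKING
#ifdef NDEBUG
// Stop the build as soon as the expensive feature is combined with a
// release (NDEBUG) configuration.
#error Origin-tracking should not be enabled in Release builds.
#endif
constexpr bool OriginTrackingEnabled = true;
#else
constexpr bool OriginTrackingEnabled = false;
#endif

// Lets callers branch on the setting without repeating preprocessor checks.
constexpr bool originTrackingEnabled() { return OriginTrackingEnabled; }
```

Because the check is a preprocessor `#error`, a misconfigured build fails at compile time rather than producing a slow compiler that could be shipped by accident.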

https://github.com/llvm/llvm-project/pull/143591


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits


@@ -507,6 +507,21 @@ static int dl_iterate_phdr_cb(dl_phdr_info *info, size_t size, void *arg) {
   return 0;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#if !defined(HAVE_BACKTRACE)
+#error DebugLoc origin-tracking currently requires `backtrace()`.
+#endif

SLTozer wrote:

Possibly, I'll look into it. That said, I'd prefer if possible to replace the 
use of `backtrace()` with in-program stack traversal, which would require 
`-fno-omit-frame-pointer` (I'm not sure how you'd check that in CMake either) 
but would be significantly faster than making the lib call. So if the check 
turns out to be non-trivial (e.g. if there's some ordering issue in the CMake 
config) then it may not be worth it.
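
For context, collecting a bounded trace with `backtrace()` might look roughly like the sketch below. It assumes a platform that provides `<execinfo.h>` (e.g. glibc), and `collectStackTrace` is a hypothetical name chosen to mirror the `getStackTrace` declaration in the patch; it is not the patch's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <execinfo.h> // Assumption: available on this platform (not Windows).

// Fills StackTrace from index 0 with up to MaxDepth return addresses of the
// current call stack and returns how many entries were written. Entries past
// the returned depth are left unmodified.
template <std::size_t MaxDepth>
int collectStackTrace(std::array<void *, MaxDepth> &StackTrace) {
  return backtrace(StackTrace.data(), static_cast<int>(MaxDepth));
}
```

A fixed-size `std::array` keeps each collection free of heap allocation, which matters if a trace is taken every time a debug location is created or copied.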

https://github.com/llvm/llvm-project/pull/143591


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits


@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector Modules(NumAddresses, nullptr);
+  std::vector Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+  sys::fs::createTemporaryFile("symbolizer-output", "", OutputFile);
+  FileRemover InputRemover(InputFile.c_str());
+  FileRemover OutputRemover(OutputFile.c_str());
+
+  {
+raw_fd_ostream Input(InputFD, true);
+for (int i = 0; i < NumAddresses; i++) {
+  if (Modules[i])
+Input << Modules[i] << " " << (void *)Offsets[i] << "\n";
+}
+  }
+
+  std::optional Redirects[] = {InputFile.str(), OutputFile.str(),
+  StringRef("")};
+  StringRef Args[] = {"llvm-symbolizer", "--functions=linkage", "--inlining",
+#ifdef _WIN32
+  // Pass --relative-address on Windows so that we don't
+  // have to add ImageBase from PE file.
+  // FIXME: Make this the default for llvm-symbolizer.
+  "--relative-address",
+#endif
+  "--demangle"};
+  int RunResult =
+  sys::ExecuteAndWait(LLVMSymbolizerPath, Args, std::nullopt, Redirects);
+  if (RunResult != 0)
+return;
+
+  // This report format is based on the sanitizer stack trace printer.  See
+  // sanitizer_stacktrace_printer.cc in compiler-rt.
+  auto OutputBuf = MemoryBuffer::getFile(OutputFile.c_str());
+  if (!OutputBuf)
+return;
+  StringRef Output = OutputBuf.get()->getBuffer();
+  SmallVector Lines;
+  Output.split(Lines, "\n");
+  auto CurLine = Lines.begin();
+  for (int i = 0; i < NumAddresses; i++) {
+assert(!SymbolizedAddresses.contains(AddressList[i]));
+std::string &SymbolizedAddr = SymbolizedAddresses[AddressList[i]];
+raw_string_ostream OS(SymbolizedAddr);
+if (!Modules[i]) {
+  OS << format_ptr(AddressList[i]) << '\n';
+  continue;
+}
+// Read pairs of lines (function name and file/line info) until we
+// encounter empty line.
+for (bool IsFirst = true;; IsFirst = false) {

SLTozer wrote:

I wouldn't exactly say so - we're iterating over the call stack, which means 
iterating over the real stack frames, and for each real stack frame iterating 
over all the inlined calls at that frame's PC, so I would say that we're 
iterating linearly over the set of real+inlined frames.
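
To make the iteration order concrete, the following is a rough sketch of grouping symbolizer-style output into frames per address. It assumes, as the comment in the patch describes, that each address yields pairs of lines (function name, then file/line info) terminated by an empty line. `Frame` and `groupFrames` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One real or inlined frame: a function name plus its file/line text.
struct Frame {
  std::string Function;
  std::string Location;
};

// Groups symbolizer-style output into one list of frames per queried
// address. All inlined frames for an address land in the same group.
std::vector<std::vector<Frame>> groupFrames(const std::string &Output) {
  std::vector<std::vector<Frame>> Groups;
  std::vector<Frame> Current;
  std::istringstream In(Output);
  std::string Function, Location;
  while (std::getline(In, Function)) {
    if (Function.empty()) {
      // An empty line closes the set of frames for the current address.
      Groups.push_back(std::move(Current));
      Current.clear();
      continue;
    }
    if (!std::getline(In, Location))
      break; // Truncated input: drop the dangling function name.
    Current.push_back({Function, Location});
  }
  if (!Current.empty())
    Groups.push_back(std::move(Current));
  return Groups;
}
```

Each input line is visited once, so the work is linear in the total number of real and inlined frames even though the code is written as nested grouping.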

https://github.com/llvm/llvm-project/pull/143591


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From ef6ccda96703764bbed694f910d56d8a3af27730 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 80 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 ---
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..01ed9de51c0b2 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap SymbolizedAddrs;
+static DenseSet UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+ << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +536,7 @@ static bool checkInstructi

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143592

>From 2ff6e13069844c443ce8ff5677b3930e970665cf Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h| 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp   | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast(Loc)),
+: TrackingMDNodeRef(const_cast(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type {
 using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
 static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include 

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143591

>From 622d1fb6df403dc9457b42c9d8f70b8004eb06a5 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include 
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template  struct DenseMapInfo;
+template  class DenseSet;
+namespace detail {
+template  struct DenseMapPair;
+}
+template 
+class DenseMap;
+using AddressSet = DenseSet>;
+using SymbolizedAddressMap =
+DenseMap,
+ detail::DenseMapPair>;
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template 
+int getStackTrace(std::array &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector<void *, 0> AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr<std::string> LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector<const char *> Modules(NumAddresses, nullptr);
+  std::vector<intptr_t> Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 

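A note on the lookup order used by symbolizeAddresses in the hunk above: the code first tries the path named by an environment override and, if that fails, falls back to finding "llvm-symbolizer" by name. The following is only a minimal sketch of that fallback pattern, not code from the patch. The names resolveTool and Finder are hypothetical, and the Finder callback stands in for sys::findProgramByName so the example stays self-contained.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for a program lookup such as sys::findProgramByName:
// returns the resolved path, or std::nullopt if nothing was found.
using Finder = std::function<std::optional<std::string>(const std::string &)>;

// Try the override first (when one is set); if it does not resolve, fall
// back to looking up the default tool name.
std::optional<std::string> resolveTool(const char *Override,
                                       const std::string &DefaultName,
                                       const Finder &Find) {
  assert(!DefaultName.empty() && "a default tool name is required");
  if (Override)
    if (std::optional<std::string> Path = Find(Override))
      return Path;
  return Find(DefaultName);
}
```

A caller would pass the result of getenv for the override, so a null pointer (variable unset) and an unresolvable override both end up on the default path.
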
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From ef6ccda96703764bbed694f910d56d8a3af27730 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 80 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 ---
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 729813a92f516..01ed9de51c0b2 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,49 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, std::string> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  OS << right_justify(formatv("#{0}", Frame).str(), std::log10(Depth) + 2)
+ << ' ' << SymbolizedAddrs[StackTrace[Frame]];
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -379,6 +430,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -454,14 +507,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << NameOfWrappedPass
   << " did not generate DILocation for " << *Instr
@@ -474,11 +536,7 @@ static bool checkInstructi

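The Debugify change above defers symbolization: collectStackAddresses only records addresses, and symbolizeStackTrace flushes the pending set in one batch the first time a result is needed. A minimal sketch of that collect-then-flush pattern follows. It is an illustration under stated assumptions, not the patch's code: LazySymbolTable, its members, and the fake "sym@N" symbolizer are all hypothetical, and standard containers replace DenseMap and DenseSet.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

using Address = std::uintptr_t;

// Hypothetical cache that batches expensive symbolization work.
struct LazySymbolTable {
  std::unordered_map<Address, std::string> Symbolized;
  std::unordered_set<Address> Pending;
  int BatchCalls = 0; // How many times the batch step actually ran.

  // Record an address for later; skip it if a result is already cached.
  void collect(Address A) {
    if (!Symbolized.count(A))
      Pending.insert(A);
  }

  // Flush everything pending in a single batch, then answer from the cache.
  const std::string &lookup(Address A) {
    if (!Pending.empty()) {
      ++BatchCalls;
      for (Address P : Pending)
        Symbolized[P] = "sym@" + std::to_string(P); // Fake symbolizer.
      Pending.clear();
    }
    assert(Symbolized.count(A) && "address was never collected");
    return Symbolized.at(A);
  }
};
```

The point of the design is that an expensive external step (in the patch, a call out to llvm-symbolizer) runs once per batch of new addresses rather than once per instruction.
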
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (PR #143591)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143591

>From 622d1fb6df403dc9457b42c9d8f70b8004eb06a5 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 19:58:09 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: SymbolizeAddresses

---
 llvm/include/llvm/Support/Signals.h  |  40 +
 llvm/lib/Support/Signals.cpp | 116 +++
 llvm/lib/Support/Unix/Signals.inc|  15 
 llvm/lib/Support/Windows/Signals.inc |   5 ++
 4 files changed, 176 insertions(+)

diff --git a/llvm/include/llvm/Support/Signals.h b/llvm/include/llvm/Support/Signals.h
index 6ce26acdd458e..a6f99d8bbdc95 100644
--- a/llvm/include/llvm/Support/Signals.h
+++ b/llvm/include/llvm/Support/Signals.h
@@ -14,7 +14,9 @@
 #ifndef LLVM_SUPPORT_SIGNALS_H
 #define LLVM_SUPPORT_SIGNALS_H
 
+#include "llvm/Config/llvm-config.h"
 #include "llvm/Support/Compiler.h"
+#include <array>
 #include 
 #include 
 
@@ -22,6 +24,22 @@ namespace llvm {
 class StringRef;
 class raw_ostream;
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
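+// Typedefs that are convenient but only used by the stack-trace-collection code
+// added if DebugLoc origin-tracking is enabled.
+template <typename T, typename Enable> struct DenseMapInfo;
+template <typename ValueT, typename ValueInfoT> class DenseSet;
+namespace detail {
+template <typename KeyT, typename ValueT> struct DenseMapPair;
+}
+template <typename KeyT, typename ValueT, typename KeyInfoT, typename BucketT>
+class DenseMap;
+using AddressSet = DenseSet<void *, DenseMapInfo<void *, void>>;
+using SymbolizedAddressMap =
+    DenseMap<void *, std::string, DenseMapInfo<void *, void>,
+             detail::DenseMapPair<void *, std::string>>;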
+#endif
+
 namespace sys {
 
 /// This function runs all the registered interrupt handlers, including the
@@ -57,6 +75,28 @@ LLVM_ABI void DisableSystemDialogsOnCrash();
 ///specified, the entire frame is printed.
 LLVM_ABI void PrintStackTrace(raw_ostream &OS, int Depth = 0);
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#ifdef NDEBUG
+#error DebugLoc origin-tracking should not be enabled in Release builds.
+#endif
+/// Populates the given array with a stack trace of the current program, up to
+/// MaxDepth frames. Returns the number of frames returned, which will be
+/// inserted into \p StackTrace from index 0. All entries after the returned
+/// depth will be unmodified. NB: This is only intended to be used for
+/// introspection of LLVM by Debugify, will not be enabled in release builds,
+/// and should not be relied on for other purposes.
+template <unsigned long MaxDepth>
+int getStackTrace(std::array<void *, MaxDepth> &StackTrace);
+
+/// Takes a set of \p Addresses, symbolizes them and stores the result in the
+/// provided \p SymbolizedAddresses map.
+/// NB: This is only intended to be used for introspection of LLVM by
+/// Debugify, will not be enabled in release builds, and should not be relied
+/// on for other purposes.
+void symbolizeAddresses(AddressSet &Addresses,
+SymbolizedAddressMap &SymbolizedAddresses);
+#endif
+
 // Run all registered signal handlers.
 LLVM_ABI void RunSignalHandlers();
 
diff --git a/llvm/lib/Support/Signals.cpp b/llvm/lib/Support/Signals.cpp
index 9f9030e79d104..50b0d6e78ddd1 100644
--- a/llvm/lib/Support/Signals.cpp
+++ b/llvm/lib/Support/Signals.cpp
@@ -253,6 +253,122 @@ static bool printSymbolizedStackTrace(StringRef Argv0, void **StackTrace,
   return true;
 }
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+void sys::symbolizeAddresses(AddressSet &Addresses,
+ SymbolizedAddressMap &SymbolizedAddresses) {
+  assert(!DisableSymbolicationFlag && !getenv(DisableSymbolizationEnv) &&
+ "Debugify origin stacktraces require symbolization to be enabled.");
+
+  // Convert Set of Addresses to ordered list.
+  SmallVector<void *, 0> AddressList(Addresses.begin(), Addresses.end());
+  if (AddressList.empty())
+return;
+  int NumAddresses = AddressList.size();
+  llvm::sort(AddressList);
+
+  // Use llvm-symbolizer tool to symbolize the stack traces. First look for it
+  // alongside our binary, then in $PATH.
+  ErrorOr<std::string> LLVMSymbolizerPathOrErr = std::error_code();
+  if (const char *Path = getenv(LLVMSymbolizerPathEnv)) {
+LLVMSymbolizerPathOrErr = sys::findProgramByName(Path);
+  }
+  if (!LLVMSymbolizerPathOrErr)
+LLVMSymbolizerPathOrErr = sys::findProgramByName("llvm-symbolizer");
+  assert(!!LLVMSymbolizerPathOrErr &&
+ "Debugify origin stacktraces require llvm-symbolizer.");
+  const std::string &LLVMSymbolizerPath = *LLVMSymbolizerPathOrErr;
+
+  // Try to guess the main executable name, since we don't have argv0 available
+  // here.
+  std::string MainExecutableName = sys::fs::getMainExecutable(nullptr, nullptr);
+
+  BumpPtrAllocator Allocator;
+  StringSaver StrPool(Allocator);
+  std::vector<const char *> Modules(NumAddresses, nullptr);
+  std::vector<intptr_t> Offsets(NumAddresses, 0);
+  if (!findModulesAndOffsets(AddressList.data(), NumAddresses, Modules.data(),
+ Offsets.data(), MainExecutableName.c_str(),
+ StrPool))
+return;
+  int InputFD;
+  SmallString<32> InputFile, OutputFile;
+  sys::fs::createTemporaryFile("symbolizer-input", "", InputFD, InputFile);
+ 

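The doc comment on getStackTrace in the header hunk above describes a specific contract: frames are written from index 0, the return value is the number of frames written (at most MaxDepth), and entries past that count are left unmodified. A minimal sketch of that contract follows, assuming a hypothetical fillTrace helper that copies from a prepared list of frames instead of walking the real stack (the real function's platform-specific implementations are not shown in this excerpt).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies up to MaxDepth entries of Frames into Out,
// starting at index 0, and returns how many were written. Entries of Out
// beyond the returned count are left untouched.
template <std::size_t MaxDepth>
int fillTrace(const std::vector<void *> &Frames,
              std::array<void *, MaxDepth> &Out) {
  std::size_t N = Frames.size() < MaxDepth ? Frames.size() : MaxDepth;
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = Frames[I];
  return static_cast<int>(N);
}
```

Because trailing entries are not cleared, a caller has to keep the returned depth next to the array and must never read past it.
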
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-06-12 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143592

>From 2ff6e13069844c443ce8ff5677b3930e970665cf Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h| 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp   | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
 static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include 

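The three DILocAndCoverageTracking constructors in the DebugLoc.h hunk above encode one rule: an origin stack trace is collected only when the DebugLoc is unannotated (DebugLocKind::Normal) and carries no location. A minimal sketch of that rule follows. TrackedLoc, its fields, and the two-value Kind enum are hypothetical simplifications, and a boolean replaces the actual stack-trace capture.

```cpp
#include <cassert>

// Hypothetical, reduced version of the kinds named in the patch.
enum class Kind { Normal, Temporary };

// Hypothetical stand-in for DILocAndCoverageTracking: TracksOrigin records
// whether the constructor would have asked for a stack trace.
struct TrackedLoc {
  const void *Node;
  Kind K;
  bool TracksOrigin;

  // Default-constructed: empty and unannotated, so the origin is tracked.
  TrackedLoc() : Node(nullptr), K(Kind::Normal), TracksOrigin(true) {}

  // From a node pointer: track the origin only if the pointer is null.
  explicit TrackedLoc(const void *N)
      : Node(N), K(Kind::Normal), TracksOrigin(N == nullptr) {}

  // From an explicit kind: always null, tracked only if the kind is Normal.
  explicit TrackedLoc(Kind InKind)
      : Node(nullptr), K(InKind), TracksOrigin(InKind == Kind::Normal) {}
};
```

Annotated kinds are skipped because they mark locations that are empty on purpose, so there is no bug to trace back to its source.
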
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

SLTozer wrote:

New PR: https://github.com/llvm/llvm-project/pull/146678

https://github.com/llvm/llvm-project/pull/143592


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited 
https://github.com/llvm/llvm-project/pull/143594


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

SLTozer wrote:

> Are you planning to extend documentation

Yes, and it probably is best if the documentation lands in this patch!

https://github.com/llvm/llvm-project/pull/143594


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-07-01 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143594

>From e2ff01bc95a78c4372bdf538f0433dc882c070f8 Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:02:36 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Add debugify support

---
 llvm/lib/Transforms/Utils/Debugify.cpp  | 83 ++---
 llvm/utils/llvm-original-di-preservation.py | 24 +++---
 2 files changed, 88 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index 5f70bc442d2f0..e8ed55a99546e 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -15,7 +15,10 @@
 
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Config/config.h"
 #include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/InstIterator.h"
@@ -28,6 +31,11 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/JSON.h"
 #include 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// We need the Signals header to operate on stacktraces if we're using DebugLoc
+// origin-tracking.
+#include "llvm/Support/Signals.h"
+#endif
 
 #define DEBUG_TYPE "debugify"
 
@@ -59,6 +67,52 @@ cl::opt DebugifyLevel(
 
 raw_ostream &dbg() { return Quiet ? nulls() : errs(); }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+// These maps refer to addresses in this instance of LLVM, so we can reuse them
+// everywhere - therefore, we store them at file scope.
+static DenseMap<void *, SmallVector<std::string, 0>> SymbolizedAddrs;
+static DenseSet<void *> UnsymbolizedAddrs;
+
+std::string symbolizeStackTrace(const Instruction *I) {
+  // We flush the set of unsymbolized addresses at the latest possible moment,
+  // i.e. now.
+  if (!UnsymbolizedAddrs.empty()) {
+sys::symbolizeAddresses(UnsymbolizedAddrs, SymbolizedAddrs);
+UnsymbolizedAddrs.clear();
+  }
+  auto OriginStackTraces = I->getDebugLoc().getOriginStackTraces();
+  std::string Result;
+  raw_string_ostream OS(Result);
+  for (size_t TraceIdx = 0; TraceIdx < OriginStackTraces.size(); ++TraceIdx) {
+if (TraceIdx != 0)
+  OS << "\n";
+auto &[Depth, StackTrace] = OriginStackTraces[TraceIdx];
+unsigned VirtualFrameNo = 0;
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  assert(SymbolizedAddrs.contains(StackTrace[Frame]) &&
+ "Expected each address to have been symbolized.");
+  for (std::string &SymbolizedFrame : SymbolizedAddrs[StackTrace[Frame]]) {
+OS << right_justify(formatv("#{0}", VirtualFrameNo++).str(), std::log10(Depth) + 2)
+  << ' ' << SymbolizedFrame << '\n';
+  }
+}
+  }
+  return Result;
+}
+void collectStackAddresses(Instruction &I) {
+  auto &OriginStackTraces = I.getDebugLoc().getOriginStackTraces();
+  for (auto &[Depth, StackTrace] : OriginStackTraces) {
+for (int Frame = 0; Frame < Depth; ++Frame) {
+  void *Addr = StackTrace[Frame];
+  if (!SymbolizedAddrs.contains(Addr))
+UnsymbolizedAddrs.insert(Addr);
+}
+  }
+}
+#else
+void collectStackAddresses(Instruction &I) {}
+#endif // LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+
 uint64_t getAllocSizeInBits(Module &M, Type *Ty) {
   return Ty->isSized() ? M.getDataLayout().getTypeAllocSizeInBits(Ty) : 0;
 }
@@ -375,6 +429,8 @@ bool llvm::collectDebugInfoMetadata(Module &M,
 LLVM_DEBUG(dbgs() << "  Collecting info for inst: " << I << '\n');
 DebugInfoBeforePass.InstToDelete.insert({&I, &I});
 
+// Track the addresses to symbolize, if the feature is enabled.
+collectStackAddresses(I);
 DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
   }
 }
@@ -450,14 +506,23 @@ static bool checkInstructions(const DebugInstMap &DILocsBefore,
 auto BBName = BB->hasName() ? BB->getName() : "no-name";
 auto InstName = Instruction::getOpcodeName(Instr->getOpcode());
 
+auto CreateJSONBugEntry = [&](const char *Action) {
+  Bugs.push_back(llvm::json::Object({
+  {"metadata", "DILocation"},
+  {"fn-name", FnName.str()},
+  {"bb-name", BBName.str()},
+  {"instr", InstName},
+  {"action", Action},
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  {"origin", symbolizeStackTrace(Instr)},
+#endif
+  }));
+};
+
 auto InstrIt = DILocsBefore.find(Instr);
 if (InstrIt == DILocsBefore.end()) {
   if (ShouldWriteIntoJSON)
-Bugs.push_back(llvm::json::Object({{"metadata", "DILocation"},
-   {"fn-name", FnName.str()},
-   {"bb-name", BBName.str()},
-   {"instr", InstName},
-   {"action", "not-generate"}}));
+CreateJSONBugEntry("not-generate");
   else
 dbg() << "WARNING: " << N

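In this revision of symbolizeStackTrace, one captured address can expand to several symbolized frames, so the printed index is a running VirtualFrameNo rather than the position in the address array. A minimal sketch of that numbering follows, with assumptions stated up front: formatTrace and rightJustify are hypothetical names, integer keys stand in for addresses, and std::map replaces DenseMap. As in the patch, the column width is derived from the number of captured addresses, not from the number of virtual frames.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Pad S with leading spaces up to Width; longer strings are left as-is.
std::string rightJustify(const std::string &S, std::size_t Width) {
  return S.size() >= Width ? S : std::string(Width - S.size(), ' ') + S;
}

// Print every symbolized frame for every address, numbering the frames
// consecutively across addresses.
std::string formatTrace(const std::vector<int> &Addrs,
                        const std::map<int, std::vector<std::string>> &Symbols) {
  if (Addrs.empty())
    return "";
  std::ostringstream OS;
  std::size_t Width = static_cast<std::size_t>(
                          std::log10(static_cast<double>(Addrs.size()))) + 2;
  unsigned VirtualFrameNo = 0;
  for (int A : Addrs) {
    assert(Symbols.count(A) && "expected each address to be symbolized");
    for (const std::string &Frame : Symbols.at(A))
      OS << rightJustify("#" + std::to_string(VirtualFrameNo++), Width) << ' '
         << Frame << '\n';
  }
  return OS.str();
}
```

With two addresses where the first maps to two frames, the output has three lines numbered #0 to #2, which is the effect the running counter is meant to produce.
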
[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-01 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated 
https://github.com/llvm/llvm-project/pull/143592

>From fb65cb7f043586eb6808b229fd1ad018ffd7571d Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h| 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp   | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index 999e03b6374a5..6d79aa6b2aa01 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  struct DbgLocOrigin {
+static constexpr unsigned long MaxDepth = 16;
+using StackTracesTy =
+SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+StackTracesTy StackTraces;
+DbgLocOrigin(bool ShouldCollectTrace);
+void addTrace();
+const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -187,6 +209,19 @@ namespace llvm {
 #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
 }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 8e1ef24226789..ef382a9168f24 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index ff9f0ff5d5bc3..3b3e7a418feb5 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoM

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-01 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/143592

>From fb65cb7f043586eb6808b229fd1ad018ffd7571d Mon Sep 17 00:00:00 2001
From: Stephen Tozer 
Date: Tue, 10 Jun 2025 20:00:51 +0100
Subject: [PATCH] [DLCov] Origin-Tracking: Core implementation

---
 llvm/include/llvm/IR/DebugLoc.h    | 49 +-
 llvm/include/llvm/IR/Instruction.h |  2 +-
 llvm/lib/CodeGen/BranchFolding.cpp |  7 +
 llvm/lib/IR/DebugLoc.cpp           | 22 +-
 4 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index 999e03b6374a5..6d79aa6b2aa01 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+  struct DbgLocOrigin {
+    static constexpr unsigned long MaxDepth = 16;
+    using StackTracesTy =
+        SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+    StackTracesTy StackTraces;
+    DbgLocOrigin(bool ShouldCollectTrace);
+    void addTrace();
+    const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+    DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {
   public:
 DebugLocKind Kind;
 // Default constructor for empty DebugLocs.
 DILocAndCoverageTracking()
-: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-// Valid or nullptr MDNode*, normal DebugLocKind.
+: TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+  Kind(DebugLocKind::Normal) {}
+// Valid or nullptr MDNode*, no annotative DebugLocKind.
 DILocAndCoverageTracking(const MDNode *Loc)
-: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+: TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
   Kind(DebugLocKind::Normal) {}
 LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
 // Explicit DebugLocKind, which always means a nullptr MDNode*.
 DILocAndCoverageTracking(DebugLocKind Kind)
-: TrackingMDNodeRef(nullptr), Kind(Kind) {}
+: TrackingMDNodeRef(nullptr),
+  DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
 using SimpleType = MDNode *;
@@ -187,6 +209,19 @@ namespace llvm {
 #endif // LLVM_ENABLE_DEBUGLOC_TRACKING_COVERAGE
 }
 
+#if LLVM_ENABLE_DEBUGLOC_TRACKING_ORIGIN
+const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+  return Loc.getOriginStackTraces();
+}
+DebugLoc getCopied() const {
+  DebugLoc NewDL = *this;
+  NewDL.Loc.addTrace();
+  return NewDL;
+}
+#else
+DebugLoc getCopied() const { return *this; }
+#endif
+
 /// Get the underlying \a DILocation.
 ///
 /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 8e1ef24226789..ef382a9168f24 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index ff9f0ff5d5bc3..3b3e7a418feb5 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoM

[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

SLTozer wrote:

I clicked the wrong button and accidentally merged the wrong branch (fortunately, 
this just merged into another PR branch, not main) - will reopen imminently, as 
GitHub apparently won't allow me to reopen this PR in-place!

https://github.com/llvm/llvm-project/pull/143592
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits


@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+    static constexpr unsigned long MaxDepth = 16;
+    using StackTracesTy =
+        SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;

SLTozer wrote:

Most of the time we store 0 stacktraces, but when we do store a stacktrace 
there may be any number of them - the `addTrace` function adds a new stacktrace 
to the vector, and is used whenever the DebugLoc is "transferred", so that we 
can track the motion of a missing debug location through the program.
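
A minimal standalone sketch of that behaviour, for illustration only. `OriginSketch`, `captureFakeTrace`, and the free function `getCopied` are hypothetical stand-ins rather than the code in this PR; the real stack walk is replaced by a single made-up frame, and the sketch assumes that a location created without an initial trace never gains one later.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for DbgLocOrigin; not LLVM code.
struct OriginSketch {
  static constexpr std::size_t MaxDepth = 16;
  // Each entry is (number of valid frames, fixed-size frame buffer).
  using StackTrace = std::pair<int, std::array<void *, MaxDepth>>;
  std::vector<StackTrace> StackTraces;

  // Collect an initial trace only when asked to, mirroring the
  // ShouldCollectTrace constructor argument.
  explicit OriginSketch(bool ShouldCollectTrace) {
    if (ShouldCollectTrace)
      StackTraces.push_back(captureFakeTrace());
  }

  // Append one more trace. Assumption: untracked origins stay untracked.
  void addTrace() {
    if (StackTraces.empty())
      return;
    StackTraces.push_back(captureFakeTrace());
  }

  // Placeholder for a real stack walk: records one made-up frame.
  static StackTrace captureFakeTrace() {
    static int Dummy;
    StackTrace T{};
    T.first = 1;
    T.second[0] = &Dummy;
    return T;
  }
};

// A "transfer" copies the origin and records one more trace on the copy only.
inline OriginSketch getCopied(const OriginSketch &O) {
  OriginSketch Copy = O;
  Copy.addTrace();
  return Copy;
}
```

In this sketch, each transfer of a tracked location lengthens the copy's trace list by one and leaves the source untouched, so the list reads as the path the missing location took.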

https://github.com/llvm/llvm-project/pull/143592


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits


@@ -55,22 +70,29 @@ namespace llvm {
 Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+   public DbgLocOrigin {

SLTozer wrote:

We could manage without it, but the reason I use inheritance here is that this 
allows `DILocAndCoverageTracking` to automatically have the same public 
functions as `DbgLocOrigin`, meaning we conditionally have the origin-tracking 
functions enabled without having to repeat ourselves here.
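
As a rough illustration of that pattern, and not the code in this PR: the sketch below uses a hypothetical `SKETCH_ENABLE_ORIGIN` macro and hypothetical `OriginBase`, `NodeRef`, and `TrackedLoc` types. The derived class is written once; which public functions it exposes depends only on which definition of the base is compiled in.

```cpp
#include <cassert>
#include <vector>

// Hypothetical build-time switch standing in for the origin-tracking macro.
#ifndef SKETCH_ENABLE_ORIGIN
#define SKETCH_ENABLE_ORIGIN 1
#endif

#if SKETCH_ENABLE_ORIGIN
// Tracking build: the base carries data and a public accessor.
struct OriginBase {
  std::vector<int> Traces; // stand-in for recorded stack traces
  explicit OriginBase(bool Collect) {
    if (Collect)
      Traces.push_back(0);
  }
  const std::vector<int> &getOriginStackTraces() const { return Traces; }
};
#else
// Non-tracking build: same constructor shape, no members, no accessors.
struct OriginBase {
  explicit OriginBase(bool) {}
};
#endif

// Stand-in for the metadata reference base class.
struct NodeRef {
  const void *Node;
  explicit NodeRef(const void *N) : Node(N) {}
};

// Written once for both builds: the accessor is inherited when tracking is
// on and simply does not exist when it is off.
struct TrackedLoc : NodeRef, OriginBase {
  explicit TrackedLoc(const void *N) : NodeRef(N), OriginBase(N == nullptr) {}
};
```

The alternative of holding the origin as a member would need a forwarding accessor in the derived class, wrapped in its own `#if`, for every function the origin type exposes.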

https://github.com/llvm/llvm-project/pull/143592


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (PR #143592)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer closed https://github.com/llvm/llvm-project/pull/143592


[llvm-branch-commits] [llvm] [DLCov] Origin-Tracking: Add debugify support (PR #143594)

2025-07-02 Thread Stephen Tozer via llvm-branch-commits

https://github.com/SLTozer edited https://github.com/llvm/llvm-project/pull/143594