Author: royitaqi
Date: 2025-07-16T07:57:13-07:00
New Revision: a13712ed88cf6fc37d3789d4c3b54ffdd1a05a1d

URL: 
https://github.com/llvm/llvm-project/commit/a13712ed88cf6fc37d3789d4c3b54ffdd1a05a1d
DIFF: 
https://github.com/llvm/llvm-project/commit/a13712ed88cf6fc37d3789d4c3b54ffdd1a05a1d.diff

LOG: [lldb] Fix a crash in lldb-server during RemoveSoftwareBreakpoint() 
(#148738)

# Lldb-server crash

We have seen stacks like the following in lldb-server core dumps:
```
[
  "__GI___pthread_kill at pthread_kill.c:46",
  "__GI_raise at raise.c:26",
  "__GI_abort at abort.c:100",
  "__assert_fail_base at assert.c:92",
  "__GI___assert_fail at assert.c:101",
  "lldb_private::NativeProcessProtocol::RemoveSoftwareBreakpoint(unsigned long) 
at /redacted/lldb-server:0"
]
```

# Hypothesis of root cause

In `NativeProcessProtocol::RemoveSoftwareBreakpoint()`
([code](https://github.com/llvm/llvm-project/blob/19b2dd9d798c124406b0124a1b8debb711675281/lldb/source/Host/common/NativeProcessProtocol.cpp#L359-L423)),
a `ref_count` is asserted and reduced. If it becomes zero, the code
first go through a series of memory reads and writes to remove the
breakpoint trap opcode and to restore the original process code, then,
if everything goes fine, removes the entry from the map
`m_software_breakpoints` at the end of the function.

However, if any of the validations for the above reads and writes goes
wrong, the code returns an error early, skipping the removal of the
entry. This leaves the entry behind with a `ref_count` of zero.

The next call to `NativeProcessProtocol::RemoveSoftwareBreakpoint()` for
the same breakpoint[*] would violate the assertion about `ref_count > 0`
([here](https://github.com/llvm/llvm-project/blob/19b2dd9d798c124406b0124a1b8debb711675281/lldb/source/Host/common/NativeProcessProtocol.cpp#L365)),
which would cause a crash.

[*] We haven't found a *regular* way to repro such a next call in lldb
or lldb-dap. This is because both of them remove the breakpoint from
their internal list when they get any response from the lldb-server (OK
or error). Asking the client to delete the breakpoint a second time
doesn't trigger the client to send the `$z` gdb packet to lldb-server.
We are able to trigger the crash by sending the `$z` packet directly,
see "Manual test" below.

# Fix

Lift the removal of the map entry to be immediately after the decrement
of `ref_count`, before the early returns. This ensures that the asserted
case will never happen. The validation errors can still happen, and
whether they happen or not, the breakpoint has been removed from the
perspective of the lldb-server (same as that of lldb and lldb-dap).


# Manual test & unit test

See PR.

Added: 
    

Modified: 
    lldb/source/Host/common/NativeProcessProtocol.cpp
    lldb/unittests/Host/NativeProcessProtocolTest.cpp

Removed: 
    


################################################################################
diff  --git a/lldb/source/Host/common/NativeProcessProtocol.cpp 
b/lldb/source/Host/common/NativeProcessProtocol.cpp
index 405acbb5662d6..196f54b93538d 100644
--- a/lldb/source/Host/common/NativeProcessProtocol.cpp
+++ b/lldb/source/Host/common/NativeProcessProtocol.cpp
@@ -366,12 +366,19 @@ Status 
NativeProcessProtocol::RemoveSoftwareBreakpoint(lldb::addr_t addr) {
   if (--it->second.ref_count > 0)
     return Status();
 
+  // Remove the entry from m_software_breakpoints rightaway, so that we don't
+  // leave behind an entry with ref_count == 0 in case one of the following
+  // conditions returns an error. The breakpoint is moved so that it can be
+  // accessed below.
+  SoftwareBreakpoint bkpt = std::move(it->second);
+  m_software_breakpoints.erase(it);
+
   // This is the last reference. Let's remove the breakpoint.
   Status error;
 
   // Clear a software breakpoint instruction
-  llvm::SmallVector<uint8_t, 4> curr_break_op(
-      it->second.breakpoint_opcodes.size(), 0);
+  llvm::SmallVector<uint8_t, 4> curr_break_op(bkpt.breakpoint_opcodes.size(),
+                                              0);
 
   // Read the breakpoint opcode
   size_t bytes_read = 0;
@@ -382,10 +389,10 @@ Status 
NativeProcessProtocol::RemoveSoftwareBreakpoint(lldb::addr_t addr) {
         "addr=0x%" PRIx64 ": tried to read %zu bytes but only read %zu", addr,
         curr_break_op.size(), bytes_read);
   }
-  const auto &saved = it->second.saved_opcodes;
+  const auto &saved = bkpt.saved_opcodes;
   // Make sure the breakpoint opcode exists at this address
-  if (llvm::ArrayRef(curr_break_op) != it->second.breakpoint_opcodes) {
-    if (curr_break_op != it->second.saved_opcodes)
+  if (llvm::ArrayRef(curr_break_op) != bkpt.breakpoint_opcodes) {
+    if (curr_break_op != bkpt.saved_opcodes)
       return Status::FromErrorString(
           "Original breakpoint trap is no longer in memory.");
     LLDB_LOG(log,
@@ -418,7 +425,6 @@ Status 
NativeProcessProtocol::RemoveSoftwareBreakpoint(lldb::addr_t addr) {
                llvm::make_range(saved.begin(), saved.end()));
   }
 
-  m_software_breakpoints.erase(it);
   return Status();
 }
 

diff  --git a/lldb/unittests/Host/NativeProcessProtocolTest.cpp 
b/lldb/unittests/Host/NativeProcessProtocolTest.cpp
index a48e67c9213da..91c4fd69d6e54 100644
--- a/lldb/unittests/Host/NativeProcessProtocolTest.cpp
+++ b/lldb/unittests/Host/NativeProcessProtocolTest.cpp
@@ -73,6 +73,97 @@ TEST(NativeProcessProtocolTest, SetBreakpointFailVerify) {
                     llvm::Failed());
 }
 
+TEST(NativeProcessProtocolTest, RemoveSoftwareBreakpoint) {
+  NiceMock<MockDelegate> DummyDelegate;
+  MockProcess<NativeProcessProtocol> Process(DummyDelegate,
+                                             ArchSpec("x86_64-pc-linux"));
+  auto Trap = cantFail(Process.GetSoftwareBreakpointTrapOpcode(1));
+  auto Original = std::vector<uint8_t>{0xbb};
+
+  // Set up a breakpoint.
+  {
+    InSequence S;
+    EXPECT_CALL(Process, ReadMemory(0x47, 1))
+        .WillOnce(Return(ByMove(Original)));
+    EXPECT_CALL(Process, WriteMemory(0x47, Trap)).WillOnce(Return(ByMove(1)));
+    EXPECT_CALL(Process, ReadMemory(0x47, 1)).WillOnce(Return(ByMove(Trap)));
+    EXPECT_THAT_ERROR(Process.SetBreakpoint(0x47, 0, false).ToError(),
+                      llvm::Succeeded());
+  }
+
+  // Remove the breakpoint for the first time. This should remove the 
breakpoint
+  // from m_software_breakpoints.
+  //
+  // Should succeed.
+  {
+    InSequence S;
+    EXPECT_CALL(Process, ReadMemory(0x47, 1)).WillOnce(Return(ByMove(Trap)));
+    EXPECT_CALL(Process, WriteMemory(0x47, llvm::ArrayRef(Original)))
+        .WillOnce(Return(ByMove(1)));
+    EXPECT_CALL(Process, ReadMemory(0x47, 1))
+        .WillOnce(Return(ByMove(Original)));
+    EXPECT_THAT_ERROR(Process.RemoveBreakpoint(0x47, false).ToError(),
+                      llvm::Succeeded());
+  }
+
+  // Remove the breakpoint for the second time.
+  //
+  // Should fail. None of the ReadMemory() or WriteMemory() should be called,
+  // because the function should early return when seeing that the breakpoint
+  // isn't in m_software_breakpoints.
+  {
+    EXPECT_CALL(Process, ReadMemory(_, _)).Times(0);
+    EXPECT_CALL(Process, WriteMemory(_, _)).Times(0);
+    EXPECT_THAT_ERROR(Process.RemoveBreakpoint(0x47, false).ToError(),
+                      llvm::Failed());
+  }
+}
+
+TEST(NativeProcessProtocolTest, RemoveSoftwareBreakpointMemoryError) {
+  NiceMock<MockDelegate> DummyDelegate;
+  MockProcess<NativeProcessProtocol> Process(DummyDelegate,
+                                             ArchSpec("x86_64-pc-linux"));
+  auto Trap = cantFail(Process.GetSoftwareBreakpointTrapOpcode(1));
+  auto Original = std::vector<uint8_t>{0xbb};
+  auto SomethingElse = std::vector<uint8_t>{0xaa};
+
+  // Set up a breakpoint.
+  {
+    InSequence S;
+    EXPECT_CALL(Process, ReadMemory(0x47, 1))
+        .WillOnce(Return(ByMove(Original)));
+    EXPECT_CALL(Process, WriteMemory(0x47, Trap)).WillOnce(Return(ByMove(1)));
+    EXPECT_CALL(Process, ReadMemory(0x47, 1)).WillOnce(Return(ByMove(Trap)));
+    EXPECT_THAT_ERROR(Process.SetBreakpoint(0x47, 0, false).ToError(),
+                      llvm::Succeeded());
+  }
+
+  // Remove the breakpoint for the first time, with an unexpected value read by
+  // the first ReadMemory(). This should cause an early return, with the
+  // breakpoint removed from m_software_breakpoints.
+  //
+  // Should fail.
+  {
+    InSequence S;
+    EXPECT_CALL(Process, ReadMemory(0x47, 1))
+        .WillOnce(Return(ByMove(SomethingElse)));
+    EXPECT_THAT_ERROR(Process.RemoveBreakpoint(0x47, false).ToError(),
+                      llvm::Failed());
+  }
+
+  // Remove the breakpoint for the second time.
+  //
+  // Should fail. None of the ReadMemory() or WriteMemory() should be called,
+  // because the function should early return when seeing that the breakpoint
+  // isn't in m_software_breakpoints.
+  {
+    EXPECT_CALL(Process, ReadMemory(_, _)).Times(0);
+    EXPECT_CALL(Process, WriteMemory(_, _)).Times(0);
+    EXPECT_THAT_ERROR(Process.RemoveBreakpoint(0x47, false).ToError(),
+                      llvm::Failed());
+  }
+}
+
 TEST(NativeProcessProtocolTest, ReadMemoryWithoutTrap) {
   NiceMock<MockDelegate> DummyDelegate;
   MockProcess<NativeProcessProtocol> Process(DummyDelegate,
@@ -146,4 +237,4 @@ TEST(NativeProcessProtocolTest, 
ReadCStringFromMemory_CrossPageBoundary) {
                                                      bytes_read),
                        llvm::HasValue(llvm::StringRef("hello")));
   EXPECT_EQ(bytes_read, 6UL);
-}
\ No newline at end of file
+}


        
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to