0x1eaf created this revision.
0x1eaf added reviewers: sammccall, ilya-biryukov, nridge.
0x1eaf added projects: clang, clang-tools-extra.
Herald added subscribers: usaxena95, kadircet, arphaman, javed.absar, mgorny.
0x1eaf requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.

Motivation:

At the moment it is hard to attribute a clangd crash to a specific request out 
of all in-flight requests that might be processed concurrently. So before we 
can act on production clangd crashes, we have to do quite some digging through 
the log tables populated by our in-house VSCode extension or sometimes even 
directly reach out to the affected developer. Having all the details needed to 
reproduce a crash printed alongside its stack trace has a potential to save us 
quite some time, that could better be spent on fixing the actual problems.

Implementation approach:

- introduce `ThreadCrashReporter` class that allows to set a temporary signal 
handler for the current thread
- follow RAII pattern to simplify printing context for crashes occurring within 
a particular scope
- hold `std::function` as a handler to allow capturing context to print
- set local `ThreadCrashReporter` within `JSONTransport::loop()` to print 
request JSON for main thread crashes, and in `ASTWorker::run()` to print the 
file paths, arguments and contents for worker thread crashes

`ThreadCrashReporter` currently allows only one active handler per thread, but 
the approach can be extended to support stacked handlers printing context 
incrementally.

Example output for main thread crashes:

  ...
  #15 0x00007f7ddc819493 __libc_start_main (/lib64/libc.so.6+0x23493)
  #16 0x000000000249775e _start 
(/home/emmablink/local/llvm-project/build/bin/clangd+0x249775e)
  Signalled while processing message:
  {"jsonrpc": "2.0", "method": "textDocument/didOpen", "params": 
{"textDocument": {"uri": "file:///home/emmablink/test.cpp", "languageId": 
"cpp", "version": 1, "text": "template <typename>\nclass Bar {\n  Bar<int> 
*variables_to_modify;\n  foo() {\n    for (auto *c : *variables_to_modify)\n    
  delete c;\n  }\n};\n"}}}


Example output for AST worker crashes:

  ...
  #41 0x00007fb18304c14a start_thread pthread_create.c:0:0
  #42 0x00007fb181bfcdc3 clone (/lib64/libc.so.6+0xfcdc3)
  Signalled during AST action:
  Filename: test.cpp
  Directory: /home/emmablink
  Command Line: /usr/bin/clang 
-resource-dir=/data/users/emmablink/llvm-project/build/lib/clang/14.0.0 -- 
/home/emmablink/test.cpp
  Version: 1
  Contents:
  template <typename>
  class Bar {
    Bar<int> *variables_to_modify;
    foo() {
      for (auto *c : *variables_to_modify)
        delete c;
    }
  };


Testing:

The unit test covers the thread-localitity and nesting aspects of 
`ThreadCrashReporter`. There might be way to set up a lit-based integration 
test that would spawn clangd, send a message to it, signal it immediately and 
check the standard output, but this might be prone to raceconditions.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109772

Files:
  clang-tools-extra/clangd/JSONTransport.cpp
  clang-tools-extra/clangd/TUScheduler.cpp
  clang-tools-extra/clangd/support/CMakeLists.txt
  clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
  clang-tools-extra/clangd/support/ThreadCrashReporter.h
  clang-tools-extra/clangd/tool/ClangdMain.cpp
  clang-tools-extra/clangd/unittests/CMakeLists.txt
  clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp

Index: clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp
===================================================================
--- /dev/null
+++ clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp
@@ -0,0 +1,71 @@
+///===- ThreadCrashReporterTests.cpp - Thread local signal handling tests -===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "support/ThreadCrashReporter.h"
+#include "support/Threading.h"
+#include "llvm/Support/Signals.h"
+#include "gtest/gtest.h"
+#include <csignal>
+#include <string>
+
+namespace clang {
+namespace clangd {
+
+namespace {
+
+static void infoHandler() { ThreadCrashReporter::runCrashHandlers(nullptr); }
+
+TEST(ThreadCrashReporterTest, All) {
+  llvm::sys::SetInfoSignalFunction(&infoHandler);
+  AsyncTaskRunner Runner;
+  auto SignalCurrentThread = []() { raise(SIGUSR1); };
+  auto SignalAnotherThread = [&]() {
+    Runner.runAsync("signal another thread", SignalCurrentThread);
+    Runner.wait();
+  };
+
+  bool Called;
+  {
+    ThreadCrashReporter ScopedReporter([&Called]() { Called = true; });
+    // Check handler gets called when a signal gets delivered to the current
+    // thread.
+    Called = false;
+    SignalCurrentThread();
+    EXPECT_TRUE(Called);
+
+    // Check handler does not get called when another thread gets signalled.
+    Called = false;
+    SignalAnotherThread();
+    EXPECT_FALSE(Called);
+  }
+  // Check handler does not get called when the reporter object goes out of
+  // scope.
+  Called = false;
+  SignalCurrentThread();
+  EXPECT_FALSE(Called);
+
+  std::string Order = "";
+  {
+    ThreadCrashReporter ScopedReporter([&Order] { Order.push_back('a'); });
+    {
+      ThreadCrashReporter ScopedReporter([&Order] { Order.push_back('b'); });
+      SignalCurrentThread();
+    }
+    // Check that handlers are called in FIFO order.
+    EXPECT_EQ(Order, "ab");
+
+    // Check that current handler is the only one after the nested scope is
+    // over.
+    SignalCurrentThread();
+    EXPECT_EQ(Order, "aba");
+  }
+}
+
+} // namespace
+} // namespace clangd
+} // namespace clang
Index: clang-tools-extra/clangd/unittests/CMakeLists.txt
===================================================================
--- clang-tools-extra/clangd/unittests/CMakeLists.txt
+++ clang-tools-extra/clangd/unittests/CMakeLists.txt
@@ -88,6 +88,7 @@
   TestIndex.cpp
   TestTU.cpp
   TestWorkspace.cpp
+  ThreadCrashReporterTests.cpp
   TidyProviderTests.cpp
   TypeHierarchyTests.cpp
   URITests.cpp
Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===================================================================
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -24,6 +24,7 @@
 #include "refactor/Rename.h"
 #include "support/Path.h"
 #include "support/Shutdown.h"
+#include "support/ThreadCrashReporter.h"
 #include "support/ThreadsafeFS.h"
 #include "support/Trace.h"
 #include "clang/Format/Format.h"
@@ -679,6 +680,7 @@
 
   llvm::InitializeAllTargetInfos();
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
+  llvm::sys::AddSignalHandler(&ThreadCrashReporter::runCrashHandlers, nullptr);
   llvm::sys::SetInterruptFunction(&requestShutdown);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
     OS << versionString() << "\n"
Index: clang-tools-extra/clangd/support/ThreadCrashReporter.h
===================================================================
--- /dev/null
+++ clang-tools-extra/clangd/support/ThreadCrashReporter.h
@@ -0,0 +1,48 @@
+//===--- ThreadCrashReporter.h - Thread local signal handling ----*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_SUPPORT_THREAD_CRASH_REPORTER_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_SUPPORT_THREAD_CRASH_REPORTER_H
+
+#include <functional>
+
+namespace clang {
+namespace clangd {
+
+/// Allows setting per-thread abort/kill signal callbacks, to print additional
+/// information about the crash depending on which thread got signalled.
+class ThreadCrashReporter {
+public:
+  using SignalCallback = std::function<void(void)>;
+
+  /// Copies the std::function and sets the copy as current thread-local
+  /// callback. Asserts if the current thread's callback is already set.
+  ThreadCrashReporter(const SignalCallback &ThreadLocalCallback);
+  /// Resets the currrent thread's callback to nullptr.
+  ~ThreadCrashReporter();
+
+  /// Moves are disabled to ease nesting and escaping considerations.
+  ThreadCrashReporter(ThreadCrashReporter &&RHS) = delete;
+  ThreadCrashReporter(const ThreadCrashReporter &) = delete;
+  ThreadCrashReporter &operator=(ThreadCrashReporter &&) = delete;
+  ThreadCrashReporter &operator=(const ThreadCrashReporter &) = delete;
+
+  /// Callback to install via sys::AddSignalHandler(), the argument is ignored.
+  /// Any signal filtering is the responsibility of the caller.
+  static void runCrashHandlers(void *);
+
+private:
+  SignalCallback Callback; // nullptr when invalidated
+  ThreadCrashReporter *Previous;
+  ThreadCrashReporter *Next;
+};
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
===================================================================
--- /dev/null
+++ clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
@@ -0,0 +1,65 @@
+//===--- ThreadCrashReporter.cpp - Thread local signal handling --*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "support/ThreadCrashReporter.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadLocal.h"
+
+namespace clang {
+namespace clangd {
+
+static thread_local ThreadCrashReporter *CurrentReporter;
+
+void ThreadCrashReporter::runCrashHandlers(void *) {
+  // No signal handling is done here on top of what AddSignalHandler() does:
+  // on Windows the signal handling is implmented via
+  // SetUnhandledExceptionFilter() which is thread-directed, and on Unix
+  // platforms the handlers are only called for KillSigs out of which only
+  // SIGQUIT seems to be process-directed and would be delivered to any thread
+  // that is not blocking it, but if the thread it gets delivered to has a
+  // ThreadCrashReporter installed during the interrupt — it seems reasonable to
+  // let it run and print the thread's context information.
+
+  // Traverse to the top of the reporter stack.
+  ThreadCrashReporter *Reporter = CurrentReporter;
+  while (Reporter && Reporter->Previous) {
+    Reporter = Reporter->Previous;
+  }
+  // Call the reporters in FIFO order.
+  while (Reporter) {
+    Reporter->Callback();
+    Reporter = Reporter->Next;
+  }
+}
+
+ThreadCrashReporter::ThreadCrashReporter(
+    const SignalCallback &ThreadLocalCallback)
+    : Callback(ThreadLocalCallback), Previous(nullptr), Next(nullptr) {
+  if (CurrentReporter) {
+    CurrentReporter->Next = this;
+  }
+  this->Previous = CurrentReporter;
+  CurrentReporter = this;
+  // Don't reorder subsequent operations: whatever comes after might crash and
+  // we want the the crash handler to see the reporter values we just set.
+  std::atomic_signal_fence(std::memory_order_seq_cst);
+}
+
+ThreadCrashReporter::~ThreadCrashReporter() {
+  assert(CurrentReporter == this);
+  if (this->Previous) {
+    this->Previous->Next = nullptr;
+  }
+  CurrentReporter = this->Previous;
+  // Don't reorder subsequent operations: whatever comes after might crash and
+  // we want the the crash handler to see the reporter values we just set.
+  std::atomic_signal_fence(std::memory_order_seq_cst);
+}
+
+} // namespace clangd
+} // namespace clang
Index: clang-tools-extra/clangd/support/CMakeLists.txt
===================================================================
--- clang-tools-extra/clangd/support/CMakeLists.txt
+++ clang-tools-extra/clangd/support/CMakeLists.txt
@@ -24,6 +24,7 @@
   MemoryTree.cpp
   Path.cpp
   Shutdown.cpp
+  ThreadCrashReporter.cpp
   Threading.cpp
   ThreadsafeFS.cpp
   Trace.cpp
Index: clang-tools-extra/clangd/TUScheduler.cpp
===================================================================
--- clang-tools-extra/clangd/TUScheduler.cpp
+++ clang-tools-extra/clangd/TUScheduler.cpp
@@ -58,6 +58,7 @@
 #include "support/Logger.h"
 #include "support/MemoryTree.h"
 #include "support/Path.h"
+#include "support/ThreadCrashReporter.h"
 #include "support/Threading.h"
 #include "support/Trace.h"
 #include "clang/Frontend/CompilerInvocation.h"
@@ -76,6 +77,7 @@
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Threading.h"
+#include "llvm/Support/raw_ostream.h"
 #include <algorithm>
 #include <atomic>
 #include <chrono>
@@ -551,6 +553,9 @@
   TUScheduler::FileStats stats() const;
   bool isASTCached() const;
 
+  /// For calling from ThreadCrashReporter callback.
+  void dumpCurrentRequest(llvm::raw_ostream &OS) const;
+
 private:
   // Details of an update request that are relevant to scheduling.
   struct UpdateType {
@@ -1281,6 +1286,8 @@
         Status.ASTActivity.Name = CurrentRequest->Name;
       });
       WithContext WithProvidedContext(ContextProvider(FileName));
+      ThreadCrashReporter ScopedReporter(
+          [this]() { dumpCurrentRequest(llvm::errs()); });
       CurrentRequest->Action();
     }
 
@@ -1371,6 +1378,28 @@
   llvm_unreachable("Unknown WantDiagnostics");
 }
 
+void ASTWorker::dumpCurrentRequest(llvm::raw_ostream &OS) const {
+  auto &ActionName = CurrentRequest ? CurrentRequest->Name : "";
+  OS << "Signalled during AST action: " << ActionName << "\n";
+  auto &Command = FileInputs.CompileCommand;
+  OS << "  Filename: " << Command.Filename << "\n";
+  OS << "  Directory: " << Command.Directory << "\n";
+  OS << "  Command Line:";
+  for (auto &Arg : Command.CommandLine) {
+    OS << " " << Arg;
+  }
+  OS << "\n";
+  OS << "  Version: " << FileInputs.Version << "\n";
+  // Avoid flooding the terminal with source code by default, but allow clients
+  // to opt in. Use an env var to preserve backwards compatibility of the
+  // command line interface, while allowing it to be set outside the clangd
+  // launch site for more flexibility.
+  if (getenv("CLANGD_CRASH_DUMP_SOURCE")) {
+    OS << "  Contents:\n";
+    OS << FileInputs.Contents << "\n";
+  }
+}
+
 bool ASTWorker::blockUntilIdle(Deadline Timeout) const {
   auto WaitUntilASTWorkerIsIdle = [&] {
     std::unique_lock<std::mutex> Lock(Mutex);
@@ -1606,6 +1635,8 @@
                Ctx = Context::current().derive(kFileBeingProcessed,
                                                std::string(File)),
                Action = std::move(Action), this]() mutable {
+    ThreadCrashReporter ScopedReporter(
+        [&Worker]() { Worker->dumpCurrentRequest(llvm::errs()); });
     std::shared_ptr<const PreambleData> Preamble;
     if (Consistency == PreambleConsistency::Stale) {
       // Wait until the preamble is built for the first time, if preamble
Index: clang-tools-extra/clangd/JSONTransport.cpp
===================================================================
--- clang-tools-extra/clangd/JSONTransport.cpp
+++ clang-tools-extra/clangd/JSONTransport.cpp
@@ -10,9 +10,11 @@
 #include "support/Cancellation.h"
 #include "support/Logger.h"
 #include "support/Shutdown.h"
+#include "support/ThreadCrashReporter.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/Support/Errno.h"
 #include "llvm/Support/Error.h"
+#include "llvm/Support/Threading.h"
 #include <system_error>
 
 namespace clang {
@@ -109,6 +111,11 @@
         return llvm::errorCodeToError(
             std::error_code(errno, std::system_category()));
       if (readRawMessage(JSON)) {
+        ThreadCrashReporter ScopedReporter([&JSON]() {
+          auto &OS = llvm::errs();
+          OS << "Signalled while processing message:\n";
+          OS << JSON << "\n";
+        });
         if (auto Doc = llvm::json::parse(JSON)) {
           vlog(Pretty ? "<<< {0:2}\n" : "<<< {0}\n", *Doc);
           if (!handleMessage(std::move(*Doc), Handler))
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to