llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-coroutines

Author: Adrian Vogelsgesang (vogelsgesang)

<details>
<summary>Changes</summary>

This commit is a major overhaul of the documentation on debugging C++ coroutines with the following goals:

* Make it more accessible to casual C++ programmers, i.e. non-toolchain developers. Move the low-level details around ABI further down, and instead start with real-life examples and copy-paste-friendly code, first.
* Cover LLDB in addition to GDB. Provide copy-pasteable scripts for LLDB and not only GDB.
* Cover additional topics, such as:
  * single-stepping into a coroutine
  * using `__builtin_return_address` for tracking suspension points (inspired by Folly's blog series on coroutine debugging)
* Document LLDB's support for devirtualization of `std::coroutine_handle`, both from an end user perspective as well as its internal implementation

---

Patch is 64.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142651.diff

4 Files Affected:

- (modified) clang/docs/DebuggingCoroutines.rst (+844-459)
- (added) clang/docs/coro-async-task-continuations.png ()
- (added) clang/docs/coro-generator-suspended.png ()
- (added) clang/docs/coro-generator-variables.png ()

``````````diff
diff --git a/clang/docs/DebuggingCoroutines.rst b/clang/docs/DebuggingCoroutines.rst
index 80df321340724..b33e2cda91a30 100644
--- a/clang/docs/DebuggingCoroutines.rst
+++ b/clang/docs/DebuggingCoroutines.rst
@@ -8,470 +8,966 @@ Debugging C++ Coroutines
 Introduction
 ============
 
-For performance and other architectural reasons, the C++ Coroutines feature in
-the Clang compiler is implemented in two parts of the compiler. Semantic
-analysis is performed in Clang, and Coroutine construction and optimization
-takes place in the LLVM middle-end.
+Coroutines in C++ were introduced in C++20, and their user experience for
+debugging them can still be challenging. This document guides you how to most
+efficiently debug coroutines and how to navigate existing shortcomings in
+debuggers and compilers.
+
+Coroutines are generally used either as generators or for asynchronous
+programming. In this document, we will discuss both use cases. Even if you are
+using coroutines for asynchronous programming, you should still read the
+generators section, as it will introduce foundational debugging techniques also
+applicable to the debugging of asynchronous programming.
+
+Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
+still improving their support for coroutines. As such, we recommend to use the
+latest available version of your toolchain.
+
+This document focuses on clang and lldb. The screenshots show
+[lldb-dap](https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.lldb-dap)
+in combination with VS Code. The same techniques can also be used in other
+IDEs.
+
+Debugging clang-compiled binaries with gdb is possible, but requires more
+scripting. This guide comes with a basic GDB script for coroutine debugging.
+
+This guide will first showcase the more polished, bleeding-edge experience, but
+will also show you how to debug coroutines with older toolchains. In general,
+the older your toolchain, the deeper you will have to dive into the
+implementation details of coroutines (such as their ABI). The further down in
+this document, the more low-level, technical the content will become. If you
+are on an up-to-date toolchain, you will hopefully be able to stop reading
+earlier.
+
+Debugging generators
+====================
+
+The first major use case for coroutines in C++ are generators, i.e. functions
+which can produce values via ``co_yield``. Values are produced lazily,
+on-demand. For that purpose, every time a new value is requested the coroutine
+gets resumed. As soon as it reaches a ``co_yield`` and thereby returns the
+requested value, the coroutine is suspended again.
+
+This logic is encapsulated in a ``generator`` type similar to
 
-However, this design forces us to generate insufficient debugging information.
-Typically, the compiler generates debug information in the Clang frontend, as
-debug information is highly language specific. However, this is not possible
-for Coroutine frames because the frames are constructed in the LLVM middle-end.
-
-To mitigate this problem, the LLVM middle end attempts to generate some debug
-information, which is unfortunately incomplete, since much of the language
-specific information is missing in the middle end.
+.. code-block:: c++
 
-This document describes how to use this debug information to better debug
-coroutines.
+  // generator.hpp
+  #include <coroutine>
 
-Terminology
-===========
+  // `generator` is a stripped down, minimal generator type.
+  template<typename T>
+  struct generator {
+    struct promise_type {
+      T current_value{};
 
-Due to the recent nature of C++20 Coroutines, the terminology used to describe
-the concepts of Coroutines is not settled. This section defines a common,
-understandable terminology to be used consistently throughout this document.
+      auto get_return_object() {
+        return std::coroutine_handle<promise_type>::from_promise(*this);
+      }
+      auto initial_suspend() { return std::suspend_always(); }
+      auto final_suspend() noexcept { return std::suspend_always(); }
+      auto return_void() { return std::suspend_always(); }
+      void unhandled_exception() { __builtin_unreachable(); }
+      auto yield_value(T v) {
+        current_value = v;
+        return std::suspend_always();
+      }
+    };
 
-coroutine type
--------------- 
+    generator(std::coroutine_handle<promise_type> h) : hdl(h) { hdl.resume(); }
+    ~generator() { hdl.destroy(); }
 
-A `coroutine function` is any function that contains any of the Coroutine
-Keywords `co_await`, `co_yield`, or `co_return`. A `coroutine type` is a
-possible return type of one of these `coroutine functions`. `Task` and
-`Generator` are commonly referred to coroutine types.
+    generator<int>& operator++() { hdl.resume(); return *this; } // resume the coroutine
+    int operator*() const { return hdl.promise().current_value; }
 
-coroutine
----------
+  private:
+    std::coroutine_handle<promise_type> hdl;
+  };
 
-By technical definition, a `coroutine` is a suspendable function. However,
-programmers typically use `coroutine` to refer to an individual instance.
-For example:
+We can then use this ``generator`` class to print the Fibonacci sequence:
 
 .. code-block:: c++
 
-  std::vector<Task> Coros; // Task is a coroutine type.
-  for (int i = 0; i < 3; i++)
-    Coros.push_back(CoroTask()); // CoroTask is a coroutine function, which
-                                 // would return a coroutine type 'Task'.
+  #include "generator.hpp"
+  #include <iostream>
 
-In practice, we typically say "`Coros` contains 3 coroutines" in the above
-example, though this is not strictly correct. More technically, this should
-say "`Coros` contains 3 coroutine instances" or "Coros contains 3 coroutine
-objects."
+  generator<int> fibonacci() {
+    co_yield 0;
+    int prev = 0;
+    co_yield 1;
+    int current = 1;
+    while (true) {
+      int next = current + prev;
+      co_yield next;
+      prev = current;
+      current = next;
+    }
+  }
 
-In this document, we follow the common practice of using `coroutine` to refer
-to an individual `coroutine instance`, since the terms `coroutine instance` and
-`coroutine object` aren't sufficiently defined in this case.
+  template<typename T>
+  void print10Elements(generator<T>& gen) {
+    for (unsigned i = 0; i < 10; ++i) {
+      std::cerr << *gen << "\n";
+      ++gen;
+    }
+  }
 
-coroutine frame
----------------
+  int main() {
+    std::cerr << "Fibonacci sequence - here we go\n";
+    generator<int> fib = fibonacci();
+    for (unsigned i = 0; i < 5; ++i) {
+      ++fib;
+    }
+    print10Elements(fib);
+  }
 
-The C++ Standard uses `coroutine state` to describe the allocated storage. In
-the compiler, we use `coroutine frame` to describe the generated data structure
-that contains the necessary information.
+To compile this code, use ``clang++ --std=c++23 generator-example.cpp -g``.
 
-The structure of coroutine frames
-=================================
+Breakpoints inside the generators
+---------------------------------
 
-The structure of coroutine frames is defined as:
+We can set breakpoints inside coroutines just as we set them in regular
+functions. For VS Code, that means clicking next the line number in the editor.
+In the ``lldb`` CLI or in ``gdb``, you can use ``b`` to set a breakpoint.
 
-.. code-block:: c++
+Inspecting variables in a coroutine
+-----------------------------------
 
-  struct {
-    void (*__r)(); // function pointer to the `resume` function
-    void (*__d)(); // function pointer to the `destroy` function
-    promise_type; // the corresponding `promise_type`
-    ... // Any other needed information
-  }
+If you hit a breakpoint inside the ``fibonacci`` function, you should be able
+to inspect all local variables (``prev```, ``current```, ``next``) just like in
+a regular function.
 
-In the debugger, the function's name is obtainable from the address of the
-function. And the name of `resume` function is equal to the name of the
-coroutine function. So the name of the coroutine is obtainable once the
-address of the coroutine is known.
+.. image:: ./coro-generator-variables.png
 
-Print promise_type
-==================
+Note the two additional variables ``__promise`` and ``__coro_frame``. Those
+show the internal state of the coroutine. They are not relevant for our
+generator example, but will be relevant for asynchronous programming described
+in the next section.
 
-Every coroutine has a `promise_type`, which defines the behavior
-for the corresponding coroutine. In other words, if two coroutines have the
-same `promise_type`, they should behave in the same way.
-To print a `promise_type` in a debugger when stopped at a breakpoint inside a
-coroutine, printing the `promise_type` can be done by:
+Stepping out of a coroutine
+---------------------------
 
-.. parsed-literal::
+When single-stepping, you will notice that the debugger will leave the
+``fibonacci`` function as soon as you hit a ``co_yield`` statement. You might
+find yourself inside some standard library code. After stepping out of the
+library code, you will be back in the ``main`` function.
 
-  print __promise
+Stepping into a coroutine
+-------------------------
 
-It is also possible to print the `promise_type` of a coroutine from the address
-of the coroutine frame. For example, if the address of a coroutine frame is
-0x416eb0, and the type of the `promise_type` is `task::promise_type`, printing
-the `promise_type` can be done by:
+If you stop at ``++fib`` and try to step into the generator, you will first
+find yourself inside ``operator++``. Stepping into the ``handle.resume()`` will
+not work by default.
 
-.. parsed-literal::
+This is because lldb does not step int functions from the standard library by
+default. To make this work, you first need to run ``settings set
+target.process.thread.step-avoid-regexp ""``. You can do so from the "Debug
+Console" towards the bottom of the screen. With that setting change, you can
+step through ``coroutine_handle::resume`` and into your generator.
 
-  print (task::promise_type)*(0x416eb0+0x10)
+You might find yourself at the top of the coroutine at first, instead of at
+your previous suspension point. In that case, single-step and you will arrive
+at the previously suspended ``co_yield`` statement.
 
-This is possible because the `promise_type` is guaranteed by the ABI to be at a
-16 bit offset from the coroutine frame.
-Note that there is also an ABI independent method:
+Inspecting a suspended coroutine
+--------------------------------
 
-.. parsed-literal::
+The ``print10Elements`` function receives an opaque ``generator`` type. Let's
+assume we are suspended at the ``++gen;`` line, and want to inspect the
+generator and its internal state.
 
-  print std::coroutine_handle<task::promise_type>::from_address((void*)0x416eb0).promise()
+To do so, we can simply look into the ``gen.hdl`` variable. LLDB comes with a
+pretty printer for ``std::coroutine_handle`` which will show us the internal
+state of the coroutine. For GDB, you will have to use the ``show-coro-frame``
+command provided by the :ref:`GDB Debugger Script`.
 
-The functions `from_address(void*)` and `promise()` are often small enough to
-be removed during optimization, so this method may not be possible.
+.. image:: ./coro-generator-suspended.png
 
-Print coroutine frames
-======================
+We can see two function pointers ``resume`` and ``destroy``. These pointers
+point to the resume / destroy functions. By inspecting those function pointers,
+we can see that our ``generator`` is actually backed by our ``fibonacci``
+coroutine. When using VS Code + lldb-dap, you can Cmd+Click on the function
+address (``0x555...`` in the screenshot) to directly jump to the function
+definition backing your coroutine handle.
 
-LLVM generates the debug information for the coroutine frame in the LLVM middle
-end, which permits printing of the coroutine frame in the debugger. Much like
-the `promise_type`, when stopped at a breakpoint inside a coroutine we can
-print the coroutine frame by:
+Next, we see the ``promise``. In our case, this reveals the current value of
+our generator.
 
-.. parsed-literal::
+The ``coro_frame`` member represents the internal state of the coroutine. It
+contains our internal coroutine state ``prev``, ``current``, ``next``.
+Furthermore, it contains many internal, compiler-specific members, which are
+named based on their type. These represent temporary values which the compiler
+decided to spill across suspension points, but which were not declared in our
+original source code and hence have no proper user-provided name.
 
-  print __coro_frame
+Tracking the exact suspension point
+-----------------------------------
+Among the compiler-generated members, the ``__coro_index`` is particularly
+important. This member identifies the suspension point at which the coroutine
+is currently suspended.
 
-Just as printing the `promise_type` is possible from the coroutine address,
-printing the details of the coroutine frame from an address is also possible:
+However, it is non-trivial to map this number back to a source code location.
+In simple cases, one might correctly guess the source code location. In more
+complex cases, we can modify the C++ code to store additional information in
+the promise type:
 
-::
+.. code-block:: c++
 
-  (gdb) # Get the address of coroutine frame
-  (gdb) print/x *0x418eb0
-  $1 = 0x4019e0
-  (gdb) # Get the linkage name for the coroutine
-  (gdb) x 0x4019e0
-  0x4019e0 <_ZL9coro_taski>: 0xe5894855
-  (gdb) # Turn off the demangler temporarily to avoid the debugger misunderstanding the name.
-  (gdb) set demangle-style none
-  (gdb) # The coroutine frame type is 'linkage_name.coro_frame_ty'
-  (gdb) print ('_ZL9coro_taski.coro_frame_ty')*(0x418eb0)
-  $2 = {__resume_fn = 0x4019e0 <coro_task(int)>, __destroy_fn = 0x402000 <coro_task(int)>, __promise = {...}, ...}
+  // For all promise_types we need a new `line_number variable`:
+  class promise_type {
+    ...
+    void* _coro_return_address = nullptr;
+  };
 
-The above is possible because:
+  #include <source_location>
 
-(1) The name of the debug type of the coroutine frame is the `linkage_name`,
-plus the `.coro_frame_ty` suffix because each coroutine function shares the
-same coroutine type.
+  // For all the awaiter types we need:
+  class awaiter {
+    ...
+    template <typename Promise>
+    __attribute__((noinline)) auto await_suspend(std::coroutine_handle<Promise> handle) {
+      ...
+      handle.promise()._coro_return_address = __builtin_return_address(0);
+    }
+  };
 
-(2) The coroutine function name is accessible from the address of the coroutine
-frame.
+This stores the return address of ``await_suspend`` within the promise.
+Thereby, we can read it back from the promise of a suspended coroutine, and map
+it to an exact source code location. For a complete example, see the ``task``
+type used below for asynchronous programming.
 
-The above commands can be simplified by placing them in debug scripts.
+Alternatively, we can modify the C++ code to store the line number in the
+promise type. We can use a ``std::source_location`` to get the line number of
+the await and store it inside the ``promise_type``. Since we can get the
+promise of a suspended coroutine, we thereby get access to the line_number.
 
-Examples to print coroutine frames
----------------------------------- 
+.. code-block:: c++
+
+  // For all the awaiter types we need:
+  class awaiter {
+    ...
+    template <typename Promise>
+    void await_suspend(std::coroutine_handle<Promise> handle,
+                       std::source_location sl = std::source_location::current()) {
+      ...
+      handle.promise().line_number = sl.line();
+    }
+  };
+
+The downside of both approaches is that they come at the price of additional
+runtime cost. In particular the second approach increases binary size, since it
+requires additional ``std::source_location`` objects, and those source
+locations are not stripped by split-dwarf. Whether the first approach is worth
+the additional runtime cost is a trade-off you need to make yourself.
+
+Async stack traces
+==================
 
-The print examples below use the following definition:
+Besides generators, the second common use case for coroutines in C++ is
+asynchronous programming, usually involving libraries such as stdexec, folly,
+cppcoro, boost::asio or similar libraries. Some of those libraries already
+provide custom debugging support, so in addition to this guide, you might want
+to check out their documentation.
+
+When using coroutines for asynchronous programming, your library usually
+provides you some ``task`` type. This type usually looks similar to this:
 
 .. code-block:: c++
 
+  // async-task-library.hpp
   #include <coroutine>
-  #include <iostream>
+  #include <utility>
 
-  struct task{
+  struct task {
     struct promise_type {
       task get_return_object() { return std::coroutine_handle<promise_type>::from_promise(*this); }
-      std::suspend_always initial_suspend() { return {}; }
-      std::suspend_always final_suspend() noexcept { return {}; }
-      void return_void() noexcept {}
+      auto initial_suspend() { return std::suspend_always{}; }
+
       void unhandled_exception() noexcept {}
 
-      int count = 0;
-    };
+      auto final_suspend() noexcept {
+        struct FinalSuspend {
+          std::coroutine_handle<> continuation;
+          auto await_ready() noexcept { return false; }
+          auto await_suspend(std::coroutine_handle<> handle) noexcept {
+            return continuation;
+          }
+          void await_resume() noexcept {}
+        };
+        return FinalSuspend{continuation};
+      }
 
-    void resume() noexcept {
-      handle.resume();
-    }
+      void return_value(int res) { result = res; }
 
-    task(std::coroutine_handle<promise_type> hdl) : handle(hdl) {}
+      std::coroutine_handle<> continuation = std::noop_coroutine();
+      int result = 0;
+      #ifndef NDEBUG
+      void* _coro_suspension_point_addr = nullptr;
+      #endif
+    };
+
+    task(std::coroutine_handle<promise_type> handle) : handle(handle) {}
     ~task() {
       if (handle)
         handle.destroy();
     }
-    std::coroutine_handle<> handle;
-  };
-
-  class await_counter : public std::suspend_always {
-    public:
-      template<class PromiseType>
-      void await_suspend(std::coroutine_handle<PromiseType> handle) noexcept {
-          handle.promise().count++;
+    struct Awaiter {
+      std::coroutine_handle<promise_type> handle;
+      auto await_ready() { return false; }
+
+      template <typename P>
+      #ifndef NDEBUG
+      __attribute__((noinline))
+      #endif
+      auto await_suspend(std::coroutine_handle<P> continuation) {
+        handle.promise().continuation = continuation;
+        #ifndef NDEBUG
+        continuation.promise()._coro_suspension_point_addr = __builtin_return_address(0);
+        #endif
+        return handle;
       }
+      int await_resume() {
+        return handle.promise().result;
+      }
+    };
+
+    auto operator co_await() {
+      return Awaiter{handle};
+    }
+
+    int syncStart() {
+      handle.resume();
+      return handle.promise().result;
+    }
+
+  private:
+    std::coroutine_handle<promise_type> handle;
   };
 
-  static task coro_task(int v) {
-    int a = v;
-    co_await await_counter{};
-    a++;
-    std::cout << a << "\n";
-    a++;
-    std::cout << a << "\n";
-    a++;
-    std::cout << a << "\n";
-    co_await await_counter{};
-    a++;
-    std::cout << a << "\n";
-    a++;
-    std::cout << a << "\n";
+Note how the ``task::promise_type`` has a member variable
+``std::coroutine_handle<> continuation``. This is the handle of the coroutine
+that will be resumed when the current coroutine is finished executing (see
+``final_suspend``). In a sense, this is the "return address" of the coroutine.
+It is as soon as the caller coroutine ``co_await`` on the called coroutine in
+``operator co_await``.
+
+The result value is returned via the ``int result`` member. It is written in
+``return_value`` and read by ``Awaiter::await_resume``. Usually, the result
+type of a task is a template argument. For simplicity's sake, we hard-coded the
+``int`` type in this example.
+
+Stack traces of in-fli...
[truncated]

``````````

</details>

https://github.com/llvm/llvm-project/pull/142651

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits