On Fri, Mar 18, 2022 at 09:48:37AM +0100, Paolo Bonzini wrote:
> Hi all,
>
> based on the previous discussions here is a comparison of the various
> possibilities for implementing coroutine backends in QEMU and the
> respective advantages and disadvantages.
>
> I'm adding a third possibility for stackless coroutines, which is to
> use the LLVM/clang builtins.  I believe that would still require a
> source-to-source translator, but it would offload to the compiler the
> complicated bits such as liveness analysis.
>
> 1) Stackful coroutines:
> Advantages:
> - no changes to current code
>
> Disadvantages:
> - portability issues regarding shadow stacks (SafeStack, CET)
> - portability/nonconformance issues regarding TLS
>
> Another possible advantage is that it allows using the same function for
> both coroutine and non-coroutine context.  I'm listing this separately
> because I'm not sure that's desirable, as it prevents compile-time
> checking of calls to coroutine_fn.  Compile-time checking would be
> possible using clang -fthread-safety if we forgo the ability to use the
> same function in both scenarios.
>
>
> 2) "Duff's device" stackless coroutines
> Advantages:
> - Supports gcc and clang
> - no portability issues regarding both shadow stacks and TLS
> - compiles to good old C code
> - compile-time checking of "coroutine-only" but not awaitable functions
> - debuggability: stack frames should be easy to inspect

The user needs to understand how the coroutine runtime works in order to
get a backtrace of a suspended coroutine. More likely a GDB Python script
will be needed for this.

> Disadvantages:
> - complex source-to-source translator
> - more complex build process
>
>
> 3) C++20 stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
> - simpler build process
>
> Disadvantages:
> - requires a new compiler
> - it's C++ - raises questions about C++ usage in QEMU, which seem to be controversial
> - no compile-time checking of "coroutine-only" but not awaitable functions
>
>
> 4) LLVM stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
>
> Disadvantages:
> - relatively simple source-to-source translator
> - more complex build process
> - requires a new compiler and doesn't support GCC
>
>
> Note that (2) would still have a build dependency on libclang.
> However the code generation could still be done with GCC and with
> any compiler version.
>
> I'll also put it in a table, though I understand that some choices
> here might be debatable:
>
>                           stackful   Duff's device   C++20   LLVM
> =================================================================
> Code to write/maintain    ++ [1]     ---             +++     - [2]
> Changes to existing code  ++ [3]     -               --      -
> Community acceptance      ++         ++              --      ?
> Code or PoC exists        ++         +               -       --
> =================================================================
> Portability               --         ++              +       -
> Debuggability             -          ++              ?       ?
> Performance               -          ++ [4]          ++      ++
>
> [1] I'm penalizing stackful coroutines here because the worse portability
> has an impact on future maintainability too.
>
> [2] This is an educated guess.
>
> [3] If we decide to remove the possibility of using the same function for
> both coroutine and non-coroutine context, the changes to existing code
> would be the same as for Duff's device and LLVM coroutines.
>
> [4] Slightly worse than C++20 coroutines for the PoC, but that is mostly due
> to implementation choices that are easy to change.
>
>
> Stackful coroutines are obviously pretty good, or we wouldn't have used them.
> They might be a local optimum though, as shown by the negative points in terms
> of portability, debuggability and performance.
>
> Both Duff's device and LLVM would be more or less transparent to the part of
> the community that doesn't care about the coroutines.  The translator would
> probably be write-and-forget (though I'm not sure about the API stability of
> libclang, which would be a major factor), but it would still be a substantial
> amount of work to commit to.

I don't see a clear winner but here is my order of preference:

1. Stackful - the devil we know
2. Duff's device - a temporary (wasteful) step before native compiler support?
3. LLVM - actually not bad but requires dropping gcc support
4. C++20 - I worry adding C++ into the codebase will cause friction

Ideally gcc and clang would support C coroutines natively, making the
choice simple. Is it worth treating this as a long term project and
working with LLVM/clang and gcc to add native C coroutine support to
compilers? We still have stackful coroutines in the short term.

Stefan
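
P.S. For anyone unfamiliar with the term in option (2), here is a minimal
hand-written sketch of a "Duff's device" stackless coroutine in plain C.
It is only an illustration of the general technique, not the output of
the proposed translator and not QEMU code; the names CounterFrame and
counter_resume are made up for this example. Locals that must survive a
yield are moved into a heap- or caller-owned frame, and a switch on the
saved resume point jumps back into the middle of the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical coroutine frame: everything that must outlive a yield
 * lives here instead of on the C stack. */
typedef struct {
    int resume_point;   /* 0 = not started, 1 = suspended at the yield */
    int i;              /* loop counter, preserved across yields */
    int limit;          /* number of values to produce */
} CounterFrame;

/* Resume the coroutine. Returns true and stores the next value in *out
 * when it yields; returns false once it has finished. */
bool counter_resume(CounterFrame *f, int *out)
{
    assert(f != NULL && out != NULL);

    switch (f->resume_point) {
    case 0:                           /* first entry: start the loop */
        for (f->i = 0; f->i < f->limit; f->i++) {
            *out = f->i;
            f->resume_point = 1;
            return true;              /* "yield": the C stack unwinds */
    case 1:;                          /* re-entry lands inside the loop */
        }
    }

    f->resume_point = -1;             /* no case matches: stays finished */
    return false;
}
```

A caller initializes the frame with resume_point = 0 and the desired
limit, then calls counter_resume() until it returns false. Between calls
there is no C stack frame for the coroutine at all; its whole state is
resume_point and i in the frame struct. That seems consistent with the
point above that a backtrace of a suspended coroutine needs knowledge of
the runtime, since a debugger would have to walk such frames rather than
the stack.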