Ludovic Courtès <l...@gnu.org> skribis:

> Another possibility is lockup: one of the relevant fibers is either gone
> or stuck in ‘put-message’ or ‘get-message’.
>
> I did two things:
>
>   b9a37f3 shepherd: Make signal handling fiber an essential task.
>   8ae2780 service: Do not attempt to restart transient services.
>
> Commit 8ae2780 fixes a bug whereby ‘herd restart’ could end up
> attempting to restart a transient service, which would lock up the
> calling fiber because the service’s controlling fiber would first
> receive the 'terminate message, so it would return and nobody would be
> reading further messages sent on its channel.
>
> Commit b9a37f3 will allow us to ensure that the signal-handling fiber
> never exits (and we’ll get a trace in the log if it tries to).

Apparently these commits, which made it into 0.10.1 months ago, fixed
this particular bug.

Closing!

Ludo’.