On Wed, Feb 26, 2025 at 09:53:53AM +0000, Daniel P. Berrangé wrote: > On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote: > > On 26/02/2025 10.15, Daniel P. Berrangé wrote: > > > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: > > > > When compiling QEMU with --enable-trace-backends=simple , the > > > > iotest 233 is currently hanging. This happens because qemu-nbd > > > > calls trace_init_backends() first - which causes simpletrace to > > > > install its writer thread and the atexit() handler - before > > > > calling fork(). But the simpletrace writer thread is then only > > > > available in the parent process, not in the child process anymore. > > > > Thus when the child process exits, its atexit handler waits forever > > > > on the trace_empty_cond condition to be set by the non-existing > > > > writer thread, so the process never finishes. > > > > > > > > Fix it by installing a pthread_atfork() handler, too, which > > > > makes sure that the trace_writeout_enabled variable gets set > > > > to false again in the child process, so we can use it in the > > > > atexit() handler to check whether we still need to wait on the > > > > writer thread or not. > > > > > > > > Signed-off-by: Thomas Huth <th...@redhat.com> > > > > --- > > > > trace/simple.c | 17 ++++++++++++++++- > > > > 1 file changed, 16 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/trace/simple.c b/trace/simple.c > > > > index c0aba00cb7f..269bbda69f1 100644 > > > > --- a/trace/simple.c > > > > +++ b/trace/simple.c > > > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) > > > > void st_flush_trace_buffer(void) > > > > { > > > > - flush_trace_file(true); > > > > + flush_trace_file(trace_writeout_enabled); > > > > +} > > > > + > > > > +#ifndef _WIN32 > > > > +static void trace_thread_atfork(void) > > > > +{ > > > > + /* > > > > + * If we fork, the writer thread does not exist in the child, so > > > > + * make sure to allow st_flush_trace_buffer() to clean up > > > > correctly. > > > > + */ > > > > + g_mutex_lock(&trace_lock); > > > > + trace_writeout_enabled = false; > > > > + g_cond_signal(&trace_empty_cond); > > > > + g_mutex_unlock(&trace_lock); > > > > } > > > > +#endif > > > > > > This doesn't seem right to me. This is being run in the child and while > > > it may avoid the hang when the child exits, surely it still leaves tracing > > > non-functional in the child as we're lacking the thread to write out the > > > trace data. > > > > Well, you cannot write to the same file from the parent and child at the > > same time, so one of both needs to be shut up AFAIU. And the simpletrace > > code cannot now which one of the two processes should be allowed to continue > > with the logging, so we either have to disable tracing in one of the two > > processes, or think of something completely different, e.g. using > > pthread_atfork(abort, NULL, NULL) to make people aware that they are not > > allowed to start tracing before calling fork()...? But in that case we still > > need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the > > trace backend before calling fork(). > > As precedent, in system/vl.c we delay trace_init() until after daemonizing > which is the simple way to avoid the worst of the danger.
That sounds good to me. Adding Daniel Henrique Barboza because he fixed a similar issue in commit 10b6ee1616f9 ("vl.c: do not execute trace_init_backends() before daemonizing"). Stefan
signature.asc
Description: PGP signature