Re: Problem with iotest 233

Eric Blake Thu, 27 Feb 2025 12:30:23 -0800

On Wed, Feb 26, 2025 at 09:55:18AM +0100, Thomas Huth wrote:
> > > Though, that does not look like the thread from the simpletrace, but
> > > the QEMU RCU thread instead ... so no clue where that writer
> > > thread might have gone...
> > 
> > OK, I think I now understood the problem: qemu-nbd is calling
> > trace_init_backends() first, which creates the simpletrace threads and
> > installs the atexit() handler. Then it is calling fork() since the test
> > uses the --fork command line option. But fork() does not clone the
> > simpletrace thread into the new process, only the main thread (see
> > man-page of fork, the new process starts single-threaded). So when the
> > new child process exits, the exit handler calls the simple trace flush
> > function which tries to wait for a thread that has never been created in
> > that process.


That definitely explains the symptoms.
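
For anyone following along, here is a minimal standalone sketch of the
fork() behavior described above. It is not QEMU code: writer_thread(),
heartbeat and writer_missing_in_child() are hypothetical names made up
for illustration. A helper thread bumps a counter; after fork() the
counter keeps advancing in the parent but stays frozen in the child,
because the child starts with only the thread that called fork().

```c
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a helper thread such as a trace writer. */
static atomic_ulong heartbeat;
static atomic_int stop_writer;

static void *writer_thread(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_writer)) {
        atomic_fetch_add(&heartbeat, 1);
        poll(NULL, 0, 1);               /* sleep for about 1 ms */
    }
    return NULL;
}

/* Returns 1 if the counter moves during a 100 ms window. */
static int heartbeat_advances(void)
{
    unsigned long before = atomic_load(&heartbeat);

    poll(NULL, 0, 100);
    return atomic_load(&heartbeat) != before;
}

/*
 * Returns 1 if the helper thread is running in the parent but absent
 * in the forked child, 0 if not, and -1 on error.
 */
int writer_missing_in_child(void)
{
    pthread_t tid;
    int status = 0;
    int alive_in_parent;
    pid_t pid;

    atomic_store(&stop_writer, 0);
    if (pthread_create(&tid, NULL, writer_thread, NULL) != 0) {
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* Child: only the forking thread exists here. */
        _exit(heartbeat_advances() ? 1 : 0);
    }

    alive_in_parent = heartbeat_advances();
    if (pid > 0) {
        waitpid(pid, &status, 0);
    }

    atomic_store(&stop_writer, 1);
    pthread_join(tid, NULL);

    if (pid < 0) {
        return -1;
    }
    return alive_in_parent && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Build with -pthread. Any exit handler in the child that waits for the
helper thread would wait forever, since nothing is left to respond.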

> > 
> > The test works when I move the trace_init_backends() behind the fork()
> > in the main function... but I am not sure if we would miss some logs
> > this way, so I don't know whether that's the right solution. Could a
> > qemu-nbd expert please have a look at this?

I'm also thinking about ways to avoid it.

> 
> After pondering it for a while, maybe the best solution is to handle
> it within the simpletrace backend itself, by using pthread_atfork():
> 
>  https://lore.kernel.org/qemu-devel/20250226085015.1143991-1-th...@redhat.com/

pthread_atfork() is an odd function - POSIX itself says it is
unreliable, because there is NO sane way to know every action, in
every library you call into, that might need protection before
fork.  That doesn't mean we can't try it, just that we can't expect
it to solve every fork-related problem we might encounter.
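
To make the idea concrete, here is a rough sketch of the usual
pthread_atfork() pattern. It is NOT the linked patch and not the
simpletrace code; start_writer(), flush_and_stop(), demo_fork_flush()
and the state variables are hypothetical names invented for this
example. The prepare handler takes the lock so the child inherits it in
a known state, and the child handler records that no writer thread
exists, so a later flush does not wait for one.

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical backend state, loosely modeled on a writer thread. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static pthread_t writer_tid;
static bool writer_running;
static bool stop_requested;
static bool atfork_registered;

static void *writer_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!stop_requested) {
        pthread_cond_wait(&cond, &lock);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hold the lock across fork() so the child sees consistent state. */
static void before_fork(void) { pthread_mutex_lock(&lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&lock); }

static void after_fork_child(void)
{
    writer_running = false;     /* the thread was not cloned */
    pthread_mutex_unlock(&lock);
}

int start_writer(void)
{
    if (!atfork_registered) {
        /* Handlers cannot be unregistered, so register only once. */
        if (pthread_atfork(before_fork, after_fork_parent,
                           after_fork_child) != 0) {
            return -1;
        }
        atfork_registered = true;
    }
    pthread_mutex_lock(&lock);
    stop_requested = false;
    if (pthread_create(&writer_tid, NULL, writer_main, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return -1;
    }
    writer_running = true;
    pthread_mutex_unlock(&lock);
    return 0;
}

/* Returns 1 if a writer thread was joined, 0 if there was none. */
int flush_and_stop(void)
{
    bool running;

    pthread_mutex_lock(&lock);
    running = writer_running;
    writer_running = false;
    stop_requested = true;
    if (running) {
        pthread_cond_signal(&cond);
    }
    pthread_mutex_unlock(&lock);

    if (running) {
        pthread_join(writer_tid, NULL);
    }
    return running;
}

/* Returns 1 if the child skipped the join and the parent joined. */
int demo_fork_flush(void)
{
    int status = 0;
    pid_t pid;

    if (start_writer() != 0) {
        return -1;
    }
    pid = fork();
    if (pid == 0) {
        _exit(flush_and_stop());    /* 0 expected: nothing to join */
    }
    if (pid > 0) {
        waitpid(pid, &status, 0);
    }
    return flush_and_stop() == 1 && pid > 0 &&
           WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Build with -pthread. This only covers state the backend itself owns;
locks held by other threads in other libraries at fork() time are
exactly the part pthread_atfork() cannot fix for us.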

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
