On Wed 15. Jan 2025 at 20:05, Paolo Bonzini <pbonz...@redhat.com> wrote:

> On 1/12/25 22:26, Phil Dennis-Jordan wrote:
> > By changing the way the main QEMU event loop is invoked, I inadvertently
> > changed the BQL status of exit notifiers: some of them implicitly
> > assumed they would be called with the BQL held; the BQL is however
> > not held during the exit(status) call in qemu_default_main().
> >
> > Instead of attempting to ensure we always call exit() from the BQL -
> > including any transitive calls - this change adds a BQL lock guard to
> > qemu_run_exit_notifiers, ensuring the BQL will always be held in the
> > exit notifiers.
> >
> > Additionally, the BQL promise is now documented at the
> > qemu_{add,remove}_exit_notifier() declarations.
> >
> > Fixes: f5ab12caba4f ("ui & main loop: Redesign of system-specific main
> > thread event handling")
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2771
> > Reported-by: David Woodhouse <dw...@infradead.org>
> > Signed-off-by: Phil Dennis-Jordan <p...@philjordan.eu>
>
> I'm worried that this breaks for exit() calls that happen within a
> BQL-taken area (for example, anything that uses error_fatal) due to...
>
> void bql_lock_impl(const char *file, int line)
> {
>      QemuMutexLockFunc bql_lock_fn = qatomic_read(&bql_mutex_lock_func);
>
>      g_assert(!bql_locked()); // <--- this
>      bql_lock_fn(&bql, file, line);
>      set_bql_locked(true);
> }
>

BQL_LOCK_GUARD expands to a call to bql_auto_lock(), which in turn defends
against recursive locking by checking bql_locked().

https://gitlab.com/qemu-project/qemu/-/blob/master/include/qemu/main-loop.h#L377

I think that should make it safe?
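
To make the argument concrete, here is a minimal standalone sketch of the pattern as I understand it. This is not the QEMU code: the names (big_lock_acquire, BigLockGuard, run_exit_notifiers_sketch, and so on) are hypothetical stand-ins, and a pthread mutex plus a thread-local flag substitute for the real BQL. The guard only takes the lock if the calling thread does not already hold it, so the asserting lock function is never reached recursively.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BQL and its per-thread "held" flag. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held;

static bool big_locked(void)
{
    return big_lock_held;
}

/* Unconditional lock: asserts on recursion, like the quoted bql_lock_impl(). */
static void big_lock_acquire(void)
{
    assert(!big_locked());
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_locked());
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Guard state: records whether this guard was the one that took the lock. */
typedef struct {
    bool took_lock;
} BigLockGuard;

/* Take the lock only if this thread does not already hold it. */
static BigLockGuard big_lock_guard_enter(void)
{
    BigLockGuard guard = { .took_lock = false };

    if (!big_locked()) {
        big_lock_acquire();
        guard.took_lock = true;
    }
    return guard;
}

/* Release only what the guard itself acquired. */
static void big_lock_guard_leave(BigLockGuard *guard)
{
    if (guard->took_lock) {
        big_lock_release();
        guard->took_lock = false;
    }
}

static int notifier_calls;

/* Hypothetical notifier runner: the lock is held here on either path. */
static void run_exit_notifiers_sketch(void)
{
    BigLockGuard guard = big_lock_guard_enter();

    assert(big_locked());
    notifier_calls++;

    big_lock_guard_leave(&guard);
}
```

Calling run_exit_notifiers_sketch() with the lock already held leaves it held afterwards; calling it without the lock takes and drops it around the notifiers. Neither path trips the assertion in big_lock_acquire().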

The only safety issue I can imagine is exit() being called in a thread
that does not hold the BQL while a BQL-holding thread is waiting for that
thread. I’m not sure such a pattern exists in QEMU, though, and it would
have triggered the assertion in the original code (that is, before my
patch that caused the regression was applied).
