On Wed, 2023-02-01 at 01:46 +0100, Ilya Leoshkevich wrote:
> Currently dying to one of the core_dump_signal()s deadlocks, because
> dump_core_and_abort() calls start_exclusive() two times: first via
> stop_all_tasks(), and then via preexit_cleanup() ->
> qemu_plugin_user_exit().
>
> There are a number of ways to solve this: resume after dumping core;
> check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
> {start,end}_exclusive() recursive. Pick the last option, since it's
> the most straightforward one.
>
> Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a
> signal")
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>

Hi,

I noticed that fork()ed CPUs start with in_exclusive_context set (in
this patch it is renamed to exclusive_context_count, but the point
stands). That was not important before, since only pending_cpus decided
what happens in start_exclusive()/end_exclusive(). Now that
exclusive_context_count is also important, we need something like:

--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -161,13 +161,15 @@ void fork_end(int child)
         }
         qemu_init_cpu_list();
         gdbserver_fork(thread_cpu);
-        /* qemu_init_cpu_list() takes care of reinitializing the
-         * exclusive state, so we don't need to end_exclusive() here.
-         */
     } else {
         cpu_list_unlock();
-        end_exclusive();
     }
+    /*
+     * qemu_init_cpu_list() reinitialized the child exclusive state, but we
+     * also need to keep current_cpu consistent, so call end_exclusive() for
+     * both child and parent.
+     */
+    end_exclusive();
 }
 
 __thread CPUState *thread_cpu;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1f8c10f8ef9..70fad4bed01 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6776,6 +6776,7 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
             cpu_clone_regs_parent(env, flags);
             fork_end(0);
         }
+        g_assert(!cpu_in_exclusive_context(cpu));
     }
     return ret;
 }

I can include this in v2, if the overall recursive lock approach is
considered appropriate.

Best regards,
Ilya