During we waited on qemu_pause_cond the bql was unlocked, the vcpu's state may has been changed by other thread, so we must request the pause state on all vcpus again.
For example: Both main loop thread and vCPU thread are allowed to call pause_all_vcpus(), and in general resume_all_vcpus() is called after it. Thus there is possibility that during thread T1 waits on qemu_pause_cond, other thread has called pause_all_vcpus() and resume_all_vcpus(), then thread T1 will stuck, because the condition all_vcpus_paused() is always false. PS: I hit this bug when I test the RFC patch of ARM vCPU hotplug feature. This patch is not verified!!! just for RFC. Signed-off-by: Keqian Zhu <zhukeqi...@huawei.com> --- system/cpus.c | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/system/cpus.c b/system/cpus.c index 68d161d96b..24ac8085cd 100644 --- a/system/cpus.c +++ b/system/cpus.c @@ -571,12 +571,14 @@ static bool all_vcpus_paused(void) return true; } -void pause_all_vcpus(void) +static void request_pause_all_vcpus(void) { CPUState *cpu; - qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false); CPU_FOREACH(cpu) { + if (cpu->stopped) { + continue; + } if (qemu_cpu_is_self(cpu)) { qemu_cpu_stop(cpu, true); } else { @@ -584,6 +586,14 @@ void pause_all_vcpus(void) qemu_cpu_kick(cpu); } } +} + +void pause_all_vcpus(void) +{ + CPUState *cpu; + + qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false); + request_pause_all_vcpus(); /* We need to drop the replay_lock so any vCPU threads woken up * can finish their replay tasks @@ -592,9 +602,11 @@ void pause_all_vcpus(void) while (!all_vcpus_paused()) { qemu_cond_wait(&qemu_pause_cond, &bql); - CPU_FOREACH(cpu) { - qemu_cpu_kick(cpu); - } + /* During we waited on qemu_pause_cond the bql was unlocked, + * the vcpu's state may has been changed by other thread, so + * we must request the pause state on all vcpus again. + */ + request_pause_all_vcpus(); } bql_unlock(); -- 2.36.1