On Mon, Sep 19, 2016 at 14:50:59 +0200, Paolo Bonzini wrote:
(snip)
> @@ -212,24 +216,75 @@ void end_exclusive(void)
>  /* Wait for exclusive ops to finish, and begin cpu execution. */
>  void cpu_exec_start(CPUState *cpu)
>  {
> -    qemu_mutex_lock(&qemu_cpu_list_mutex);
> -    exclusive_idle();
>      cpu->running = true;
> -    qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +
> +    /* Write cpu->running before reading pending_cpus. */
> +    smp_mb();
> +
> +    /* 1. start_exclusive saw cpu->running == true and pending_cpus >= 1.
> +     * After taking the lock we'll see cpu->has_waiter == true and run---not
> +     * for long because start_exclusive kicked us. cpu_exec_end will
> +     * decrement pending_cpus and signal the waiter.
> +     *
> +     * 2. start_exclusive saw cpu->running == false but pending_cpus >= 1.
> +     * This includes the case when an exclusive item is running now.
> +     * Then we'll see cpu->has_waiter == false and wait for the item to
> +     * complete.
> +     *
> +     * 3. pending_cpus == 0. Then start_exclusive is definitely going to
> +     * see cpu->running == true, and it will kick the CPU.
> +     */
> +    if (pending_cpus) {
I'd consider doing

    if (unlikely(pending_cpus)) {

since the exclusive is a slow path and will be more so in the near future.

> +        qemu_mutex_lock(&qemu_cpu_list_mutex);
> +        if (!cpu->has_waiter) {
> +            /* Not counted in pending_cpus, let the exclusive item
> +             * run. Since we have the lock, set cpu->running to true
> +             * while holding it instead of retrying.
> +             */
> +            cpu->running = false;
> +            exclusive_idle();
> +            /* Now pending_cpus is zero. */
> +            cpu->running = true;
> +        } else {
> +            /* Counted in pending_cpus, go ahead. */
> +        }
> +        qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +    }
>  }
> 
>  /* Mark cpu as not executing, and release pending exclusive ops. */
>  void cpu_exec_end(CPUState *cpu)
>  {
> -    qemu_mutex_lock(&qemu_cpu_list_mutex);
>      cpu->running = false;
> -    if (pending_cpus > 1) {
> -        pending_cpus--;
> -        if (pending_cpus == 1) {
> -            qemu_cond_signal(&exclusive_cond);
> +
> +    /* Write cpu->running before reading pending_cpus. */
> +    smp_mb();
> +
> +    /* 1. start_exclusive saw cpu->running == true. Then it will increment
> +     * pending_cpus and wait for exclusive_cond. After taking the lock
> +     * we'll see cpu->has_waiter == true.
> +     *
> +     * 2. start_exclusive saw cpu->running == false but here pending_cpus >= 1.
> +     * This includes the case when an exclusive item started after setting
> +     * cpu->running to false and before we read pending_cpus. Then we'll see
> +     * cpu->has_waiter == false and not touch pending_cpus. The next call to
> +     * cpu_exec_start will run exclusive_idle if still necessary, thus waiting
> +     * for the item to complete.
> +     *
> +     * 3. pending_cpus == 0. Then start_exclusive is definitely going to
> +     * see cpu->running == false, and it can ignore this CPU until the
> +     * next cpu_exec_start.
> +     */
> +    if (pending_cpus) {

ditto

> +        qemu_mutex_lock(&qemu_cpu_list_mutex);
> +        if (cpu->has_waiter) {
> +            cpu->has_waiter = false;
> +            if (--pending_cpus == 1) {
> +                qemu_cond_signal(&exclusive_cond);
> +            }
(snip)

Another suggestion is to consistently access pending_cpus atomically, since
we now access it both with and without the CPU list mutex held.

Thanks,

		Emilio
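
P.S. To make the first suggestion concrete: unlikely() is QEMU's branch
prediction hint. If memory serves, it is defined in include/qemu/osdep.h as

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x)   __builtin_expect(!!(x), 0)

so the compiler moves the pending_cpus != 0 path out of line and the common
lock-free case runs as straight-line code.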
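
As for the atomic accesses, I mean something along these lines. This is only
a sketch against this patch, not a tested change; atomic_read() and
atomic_set() are the helpers from include/qemu/atomic.h, and start_exclusive
would be converted the same way:

    void cpu_exec_start(CPUState *cpu)
    {
        cpu->running = true;

        /* Write cpu->running before reading pending_cpus. */
        smp_mb();

        /* This read races with the writers, which update pending_cpus
         * under qemu_cpu_list_mutex; atomic_read() documents the race
         * and guarantees a single, untorn load.
         */
        if (unlikely(atomic_read(&pending_cpus))) {
            qemu_mutex_lock(&qemu_cpu_list_mutex);
            if (!cpu->has_waiter) {
                cpu->running = false;
                exclusive_idle();
                /* Now pending_cpus is zero. */
                cpu->running = true;
            }
            qemu_mutex_unlock(&qemu_cpu_list_mutex);
        }
    }

and in cpu_exec_end the locked decrement would become

            atomic_set(&pending_cpus, pending_cpus - 1);
            if (pending_cpus == 1) {
                qemu_cond_signal(&exclusive_cond);
            }

The plain read of pending_cpus after the atomic_set() is still fine because
qemu_cpu_list_mutex is held at that point.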