Hi, Paolo!

On Tue, Apr 04, 2023 at 03:32:38PM +0200, Paolo Bonzini wrote:
> On 2/16/23 17:18, huang...@chinatelecom.cn wrote:
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index 9b26582655..47483cdfa0 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -685,6 +685,15 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, CPUState *cpu)
> >      uint32_t ring_size = s->kvm_dirty_ring_size;
> >      uint32_t count = 0, fetch = cpu->kvm_fetch_index;
> > +    /*
> > +     * It's possible that we race with vcpu creation code where the vcpu is
> > +     * put onto the vcpus list but not yet initialized the dirty ring
> > +     * structures. If so, skip it.
> > +     */
> > +    if (!cpu->created) {
> > +        return 0;
> > +    }
> > +
>
> Is there a lock that protects cpu->created?
>
> If you don't want to use a lock you need to use qatomic_load_acquire
> together with
>
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index fed20ffb5dd2..15b64e7f4592 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -525,7 +525,7 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
>  /* signal CPU creation */
>  void cpu_thread_signal_created(CPUState *cpu)
>  {
> -    cpu->created = true;
> +    qatomic_store_release(&cpu->created, true);
>      qemu_cond_signal(&qemu_cpu_cond);
>  }
Makes sense.

When looking at such a possible race, I also found that we may have
another relevant issue when destroying the vcpu, where we flip
"vcpu->created" only after the vcpu has been destroyed.  IIUC it means
the same issue can occur when a vcpu is unplugged?

Meanwhile I think the memory ordering trick won't work there, because
first we'd need to flip created to false before destroying the vcpu:

-    kvm_destroy_vcpu(cpu);
     cpu_thread_signal_destroyed(cpu);
+    kvm_destroy_vcpu(cpu);

And even if we order the operations, we still cannot assume the data is
safe to access even when created==true.

Maybe we'd (unfortunately) need a per-vcpu mutex to protect both cases?

Thanks,

--
Peter Xu
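
For reference, a minimal sketch of the load side that would pair with the
qatomic_store_release() suggested above, written against the hunk quoted at
the top of this mail; this is only an illustration of the idea, not a patch
from the thread:

-    if (!cpu->created) {
+    /* Pairs with qatomic_store_release() in cpu_thread_signal_created() */
+    if (!qatomic_load_acquire(&cpu->created)) {
         return 0;
     }

With the acquire/release pair, observing created == true also guarantees
that the dirty ring fields initialized before the store are visible to the
reaping thread.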
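
And a rough sketch of the per-vcpu mutex idea, assuming a new QemuMutex
field (called created_mutex here) added to CPUState and initialized during
vcpu setup; the field name and call sites are illustrative assumptions, not
an actual proposal from this thread:

    /* vcpu teardown path (e.g. end of kvm_vcpu_thread_fn), sketch only */
    qemu_mutex_lock(&cpu->created_mutex);
    cpu_thread_signal_destroyed(cpu);  /* created = false under the lock... */
    kvm_destroy_vcpu(cpu);             /* ...before the dirty ring goes away */
    qemu_mutex_unlock(&cpu->created_mutex);

    /* reaper side (kvm_dirty_ring_reap_one), sketch only */
    qemu_mutex_lock(&cpu->created_mutex);
    if (!cpu->created) {
        /* vcpu not fully created yet, or already torn down: nothing to reap */
        qemu_mutex_unlock(&cpu->created_mutex);
        return 0;
    }
    /* ... reap the ring here, with creation/destruction excluded ... */
    qemu_mutex_unlock(&cpu->created_mutex);

Holding the same mutex on both sides would cover creation as well: the
reaper either sees a fully initialized ring with created == true, or skips
the vcpu entirely.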