On 10/09/2012 03:50 PM, Paolo Bonzini wrote:
> On 09/10/2012 15:21, Avi Kivity wrote:
>> On 10/09/2012 03:11 PM, Paolo Bonzini wrote:
>>>> But no, it's actually impossible.  Hotplug may be triggered from a
>>>> vcpu thread, which clearly can't be stopped.
>>>
>>> Hotplug should always be asynchronous (because that's how hardware
>>> works), so it should always be possible to delegate the actual work
>>> to a non-VCPU thread.  Or not?
>>
>> The actual device deletion can happen from a different thread, as
>> long as you isolate the device before.  That's part of the garbage
>> collector idea.
>>
>> vcpu thread:
>>   rcu_read_lock
>>   lookup
>>   dispatch
>>     mmio handler
>>       isolate
>>       queue(delete_work)
>>   rcu_read_unlock
>>
>> worker thread:
>>   process queue
>>     delete_work
>>       synchronize_rcu() / stop_machine()
>>       acquire qemu lock
>>       delete object
>>       drop qemu lock
>>
>> Compared to the garbage collector idea, this drops fine-grained
>> locking for the qdev tree, a significant advantage.  But it still
>> suffers from dispatching inside the rcu critical section, which is
>> something we want to avoid.
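
For concreteness, the scheme above could look roughly like this on top
of a userspace RCU library such as liburcu.  DeviceState and the my_*
helpers are placeholders here, not real QEMU or liburcu API:

    #include <stdint.h>
    #include <urcu.h>  /* rcu_read_lock(), rcu_read_unlock(), synchronize_rcu() */

    typedef struct DeviceState DeviceState;  /* stand-in for qdev's DeviceState */

    DeviceState *my_lookup(uint64_t addr);   /* rcu_dereference() inside */
    void my_mmio_write(DeviceState *dev, uint64_t addr, uint64_t val);
    void my_delete(DeviceState *dev);        /* the actual object teardown */
    void my_qemu_lock(void);
    void my_qemu_unlock(void);

    /* vcpu thread: the whole dispatch runs inside the read-side
     * critical section; the mmio handler may isolate the device and
     * queue delete_work, but must not free anything itself. */
    static void vcpu_dispatch(uint64_t addr, uint64_t val)
    {
        rcu_read_lock();
        DeviceState *dev = my_lookup(addr);
        if (dev) {
            my_mmio_write(dev, addr, val);  /* may isolate + queue(delete_work) */
        }
        rcu_read_unlock();
    }

    /* worker thread: once synchronize_rcu() returns, no vcpu can still
     * hold a pointer to the isolated device, so deleting it only needs
     * the ordinary qemu lock. */
    static void delete_work(DeviceState *dev)
    {
        synchronize_rcu();
        my_qemu_lock();
        my_delete(dev);
        my_qemu_unlock();
    }
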
> But we are not Linux, and I think the tradeoffs are different for RCU
> in Linux vs. QEMU.
>
> For CPUs in the kernel, running user code is just one way to get
> things done; QEMU threads are much more event driven, and their whole
> purpose is to either run the guest or sleep, until "something happens"
> (VCPU exit or readable fd).  In other words, QEMU threads should be
> able to stay most of the time in KVM_RUN or select() for any workload
> (to some approximation).

If you're streaming data (the saturated-iothread case from that other
thread), or live migrating, or have a block job with fast storage, this
isn't necessarily true.  You could make sure each thread polls the rcu
state periodically, though (see the sketch at the end).

> Not just that: we do not need to minimize RCU critical sections,
> because anyway we want to minimize the time spent in QEMU, period.
>
> So I believe that to some approximation, in QEMU we can completely
> ignore everything else, and behave as if threads were always under
> rcu_read_lock(), except if in KVM_RUN/select.  KVM_RUN and select are
> what Paul McKenney calls extended quiescent states, and in fact the
> following mapping works:
>
>   rcu_extended_quiesce_start() -> rcu_read_unlock();
>   rcu_extended_quiesce_end()   -> rcu_read_lock();
>   rcu_read_lock/unlock()       -> nop
>
> This in turn means that dispatching inside the RCU critical section is
> not really bad.

I believe you still cannot call synchronize_rcu() while in an rcu
critical section, per the rcu documentation, even when lock/unlock map
to nops.  Of course we can violate that and RCU wouldn't know a thing,
but I prefer to stick to the established pattern.
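
For what it's worth, that mapping is essentially what liburcu's QSBR
flavor (urcu-qsbr) provides: rcu_read_lock()/rcu_read_unlock() compile
to nops, and threads mark extended quiescent states explicitly.  A
minimal sketch, with my_handle_exit()/my_more_data()/my_send_some_data()
as placeholders:

    #include <sys/ioctl.h>
    #include <linux/kvm.h>
    #include <urcu-qsbr.h>  /* QSBR flavor: read lock/unlock are nops */

    void my_handle_exit(int vcpu_fd);  /* placeholder exit handler */
    int  my_more_data(void);           /* placeholders for a streaming loop */
    void my_send_some_data(void);

    /* vcpu thread: everything outside KVM_RUN counts as one long
     * read-side critical section; KVM_RUN itself is the extended
     * quiescent state. */
    static void vcpu_thread(int vcpu_fd)
    {
        rcu_register_thread();
        for (;;) {
            rcu_thread_offline();      /* rcu_extended_quiesce_start() */
            ioctl(vcpu_fd, KVM_RUN, 0);
            rcu_thread_online();       /* rcu_extended_quiesce_end() */
            my_handle_exit(vcpu_fd);   /* implicitly "under rcu_read_lock()" */
        }
    }

    /* streaming thread (migration, block job): rarely blocks in
     * select(), so it announces a quiescent state by hand instead. */
    static void streaming_thread(void)
    {
        rcu_register_thread();
        while (my_more_data()) {
            my_send_some_data();
            rcu_quiescent_state();     /* the periodic poll suggested above */
        }
        rcu_unregister_thread();
    }

Note that nothing in this scheme would catch a synchronize_rcu() issued
from the "online" region at runtime, which is precisely the silent
violation described above.

-- 
error compiling committee.c: too many arguments to function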