On 15/07/2015 14:40, Jason J. Herne wrote:
>>> I'm not sure how callbacks can pile up here. If the vcpus are
>>> running then their threads will execute the callbacks. If they
>>> are not running then the use of QEMU_CLOCK_VIRTUAL_RT will
>>> prevent the callbacks from stacking because the timer is not
>>> running, right?
>>
>> Couldn't the iothread starve the VCPUs? They need to take the
>> iothread lock in order to process the callbacks.
>
> Yes, I can see the possibility here. I'm not sure what to do about
> it though.
>
> Maybe this is wishful thinking :) But if the iothread lock cannot be
> acquired then the cpu cannot run, thereby preventing the guest from
> changing a ton of pages. This will have the effect of indirectly
> throttling the guest, which will allow us to advance to the non-live
> phase of migration rather quickly.
Makes sense. On the other hand, this wouldn't prevent callbacks from
piling up for a short time because...

> And again, if we are starving on
> the iothread lock then the guest vcpus are not executing and
> QEMU_CLOCK_VIRTUAL_RT is not ticking, right?

... you are talking about stolen time, and QEMU_CLOCK_VIRTUAL_RT does
count stolen time (stolen time is different for each VCPU, so you would
need a different clock for each VCPU). QEMU_CLOCK_VIRTUAL and
QEMU_CLOCK_VIRTUAL_RT(*) only pause across stop/cont. (By the way, the
two are the same with KVM.)

However, something like

    if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
        async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
    }

    ...

    atomic_set(&cpu->throttle_thread_scheduled, 0);
    g_usleep(...);

should be enough. You'd still have many timers that could starve the
VCPUs but, as you pointed out, in that case migration would hopefully
finish pretty fast.

Paolo
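
A minimal sketch of how the pieces above could fit together, assuming a
periodic throttle timer on QEMU_CLOCK_VIRTUAL_RT, a new
throttle_thread_scheduled field added to CPUState, and an illustrative
10 ms period and sleep; the names throttle_timer, cpu_throttle_timer_tick
and cpu_throttle_start, the numbers, and the header paths are not from
the thread and are only placeholders:

    #include "qemu-common.h"
    #include "qemu/atomic.h"
    #include "qemu/timer.h"
    #include "qemu/main-loop.h"
    #include "qom/cpu.h"

    static QEMUTimer *throttle_timer;

    /* Queued with async_run_on_cpu(), so this runs on the VCPU thread
     * with the iothread lock held.  Drop the lock while sleeping so
     * only this VCPU is throttled, and clear the flag so the timer may
     * schedule the next round. */
    static void cpu_throttle_thread(void *opaque)
    {
        CPUState *cpu = current_cpu;

        qemu_mutex_unlock_iothread();
        atomic_set(&cpu->throttle_thread_scheduled, 0);
        g_usleep(10 * 1000);            /* illustrative 10 ms pause */
        qemu_mutex_lock_iothread();
    }

    /* Timer callback on QEMU_CLOCK_VIRTUAL_RT.  The atomic_xchg()
     * guard is the point of the snippet above: if the previous request
     * for a VCPU has not run yet, no new one is queued, so callbacks
     * cannot pile up while the VCPU is starved of the iothread lock. */
    static void cpu_throttle_timer_tick(void *opaque)
    {
        CPUState *cpu;

        CPU_FOREACH(cpu) {
            if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
                async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
            }
        }
        timer_mod(throttle_timer,
                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 10);
    }

    static void cpu_throttle_start(void)
    {
        throttle_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
                                      cpu_throttle_timer_tick, NULL);
        timer_mod(throttle_timer,
                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 10);
    }

Because the timer runs on QEMU_CLOCK_VIRTUAL_RT it stops across
stop/cont, and the per-VCPU flag bounds the outstanding throttle
requests to one even while the VCPU threads cannot take the iothread
lock.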