On Tue, Jan 25, 2011 at 7:18 PM, Anthony Liguori <aligu...@linux.vnet.ibm.com> wrote:
> On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote:
>>
>> On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi<stefa...@gmail.com> wrote:
>>
>>> On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf<kw...@redhat.com> wrote:
>>>
>>>> Am 24.01.2011 20:47, schrieb Michael S. Tsirkin:
>>>>
>>>>> On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote:
>>>>>
>>>>>> Am 24.01.2011 20:36, schrieb Michael S. Tsirkin:
>>>>>>
>>>>>>> On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote:
>>>>>>>
>>>>>>>> Am 12.12.2010 16:02, schrieb Stefan Hajnoczi:
>>>>>>>>
>>>>>>>>> Virtqueue notify is currently handled synchronously in userspace virtio. This prevents the vcpu from executing guest code while hardware emulation code handles the notify.
>>>>>>>>>
>>>>>>>>> On systems that support KVM, the ioeventfd mechanism can be used to make virtqueue notify a lightweight exit by deferring hardware emulation to the iothread and allowing the VM to continue execution. This model is similar to how vhost receives virtqueue notifies.
>>>>>>>>>
>>>>>>>>> The result of this change is improved performance for userspace virtio devices. Virtio-blk throughput increases especially for multithreaded scenarios and virtio-net transmit throughput increases substantially.
>>>>>>>>>
>>>>>>>>> Some virtio devices are known to have guest drivers which expect a notify to be processed synchronously and spin waiting for completion. Only enable ioeventfd for virtio-blk and virtio-net for now.
>>>>>>>>>
>>>>>>>>> Care must be taken not to interfere with vhost-net, which uses host notifiers. If the set_host_notifier() API is used by a device virtio-pci will disable virtio-ioeventfd and let the device deal with host notifiers as it wishes.
>>>>>>>>>
>>>>>>>>> After migration and on VM change state (running/paused) virtio-ioeventfd will enable/disable itself.
>>>>>>>>>
>>>>>>>>> * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd
>>>>>>>>> * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd
>>>>>>>>> * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd
>>>>>>>>> * vm_change_state(running=0) -> disable virtio-ioeventfd
>>>>>>>>> * vm_change_state(running=1) -> enable virtio-ioeventfd
>>>>>>>>>
>>>>>>>>> Signed-off-by: Stefan Hajnoczi<stefa...@linux.vnet.ibm.com>
>>>>>>>>
>>>>>>>> On current git master I'm getting hangs when running iozone on a virtio-blk disk. "Hang" means that it's not responsive any more and has 100% CPU consumption.
>>>>>>>>
>>>>>>>> I bisected the problem to this patch. Any ideas?
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>
>>>>>>> Does it help if you set ioeventfd=off on command line?
>>>>>>
>>>>>> Yes, with ioeventfd=off it seems to work fine.
>>>>>>
>>>>>> Kevin
>>>>>
>>>>> Then it's the ioeventfd that is to blame.
>>>>> Is it the io thread that consumes 100% CPU?
>>>>> Or the vcpu thread?
>>>>
>>>> I was building with the default options, i.e. there is no IO thread.
>>>>
>>>> Now I'm just running the test with IO threads enabled, and so far everything looks good. So I can only reproduce the problem with IO threads disabled.
>>>>
>>>
>>> Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions (relevant when --enable-io-thread is not used). I will take a look at that again and see why we're spinning without checking for ioeventfd completion.
>>
>> Here's my understanding of --disable-io-thread. Added Anthony on CC, please correct me.
>>
>> When I/O thread is disabled our only thread runs guest code until an exit request is made. There are synchronous exit cases like a halt instruction or single step. There are also asynchronous exit cases when signal handlers use qemu_notify_event(), which does cpu_exit(), to set env->exit_request = 1 and unlink the current tb.
>
> Correct.
>
> Note that this is a problem today. If you have a tight loop in TCG and you have nothing that would generate a signal (no pending disk I/O and no periodic timer) then the main loop is starved.
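To make that concrete, here is the pattern in isolation. This is not the actual QEMU code and the names are made up: a signal handler records an exit request and the execution loop only polls it between chunks of guest code, so if nothing ever raises a signal the main loop is never reached.

#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t exit_request;

static void kick_handler(int signum)
{
    (void)signum;
    exit_request = 1;               /* async-signal-safe: just record the request */
}

static void run_guest_chunk(void)
{
    /* stand-in for executing a translated block / KVM_RUN */
}

static void process_main_loop(void)
{
    /* stand-in for aio completions, timers, bottom halves, ... */
}

int main(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR2, &sa, NULL);

    for (;;) {
        while (!exit_request) {
            run_guest_chunk();      /* can run for a very long time */
        }
        exit_request = 0;
        process_main_loop();        /* only reached after some signal fired */
    }
}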
Even with KVM we can spin inside the guest and get a softlockup due to the dynticks race condition shown above. In a CPU-bound guest that is doing no I/O it's possible to go AWOL for extended periods of time.

I can think of two solutions:

1. Block SIGALRM during critical regions. I'm not sure whether the necessary atomic signal mask capabilities are there in KVM; I haven't looked at TCG yet either.

2. Make a portion of the timer code signal-safe and rearm the timer from within the SIGALRM handler (rough sketch below).

Stefan
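P.S. Here is the shape of option 2 as a standalone sketch, not the actual QEMU dynticks code: a one-shot POSIX timer is rearmed from inside the SIGALRM handler itself, using only timer_settime(), which is on the POSIX async-signal-safe list. The 1ms deadline is a placeholder; real code would compute the next expiry from its timer queue.

#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static timer_t dyntick_timer;               /* created once before arming */
static volatile sig_atomic_t timer_expired;

static void alarm_handler(int signum)
{
    struct itimerspec its;

    (void)signum;
    timer_expired = 1;                      /* tell the main loop work is pending */

    /* Rearm immediately so the guest cannot run unbounded even if the
     * main loop is slow to notice the expiry. */
    its.it_interval.tv_sec = 0;
    its.it_interval.tv_nsec = 0;
    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = 1000000;         /* 1ms, one-shot */
    timer_settime(dyntick_timer, 0, &its, NULL);
}

int main(void)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    timer_create(CLOCK_MONOTONIC, &sev, &dyntick_timer);

    memset(&its, 0, sizeof(its));
    its.it_value.tv_nsec = 1000000;         /* arm the first expiry */
    timer_settime(dyntick_timer, 0, &its, NULL);

    for (;;) {
        pause();                            /* stand-in for running guest code */
        if (timer_expired) {
            timer_expired = 0;
            /* run expired timers / main loop work here */
        }
    }
}

(Compile with -lrt on older glibc.) For option 1 the equivalent would be wrapping the region that checks for pending timers and re-enters guest execution in sigprocmask(SIG_BLOCK, ...) / sigprocmask(SIG_UNBLOCK, ...), so a SIGALRM that fires inside that window stays pending instead of being lost.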