On Tue, Aug 12, 2014 at 09:10:26AM +0800, Bin Wu wrote:
> On 2014/8/11 22:21, Stefan Hajnoczi wrote:
> >On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote:
> >>Hi,
> >>
> >>I tested the reliability of qemu in the IPSAN environment as follows:
> >>(1) create one VM on an x86 server which is connected to an IPSAN; the VM
> >>has only one system volume, which is on the IPSAN;
> >>(2) disconnect the network between the server and the IPSAN. On the server,
> >>I have "multipath" software which can hold the IO for a long time
> >>(configurable) when the network is disconnected;
> >>(3) about 30 seconds later, the whole VM hangs; nothing can be done to
> >>the VM!
> >>
> >>Then I used the "gstack" tool to collect the stacks of all qemu threads:
> >>
> >>Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
> >>#0  0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
> >>#1  0x00007fd84410ceff in aio_poll ()
> >>#2  0x00007fd84429bb05 in qemu_aio_wait ()
> >>#3  0x00007fd844120f51 in bdrv_drain_all ()
> >>#4  0x00007fd8441f1a4a in bmdma_cmd_writeb ()
> >>#5  0x00007fd8441f216e in bmdma_write ()
> >>#6  0x00007fd8443a93cf in memory_region_write_accessor ()
> >>#7  0x00007fd8443a94a6 in access_with_adjusted_size ()
> >>#8  0x00007fd8443a9901 in memory_region_iorange_write ()
> >>#9  0x00007fd8443a19bd in ioport_writeb_thunk ()
> >>#10 0x00007fd8443a13a8 in ioport_write ()
> >>#11 0x00007fd8443a1f55 in cpu_outb ()
> >>#12 0x00007fd8443a5b12 in kvm_handle_io ()
> >>#13 0x00007fd8443a64a9 in kvm_cpu_exec ()
> >>#14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
> >>#15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
> >>#16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
> >>#17 0x0000000000000000 in ?? ()
> >
> >Use virtio-blk. Read, write, and flush are asynchronous in virtio-blk.
> >
> >Note that the QEMU monitor commands are typically synchronous so they
> >will still block the VM.
> >
> >Stefan
>
> Thank you for your attention. I tested virtio-blk and it's true that the VM
> doesn't hang.
>
> Why does virtio-blk implement this in an asynchronous way, but virtio-scsi
> in a synchronous way?
There is no fundamental reason why virtio-scsi should be synchronous; it's
just that QEMU internally has some code paths (such as the bdrv_drain_all()
that Fam mentioned) that wait synchronously. Since the SCSI request
cancellation path hits bdrv_drain_all(), the guest hung. virtio-blk has no
"cancel" operation and therefore doesn't hang.

Stefan
