On Tue, Aug 12, 2014 at 09:10:26AM +0800, Bin Wu wrote:
> On 2014/8/11 22:21, Stefan Hajnoczi wrote:
> >On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote:
> >>Hi,
> >>
> >>I tested the reliability of qemu in the IPSAN environment as follows:
> >>(1) create one VM on an x86 server which is connected to an IPSAN; the VM
> >>has only one system volume, which is on the IPSAN;
> >>(2) disconnect the network between the server and the IPSAN. On the server,
> >>I have "multipath" software which can hold the IO for a long time
> >>(configurable) when the network is disconnected;
> >>(3) about 30 seconds later, the whole VM hangs; nothing can be done to
> >>the VM!
> >>
> >>Then I used the "gstack" tool to collect the stacks of all qemu threads:
> >>
> >>Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
> >>#0  0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
> >>#1  0x00007fd84410ceff in aio_poll ()
> >>#2  0x00007fd84429bb05 in qemu_aio_wait ()
> >>#3  0x00007fd844120f51 in bdrv_drain_all ()
> >>#4  0x00007fd8441f1a4a in bmdma_cmd_writeb ()
> >>#5  0x00007fd8441f216e in bmdma_write ()
> >>#6  0x00007fd8443a93cf in memory_region_write_accessor ()
> >>#7  0x00007fd8443a94a6 in access_with_adjusted_size ()
> >>#8  0x00007fd8443a9901 in memory_region_iorange_write ()
> >>#9  0x00007fd8443a19bd in ioport_writeb_thunk ()
> >>#10 0x00007fd8443a13a8 in ioport_write ()
> >>#11 0x00007fd8443a1f55 in cpu_outb ()
> >>#12 0x00007fd8443a5b12 in kvm_handle_io ()
> >>#13 0x00007fd8443a64a9 in kvm_cpu_exec ()
> >>#14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
> >>#15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
> >>#16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
> >>#17 0x0000000000000000 in ?? ()
> >
> >Use virtio-blk. Read, write, and flush are asynchronous in virtio-blk.
> >
> >Note that the QEMU monitor commands are typically synchronous so they
> >will still block the VM.
> >
> >Stefan
>
> Thank you for your attention. I tested virtio-blk and it's true that the VM
> doesn't hang.
>
> Why does virtio-blk implement this in an asynchronous way, but virtio-scsi
> in a synchronous way?
There is no fundamental reason why virtio-scsi should be synchronous; it's
just that QEMU internally has some code paths (such as the bdrv_drain_all()
that Fam mentioned) that wait synchronously. Since the SCSI request
cancellation path hits bdrv_drain_all(), the guest hung. virtio-blk has no
"cancel" operation and therefore doesn't hang.

Stefan
