[Qemu-devel] the whole virtual machine hangs when IO does not come back!
Hi, I tested the reliability of qemu in the IPSAN environment as follows: (1) create one VM on a X86 server which is connected to an IPSAN, and the VM has only one system volume which is on the IPSAN; (2) disconnect the network between the server and the IPSAN. On the server, I have a "multipath" software which can hold the IO for a long time (configurable) when the network is disconnected; (3) about 30 seconds later, the whole VM hangs there, nothing can be done to the VM! Then, I used "gstack" tool to collect the stacks of all qemu threads, it looked like: Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)): #0 0x7fd84253a4f6 in poll () from /lib64/libc.so.6 #1 0x7fd84410ceff in aio_poll () #2 0x7fd84429bb05 in qemu_aio_wait () #3 0x7fd844120f51 in bdrv_drain_all () #4 0x7fd8441f1a4a in bmdma_cmd_writeb () #5 0x7fd8441f216e in bmdma_write () #6 0x7fd8443a93cf in memory_region_write_accessor () #7 0x7fd8443a94a6 in access_with_adjusted_size () #8 0x7fd8443a9901 in memory_region_iorange_write () #9 0x7fd8443a19bd in ioport_writeb_thunk () #10 0x7fd8443a13a8 in ioport_write () #11 0x7fd8443a1f55 in cpu_outb () #12 0x7fd8443a5b12 in kvm_handle_io () #13 0x7fd8443a64a9 in kvm_cpu_exec () #14 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #15 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #16 0x7fd8425439cd in clone () from /lib64/libc.so.6 #17 0x in ?? () Thread 7 (Thread 0x7fd8403b4700 (LWP 6672)): #0 0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0 #2 0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x7fd8444526bd in qemu_mutex_lock () #4 0x7fd844330f47 in qemu_mutex_lock_iothread () #5 0x7fd8443a63b9 in kvm_cpu_exec () #6 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #7 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #8 0x7fd8425439cd in clone () from /lib64/libc.so.6 #9 0x in ?? () Thread 6 (Thread 0x7fd83fbb3700 (LWP 6673)): #0 0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0 #2 0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x7fd8444526bd in qemu_mutex_lock () #4 0x7fd844330f47 in qemu_mutex_lock_iothread () #5 0x7fd8443a63b9 in kvm_cpu_exec () #6 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #7 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #8 0x7fd8425439cd in clone () from /lib64/libc.so.6 #9 0x in ?? () Thread 5 (Thread 0x7fd83f3b2700 (LWP 6674)): #0 0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0 #2 0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x7fd8444526bd in qemu_mutex_lock () #4 0x7fd844330f47 in qemu_mutex_lock_iothread () #5 0x7fd8443a63b9 in kvm_cpu_exec () #6 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #7 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #8 0x7fd8425439cd in clone () from /lib64/libc.so.6 #9 0x in ?? 
() Thread 4 (Thread 0x7fd83ebb1700 (LWP 6675)): #0 0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0 #2 0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x7fd8444526bd in qemu_mutex_lock () #4 0x7fd844330f47 in qemu_mutex_lock_iothread () #5 0x7fd8443a63b9 in kvm_cpu_exec () #6 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #7 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #8 0x7fd8425439cd in clone () from /lib64/libc.so.6 #9 0x in ?? () Thread 3 (Thread 0x7fd83e3b0700 (LWP 6676)): #0 0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0 #2 0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x7fd8444526bd in qemu_mutex_lock () #4 0x7fd844330f47 in qemu_mutex_lock_iothread () #5 0x7fd8443a63b9 in kvm_cpu_exec () #6 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #7 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #8 0x7fd8425439cd in clone () from /lib64/libc.so.6 #9 0x in ?? () Thread 2 (Thread 0x7fd23b7ff700 (LWP 6679)): #0 0x7fd8427eb61c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x7fd8444528f0 in qemu_cond_wait () #2 0x7fd844312d9d in vnc_worker_thread_loop () #3 0x7fd844313315 in vnc_worker_thread () #4 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #5 0x7fd8425439cd in clone () from /lib64/libc.so.6 #6 0x i
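The gstack output above already shows the shape of the hang: thread 8 (a VCPU thread) sits in bdrv_drain_all() -> qemu_aio_wait() -> aio_poll() -> poll(), waiting for I/O that the multipath layer is holding back, while it owns the global QEMU mutex; every other VCPU thread is therefore parked in qemu_mutex_lock_iothread(). The following self-contained program is only a sketch of that pattern, not QEMU code; it mimics the stacks above with plain pthreads and poll():

#include <poll.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t iothread_lock = PTHREAD_MUTEX_INITIALIZER;

/* like threads 7..3 above: VCPU threads blocked in qemu_mutex_lock_iothread() */
static void *vcpu_thread(void *arg)
{
    pthread_mutex_lock(&iothread_lock);
    pthread_mutex_unlock(&iothread_lock);
    return NULL;
}

int main(void)
{
    pthread_t t;

    /* the thread doing the synchronous drain already owns the big lock */
    pthread_mutex_lock(&iothread_lock);
    pthread_create(&t, NULL, vcpu_thread, NULL);

    /* bdrv_drain_all()-like loop: poll for completions that never arrive
     * because the storage path is down; the lock is never released, so the
     * whole "VM" appears hung */
    for (;;) {
        poll(NULL, 0, 1000);   /* like thread 8 sitting in poll() */
        fprintf(stderr, "still draining...\n");
    }

    return 0;   /* not reached */
}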
Re: [Qemu-devel] the whole virtual machine hangs when IO does not come back!
On 2014/8/11 22:21, Stefan Hajnoczi wrote: On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote: Hi, I tested the reliability of qemu in the IPSAN environment as follows: (1) create one VM on a X86 server which is connected to an IPSAN, and the VM has only one system volume which is on the IPSAN; (2) disconnect the network between the server and the IPSAN. On the server, I have a "multipath" software which can hold the IO for a long time (configurable) when the network is disconnected; (3) about 30 seconds later, the whole VM hangs there, nothing can be done to the VM! Then, I used "gstack" tool to collect the stacks of all qemu threads, it looked like: Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)): #0 0x7fd84253a4f6 in poll () from /lib64/libc.so.6 #1 0x7fd84410ceff in aio_poll () #2 0x7fd84429bb05 in qemu_aio_wait () #3 0x7fd844120f51 in bdrv_drain_all () #4 0x7fd8441f1a4a in bmdma_cmd_writeb () #5 0x7fd8441f216e in bmdma_write () #6 0x7fd8443a93cf in memory_region_write_accessor () #7 0x7fd8443a94a6 in access_with_adjusted_size () #8 0x7fd8443a9901 in memory_region_iorange_write () #9 0x7fd8443a19bd in ioport_writeb_thunk () #10 0x7fd8443a13a8 in ioport_write () #11 0x7fd8443a1f55 in cpu_outb () #12 0x7fd8443a5b12 in kvm_handle_io () #13 0x7fd8443a64a9 in kvm_cpu_exec () #14 0x7fd844330962 in qemu_kvm_cpu_thread_fn () #15 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0 #16 0x7fd8425439cd in clone () from /lib64/libc.so.6 #17 0x in ?? () Use virtio-blk. Read, write, and flush are asynchronous in virtio-blk. Note that the QEMU monitor commands are typically synchronous so they will still block the VM. Stefan Thank you for your attention. I tested virtio-blk and it's true that the VM doesn't hange. Why does the virtio-blk implement this in asynchronous way, but virtio-scsi in synchronous way?
[Qemu-devel] Does the event idx mechanism in virtio work in the correct way?
Hi,

The event idx in virtio is an effective way to reduce the number of interrupts and exits of the guest. When the guest puts a request into the virtio ring, it doesn't exit immediately to inform the backend. Instead, the guest checks the "avail" event idx to decide whether a notification is needed. For example, assume the guest has already put five requests in the ring and the backend has taken two of them; the "avail" event idx should then be two. Now the guest puts another request into the ring. It checks the "avail" event idx and finds that the event idx (equal to 2) is less than the number of available requests (equal to 6), so it will not notify the backend. The backend works in a loop, taking requests from the ring until the ring is empty or all requests have been taken.

However, I find some code doesn't work in the way I thought:

int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
{
    ...
    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
    if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
        vring_avail_event(vq, vring_avail_idx(vq));
    }
    ...
}

In the statement "vring_avail_event(vq, vring_avail_idx(vq));", I think the "avail" event idx should equal the number of requests that have been taken (vq->last_avail_idx), not the number of all available requests (vring_avail_idx(vq)). Is there any special consideration, or do I just misunderstand the event idx?

thanks
Bin Wu
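(For reference, the notification rule described above is the standard event-idx check from the virtio specification, known in the Linux and QEMU sources as vring_need_event. The following is a small self-contained sketch, not code taken from QEMU, that replays the numbers from the example above.)

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Notify only if event_idx lies in the window (old_idx, new_idx], with
 * 16-bit wrap-around handled by the unsigned subtractions. */
static bool need_notify(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

int main(void)
{
    /* the example above: 6 requests available (new_idx = 6), the guest just
     * added one (old_idx = 5), the backend advertised avail event idx = 2 */
    printf("notify: %d\n", need_notify(2, 6, 5));   /* prints 0: no exit */
    /* but when the guest crossed the advertised index (2 -> 3) it did notify */
    printf("notify: %d\n", need_notify(2, 3, 2));   /* prints 1 */
    return 0;
}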
[Qemu-devel] [PATCH] hw/scsi/virtio-scsi.c: fix the "type" use error in virtio_scsi_handle_ctrl
The local variable "type" in virtio_scsi_handle_ctrl represents the TMF command type from the guest and has the same meaning as req->req.tmf.type. However, before virtio_scsi_parse_req is invoked, req->req.tmf.type doesn't have the correct value (it is just initialized to zero). Therefore, we need to use the "type" variable to distinguish the cases.

Signed-off-by: Bin Wu
---
 hw/scsi/virtio-scsi.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index a1725b8..5742d39 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -378,8 +378,8 @@ void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
         return;
     }
 
-    virtio_tswap32s(vdev, &req->req.tmf.type);
-    if (req->req.tmf.type == VIRTIO_SCSI_T_TMF) {
+    virtio_tswap32s(vdev, &type);
+    if (type == VIRTIO_SCSI_T_TMF) {
         if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlTMFReq),
                                   sizeof(VirtIOSCSICtrlTMFResp)) < 0) {
             virtio_scsi_bad_req();
@@ -387,8 +387,8 @@ void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
             r = virtio_scsi_do_tmf(s, req);
         }
-    } else if (req->req.tmf.type == VIRTIO_SCSI_T_AN_QUERY ||
-               req->req.tmf.type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
+    } else if (type == VIRTIO_SCSI_T_AN_QUERY ||
+               type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
         if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlANReq),
                                   sizeof(VirtIOSCSICtrlANResp)) < 0) {
             virtio_scsi_bad_req();
-- 
1.7.12.4
[Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error
The event idx in virtio is an effective way to reduce the number of interrupts and exits of the guest. When the guest puts a request into the virtio ring, it doesn't exit immediately to inform the backend. Instead, the guest checks the "avail" event idx to decide whether a notification is needed.

In virtqueue_pop, when a request is popped, the current avail event idx should be set to the value of vq->last_avail_idx.

Signed-off-by: Bin Wu
---
 hw/virtio/virtio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2c236bf..013979a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
     i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
     if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-        vring_avail_event(vq, vring_avail_idx(vq));
+        vring_avail_event(vq, vq->last_avail_idx);
     }
 
     if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
-- 
1.7.12.4
Re: [Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error
On 2014/10/28 13:32, Michael S. Tsirkin wrote:
> On Tue, Oct 28, 2014 at 02:13:02AM +0000, Bin Wu wrote:
>> The event idx in virtio is an effective way to reduce the number of
>> interrupts and exits of the guest. When the guest puts an request
>> into the virtio ring, it doesn't exit immediately to inform the
>> backend. Instead, the guest checks the "avail" event idx to determine
>> the notification.
>>
>> In virtqueue_pop, when a request is poped, the current avail event
>> idx should be set to the number of vq->last_avail_idx.
>>
>> Signed-off-by: Bin Wu
>
> Does this fix some observable bug? Improve efficiency for some workload?
>

It improves efficiency. I tested as follows:
(1) create a suse11sp3 VM with a virtio-blk test disk;
(2) use the libaio library to submit 32 IOs concurrently (invoke io_submit 32 times) in the VM;
(3) modify the virtio-blk driver in the guest to record the host notification times.
Without this patch, virtio-blk may notify the host between 10 and 30 times; with this patch, virtio-blk notifies the host only once during every test :)

>> ---
>> hw/virtio/virtio.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 2c236bf..013979a 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>>
>>  i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
>>  if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
>> -vring_avail_event(vq, vring_avail_idx(vq));
>> +vring_avail_event(vq, vq->last_avail_idx);
>>  }
>>
>>  if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
>> --
>> 1.7.12.4
>
> .
>

--
Bin Wu
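(For context, the guest-side load in step (2) above can be generated with a small libaio program like the sketch below. It is only an illustration, not the exact test used; the device path /dev/vdb, the request size and the error handling are assumptions. It issues one io_submit() call per request so that 32 writes are in flight concurrently; build it inside the guest with "gcc -o aiotest aiotest.c -laio".)

#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NR_REQS  32
#define REQ_SIZE 4096

int main(void)
{
    io_context_t ctx = 0;
    struct iocb iocbs[NR_REQS];
    struct io_event events[NR_REQS];
    void *buf;
    int fd, i;

    /* the virtio-blk test disk inside the guest (adjust the device name) */
    fd = open("/dev/vdb", O_RDWR | O_DIRECT);
    if (fd < 0 || io_setup(NR_REQS, &ctx) < 0) {
        perror("setup");
        return 1;
    }
    if (posix_memalign(&buf, 4096, REQ_SIZE)) {
        return 1;
    }
    memset(buf, 0xab, REQ_SIZE);

    /* 32 separate io_submit() calls, so 32 requests are in flight at once */
    for (i = 0; i < NR_REQS; i++) {
        struct iocb *p = &iocbs[i];
        io_prep_pwrite(p, fd, buf, REQ_SIZE, (long long)i * REQ_SIZE);
        if (io_submit(ctx, 1, &p) != 1) {
            perror("io_submit");
            return 1;
        }
    }

    /* wait for all completions */
    if (io_getevents(ctx, NR_REQS, NR_REQS, events, NULL) < 0) {
        perror("io_getevents");
        return 1;
    }
    io_destroy(ctx);
    close(fd);
    return 0;
}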
[Qemu-devel] [PATCH v2] hw/virtio/vring/event_idx: fix the vring_avail_event error
The event idx in virtio is an effective way to reduce the number of interrupts and exits of the guest. When the guest puts a request into the virtio ring, it doesn't exit immediately to inform the backend. Instead, the guest checks the "avail" event idx to decide whether a notification is needed.

In virtqueue_pop, when a request is popped, the current avail event idx should be set to the value of vq->last_avail_idx.

Signed-off-by: Bin Wu
---
V1 -> V2: update the same code in hw/virtio/dataplane/vring.c (Stefan)
---
 hw/virtio/dataplane/vring.c | 8 ++++----
 hw/virtio/virtio.c          | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 372706a..61f6d83 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -352,10 +352,6 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
         goto out;
     }
 
-    if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-        vring_avail_event(&vring->vr) = vring->vr.avail->idx;
-    }
-
     i = head;
     do {
         if (unlikely(i >= num)) {
@@ -392,6 +388,10 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
     /* On success, increment avail index. */
     vring->last_avail_idx++;
 
+    if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
+        vring_avail_event(&vring->vr) = vring->last_avail_idx;
+    }
+
     return head;
 
 out:
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2c236bf..013979a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
     i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
     if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-        vring_avail_event(vq, vring_avail_idx(vq));
+        vring_avail_event(vq, vq->last_avail_idx);
     }
 
     if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
-- 
1.7.12.4
Re: [Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error
On 2014/10/31 0:48, Stefan Hajnoczi wrote: > On Tue, Oct 28, 2014 at 02:13:02AM +0000, Bin Wu wrote: >> The event idx in virtio is an effective way to reduce the number of >> interrupts and exits of the guest. When the guest puts an request >> into the virtio ring, it doesn't exit immediately to inform the >> backend. Instead, the guest checks the "avail" event idx to determine >> the notification. >> >> In virtqueue_pop, when a request is poped, the current avail event >> idx should be set to the number of vq->last_avail_idx. >> >> Signed-off-by: Bin Wu >> --- >> hw/virtio/virtio.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c >> index 2c236bf..013979a 100644 >> --- a/hw/virtio/virtio.c >> +++ b/hw/virtio/virtio.c >> @@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem) >> >> i = head = virtqueue_get_head(vq, vq->last_avail_idx++); >> if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) { >> -vring_avail_event(vq, vring_avail_idx(vq)); >> +vring_avail_event(vq, vq->last_avail_idx); >> } > > hw/virtio/dataplane/vring.c: > > if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) { > vring_avail_event(&vring->vr) = vring->vr.avail->idx; > } > > This is equivalent to the virtio.c code you are replacing, so it should > be updated too. > > Stefan > Thanks, stefan. I'll send another patch later. -- Bin Wu
Re: [Qemu-devel] [PATCH v5 00/22] block: Asynchronous request cancellation
On 2014/9/10 13:59, Fam Zheng wrote: v5: Fix IDE callback. (Paolo) Fix blkdebug. (Paolo) Drop the DMA fix which is independent of this series. (Paolo) Incorperate Yuan's patch on quorum_aio_cancel. (Benoît) Commit message wording fix. (Benoît) Rename qemu_aio_release to qemu_aio_unref. (Benoît) v4: Drop AIOCBInfo.cancel. This series adds a new block layer API: void bdrv_aio_cancel_async(BlockDriverAIOCB *acb); And use it to emulate bdrv_aio_cancel. The function is similar to bdrv_aio_cancel in that it cancels an AIO request, but different that it doesn't block until the request is completely cancelled or done. More importantly, the completion callback, BlockDriverAIOCB.cb, is guaranteed to be called, so that the cb can take care of resource releasing and status reporting to guest, etc. In the following work, scsi emulation code will be shifted to use the async cancelling. One major benefit would be that when guest tries to cancel a request, where the request cannot be cancelled easily, (due to throttled BlockDriverState, a lost connection, or a large request queue), we don't need to block the whole vm with a busy loop, which is how bdrv_aio_cancel is implemented now. First, I think this series is really useful. However, I tested the v4 series and found virtio-scsi disk(scsi-hd) was still blocked when the IO could not come back because of virtio_scsi_do_tmf->scsi_cancel_io->bdrv_aio_cancel. can we just change the bdrv_aio_cancel to bdrv_aio_cancel_async to solve this problem? A test case that is easy to reproduce is, throttle a scsi-disk to a very low limit, for example 50 bps, then stress the guest block device with dd or fio. Currently, the vm will quickly hang when it loses patience and send a tmf command to cancel the request, at which point we will busy wait in bdrv_aio_cancel, until the request is slowly spit out from throttled_reqs. Later, we will change scsi device code to make this asynchronous, on top of bdrv_aio_cancel_async. Fam Fam Zheng (21): ide/ahci: Check for -ECANCELED in aio callbacks block: Add refcnt in BlockDriverAIOCB block: Add bdrv_aio_cancel_async block: Drop bdrv_em_co_aiocb_info.cancel block: Convert bdrv_em_aiocb_info.cancel to .cancel_async thread-pool: Convert thread_pool_aiocb_info.cancel to cancel_async linux-aio: Convert laio_aiocb_info.cancel to .cancel_async dma: Convert dma_aiocb_info.cancel to .cancel_async iscsi: Convert iscsi_aiocb_info.cancel to .cancel_async archipelago: Drop archipelago_aiocb_info.cancel blkdebug: Drop blkdebug_aiocb_info.cancel blkverify: Drop blkverify_aiocb_info.cancel curl: Drop curl_aiocb_info.cancel qed: Drop qed_aiocb_info.cancel quorum: Convert quorum_aiocb_info.cancel to .cancel_async rbd: Drop rbd_aiocb_info.cancel sheepdog: Convert sd_aiocb_info.cancel to .cancel_async win32-aio: Drop win32_aiocb_info.cancel ide: Convert trim_aiocb_info.cancel to .cancel_async block: Drop AIOCBInfo.cancel block: Rename qemu_aio_release -> qemu_aio_unref Liu Yuan (1): quorum: fix quorum_aio_cancel() block.c | 69 block/archipelago.c | 19 ++--- block/blkdebug.c | 17 ++-- block/blkverify.c| 21 +-- block/curl.c | 16 --- block/iscsi.c| 23 block/linux-aio.c| 34 +++- block/qed.c | 23 +--- block/quorum.c | 11 block/rbd.c | 25 ++ block/sheepdog.c | 54 - block/win32-aio.c| 18 ++--- dma-helpers.c| 20 +++--- hw/ide/ahci.c| 3 +++ hw/ide/core.c| 26 -- include/block/aio.h | 7 +++-- include/block/block.h| 1 + tests/test-thread-pool.c | 34 ++-- thread-pool.c| 36 +++-- 19 files changed, 172 insertions(+), 285 deletions(-)
[Qemu-devel] [PATCH] qemu-coroutine: segfault when restarting co_queue
From: Bin Wu We tested VMs migration with their disk images by drive_mirror. With migration, two VMs copyed large files between each other. During the test, a segfault occured. The stack was as follow: (gdb) bt #0 0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at qemu-coroutine-lock.c:66 #1 0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160, to=0x7fa5a1798648) at qemu-coroutine.c:97 #2 0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140 #3 0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0, request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at block/nbd-client.c:165 #4 0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at block/nbd-client.c:262 #5 0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd-client.c:296 #6 0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 #7 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0, req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 #8 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 #9 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 #10 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 #11 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 #12 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110, req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 #13 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 #14 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 #15 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706 #16 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at coroutine-ucontext.c:121 #17 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6 #18 0x in ?? () After analyzing the stack and reviewing the code, we find the qemu_co_queue_run_restart should not be put in the coroutine_swap function which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only qemu_coroutine_enter needs to restart the co_queue. The error scenario is given in the following patch log. Bin Wu (1): qemu-coroutine: fix qemu_co_queue_run_restart error qemu-coroutine.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) -- 1.7.12.4
[Qemu-devel] [PATCH] qemu-coroutine: fix qemu_co_queue_run_restart error
From: Bin Wu

The error scenario is as follows: coroutine C1 enters C2, C2 yields back to C1, then C1 terminates and the related coroutine memory becomes invalid. After a while, the C2 coroutine is entered again. At this point, C1 is used as a parameter passed to qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart accesses invalid memory and a segfault occurs.

The qemu_co_queue_run_restart function re-enters coroutines waiting in the co_queue. However, this function should only be used in the qemu_coroutine_enter context. Only in this context, when the current coroutine gets execution control again (after the execution of qemu_coroutine_switch), can we restart the target coroutine, because the target coroutine has either yielded back to the current coroutine or terminated.

At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter, but we found we cannot access the target coroutine if it terminates.

Signed-off-by: Bin Wu
---
 qemu-coroutine.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 525247b..9a294c4 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -99,24 +99,25 @@ static void coroutine_delete(Coroutine *co)
     qemu_coroutine_delete(co);
 }
 
-static void coroutine_swap(Coroutine *from, Coroutine *to)
+static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
 {
     CoroutineAction ret;
 
     ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
 
-    qemu_co_queue_run_restart(to);
-
     switch (ret) {
     case COROUTINE_YIELD:
-        return;
+        break;
     case COROUTINE_TERMINATE:
         trace_qemu_coroutine_terminate(to);
+        qemu_co_queue_run_restart(to);
         coroutine_delete(to);
-        return;
+        break;
     default:
         abort();
     }
+
+    return ret;
 }
 
 void qemu_coroutine_enter(Coroutine *co, void *opaque)
@@ -133,6 +134,8 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
     co->caller = self;
     co->entry_arg = opaque;
     coroutine_swap(self, co);
+    if (ret == COROUTINE_YIELD)
+        qemu_co_queue_run_restart(co);
 }
 
 void coroutine_fn qemu_coroutine_yield(void)
-- 
1.7.12.4
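(To make the scenario in the commit message concrete, here is a hedged sketch of the call sequence that triggers the problem, written against the coroutine API of this QEMU version: qemu_coroutine_create, qemu_coroutine_enter and qemu_coroutine_yield. The include path and the functions c1_fn/c2_fn/reproduce are assumptions for illustration; this is not a regression test taken from the tree.)

#include "block/coroutine.h"   /* assumed header location for this era */

static Coroutine *c2;

static void coroutine_fn c2_fn(void *opaque)
{
    /* yields back to C1; when C2 is entered again later, the old
     * coroutine_swap() resumes right after this switch and runs
     * qemu_co_queue_run_restart() on its stale "to" pointer (C1) */
    qemu_coroutine_yield();
}

static void coroutine_fn c1_fn(void *opaque)
{
    c2 = qemu_coroutine_create(c2_fn);
    qemu_coroutine_enter(c2, NULL);
    /* C1 returns here and terminates; its memory is released or recycled */
}

static void reproduce(void)
{
    Coroutine *c1 = qemu_coroutine_create(c1_fn);

    qemu_coroutine_enter(c1, NULL);   /* C1 runs, enters C2, then terminates */
    qemu_coroutine_enter(c2, NULL);   /* re-entering C2 touches the dead C1 */
}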
Re: [Qemu-devel] [PATCH] qemu-coroutine: fix qemu_co_queue_run_restart error
sorry, there is a mistake in this patch: the "ret" variable is not defined :< I will send a new patch to fix this problem. On 2015/2/9 12:09, Bin Wu wrote: > From: Bin Wu > > The error scenario is as follow: coroutine C1 enters C2, C2 yields > back to C1, then C1 ternimates and the related coroutine memory > becomes invalid. After a while, the C2 coroutine is entered again. > At this point, C1 is used as a parameter passed to > qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart > accesses an invalid memory and a segfault error ocurrs. > > The qemu_co_queue_run_restart function re-enters coroutines waiting > in the co_queue. However, this function should be only used int the > qemu_coroutine_enter context. Only in this context, when the current > coroutine gets execution control again(after the execution of > qemu_coroutine_switch), we can restart the target coutine because the > target coutine has yielded back to the current coroutine or it has > terminated. > > First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter, > but we find we can not access the target coroutine if it terminates. > > Signed-off-by: Bin Wu > --- > qemu-coroutine.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/qemu-coroutine.c b/qemu-coroutine.c > index 525247b..9a294c4 100644 > --- a/qemu-coroutine.c > +++ b/qemu-coroutine.c > @@ -99,24 +99,25 @@ static void coroutine_delete(Coroutine *co) > qemu_coroutine_delete(co); > } > > -static void coroutine_swap(Coroutine *from, Coroutine *to) > +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to) > { > CoroutineAction ret; > > ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD); > > -qemu_co_queue_run_restart(to); > - > switch (ret) { > case COROUTINE_YIELD: > -return; > +break; > case COROUTINE_TERMINATE: > trace_qemu_coroutine_terminate(to); > +qemu_co_queue_run_restart(to); > coroutine_delete(to); > -return; > +break; > default: > abort(); > } > + > +return ret; > } > > void qemu_coroutine_enter(Coroutine *co, void *opaque) > @@ -133,6 +134,8 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque) > co->caller = self; > co->entry_arg = opaque; > coroutine_swap(self, co); > +if (ret == COROUTINE_YIELD) > +qemu_co_queue_run_restart(co); > } > > void coroutine_fn qemu_coroutine_yield(void) > -- Bin Wu
[Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue
From: Bin Wu We tested VMs migration with their disk images by drive_mirror. With migration, two VMs copyed large files between each other. During the test, a segfault occured. The stack was as follow: (gdb) bt qemu-coroutine-lock.c:66 to=0x7fa5a1798648) at qemu-coroutine.c:97 request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at block/nbd-client.c:165 sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at block/nbd-client.c:262 sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd-client.c:296 nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 coroutine-ucontext.c:121 After analyzing the stack and reviewing the code, we find the qemu_co_queue_run_restart should not be put in the coroutine_swap function which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only qemu_coroutine_enter needs to restart the co_queue. The error scenario is as follow: coroutine C1 enters C2, C2 yields back to C1, then C1 ternimates and the related coroutine memory becomes invalid. After a while, the C2 coroutine is entered again. At this point, C1 is used as a parameter passed to qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart accesses an invalid memory and a segfault error ocurrs. The qemu_co_queue_run_restart function re-enters coroutines waiting in the co_queue. However, this function should be only used int the qemu_coroutine_enter context. Only in this context, when the current coroutine gets execution control again(after the execution of qemu_coroutine_switch), we can restart the target coutine because the target coutine has yielded back to the current coroutine or it has terminated. First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter, but we find we can not access the target coroutine if it terminates. 
Signed-off-by: Bin Wu --- qemu-coroutine.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/qemu-coroutine.c b/qemu-coroutine.c index 525247b..cc0bdfa 100644 --- a/qemu-coroutine.c +++ b/qemu-coroutine.c @@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co) qemu_coroutine_delete(co); } -static void coroutine_swap(Coroutine *from, Coroutine *to) +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to) { CoroutineAction ret; ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD); -qemu_co_queue_run_restart(to); - switch (ret) { case COROUTINE_YIELD: -return; +break; case COROUTINE_TERMINATE: trace_qemu_coroutine_terminate(to); +qemu_co_queue_run_restart(to); coroutine_delete(to); -return; +break; default: abort(); } + +return ret; } void qemu_coroutine_enter(Coroutine *co, void *opaque) { Coroutine *self = qemu_coroutine_self(); +CoroutineAction ret; trace_qemu_coroutine_enter(self, co, opaque); @@ -132,7 +134,9 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque) co->caller = self; co->entry_arg = opaque; -coroutine_swap(self, co); +ret = coroutine_swap(self, co); +if (ret == COROUTINE_YIELD) +qemu_co_queue_run_restart(co); } void coroutine_fn qemu_coroutine_yield(void) -- 1.7.12.4
Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding bug
On 2015/2/9 16:12, Fam Zheng wrote: > On Sat, 02/07 17:51, w00214312 wrote: >> From: Bin Wu >> >> When a coroutine holds a lock, other coroutines who want to get >> the lock must wait on a co_queue by adding themselves to the >> CoQueue. However, if a waiting coroutine is woken up with the >> lock still be holding by other coroutine, this waiting coroutine > > Could you explain who wakes up the waiting coroutine? Maybe the bug is that > it shouldn't be awaken in the first place. > During the mirror phase with nbd devices, if we send a cancel command or physical network breaks down, the source qemu process will receive a readable event and the main loop will invoke nbd_reply_ready to deal with it. This function finds the connection is down and then goes into nbd_teardown_connection. nbd_teardown_connection wakes up all working coroutines by nbd_recv_coroutines_enter_all. These coroutines include the one which holds the sending lock, the ones which wait for the lock, and the ones which wait for receiving messages. I think the purpose of nbd_recv_coroutines_enter_all is to terminate all waiting coroutines by waking all of them up. If the coroutine waiting for the lock is allowed for waking up, this implementation is ok. If not, we need to distinguish the coroutines waiting for receiving messages from the ones waiting for the lock. In my option, if the coroutines waiting for a lock is allowd for waking up, it should be more robust :> >> will add itself to the co_queue again. Latter, when the lock >> is released, a coroutine re-enter will occur. >> >> We need to determine whether a coroutine is alread in the co_queue > > s/alread/already/ > > Fam > Thanks, my mistake. >> before adding it to the waiting queue. >> >> Signed-off-by: Bin Wu >> --- >> include/block/coroutine_int.h | 1 + >> qemu-coroutine-lock.c | 6 +- >> qemu-coroutine.c | 1 + >> 3 files changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/include/block/coroutine_int.h b/include/block/coroutine_int.h >> index f133d65..c524990 100644 >> --- a/include/block/coroutine_int.h >> +++ b/include/block/coroutine_int.h >> @@ -42,6 +42,7 @@ struct Coroutine { >> /* Coroutines that should be woken up when we yield or terminate */ >> QTAILQ_HEAD(, Coroutine) co_queue_wakeup; >> QTAILQ_ENTRY(Coroutine) co_queue_next; >> +bool in_co_queue; >> }; >> >> Coroutine *qemu_coroutine_new(void); >> diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c >> index e4860ae..d256f53 100644 >> --- a/qemu-coroutine-lock.c >> +++ b/qemu-coroutine-lock.c >> @@ -36,7 +36,10 @@ void qemu_co_queue_init(CoQueue *queue) >> void coroutine_fn qemu_co_queue_wait(CoQueue *queue) >> { >> Coroutine *self = qemu_coroutine_self(); >> -QTAILQ_INSERT_TAIL(&queue->entries, self, co_queue_next); >> +if (!self->in_co_queue) { >> +QTAILQ_INSERT_TAIL(&queue->entries, self, co_queue_next); >> +self->in_co_queue = true; >> +} >> qemu_coroutine_yield(); >> assert(qemu_in_coroutine()); >> } >> @@ -71,6 +74,7 @@ static bool qemu_co_queue_do_restart(CoQueue *queue, bool >> single) >> >> while ((next = QTAILQ_FIRST(&queue->entries)) != NULL) { >> QTAILQ_REMOVE(&queue->entries, next, co_queue_next); >> +next->in_co_queue = false; >> QTAILQ_INSERT_TAIL(&self->co_queue_wakeup, next, co_queue_next); >> trace_qemu_co_queue_next(next); >> if (single) { >> diff --git a/qemu-coroutine.c b/qemu-coroutine.c >> index 525247b..a103721 100644 >> --- a/qemu-coroutine.c >> +++ b/qemu-coroutine.c >> @@ -75,6 +75,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry) >> } >> >> co->entry = 
entry; >> +co->in_co_queue = false; >> QTAILQ_INIT(&co->co_queue_wakeup); >> return co; >> } >> -- >> 1.7.12.4 >> >> > > . > -- Bin Wu
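(For readers following the discussion: the wake-all path referred to above looks roughly like the sketch below. This is a reconstruction based on the description in this thread, not a verbatim copy of block/nbd-client.c. On teardown every slot in recv_coroutine[] is re-entered, whether that coroutine is waiting for a reply, waiting for the send lock, or currently holding it.)

static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
    int i;

    /* wake every request coroutine so it can see the dead connection */
    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
        if (s->recv_coroutine[i]) {
            qemu_coroutine_enter(s->recv_coroutine[i], NULL);
        }
    }
}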
Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug
On 2015/2/9 17:23, Paolo Bonzini wrote: > > > On 07/02/2015 10:51, w00214312 wrote: >> From: Bin Wu >> >> When we test the drive_mirror between different hosts by ndb devices, >> we find that, during the cancel phase the qemu process crashes sometimes. >> By checking the crash core file, we find the stack as follows, which means >> a coroutine re-enter error occurs: > > This bug probably can be fixed simply by delaying the setting of > recv_coroutine. > > What are the symptoms if you only apply your "qemu-coroutine-lock: fix > co_queue multi-adding bug" patch but not "qemu-coroutine: fix > qemu_co_queue_run_restart error"? > > Can you try the patch below? (Compile-tested only). > yes, I think this patch can solve the problem too. I will try the patch latter. > diff --git a/block/nbd-client.c b/block/nbd-client.c > index 6e1c97c..23d6a71 100644 > --- a/block/nbd-client.c > +++ b/block/nbd-client.c > @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s, > QEMUIOVector *qiov, int offset) > { > AioContext *aio_context; > -int rc, ret; > +int rc, ret, i; > > qemu_co_mutex_lock(&s->send_mutex); > + > +for (i = 0; i < MAX_NBD_REQUESTS; i++) { > +if (s->recv_coroutine[i] == NULL) { > +s->recv_coroutine[i] = qemu_coroutine_self(); > +break; > +} > +} > + > +assert(i < MAX_NBD_REQUESTS); > +request->handle = INDEX_TO_HANDLE(s, i); > s->send_coroutine = qemu_coroutine_self(); > + > aio_context = bdrv_get_aio_context(s->bs); > aio_set_fd_handler(aio_context, s->sock, > nbd_reply_ready, nbd_restart_write, s); > @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s, > static void nbd_coroutine_start(NbdClientSession *s, > struct nbd_request *request) > { > -int i; > - > /* Poor man semaphore. The free_sema is locked when no other request > * can be accepted, and unlocked after receiving one reply. */ > if (s->in_flight >= MAX_NBD_REQUESTS - 1) { > @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s, > } > s->in_flight++; > > -for (i = 0; i < MAX_NBD_REQUESTS; i++) { > -if (s->recv_coroutine[i] == NULL) { > -s->recv_coroutine[i] = qemu_coroutine_self(); > - break; > -} > -} > - > -assert(i < MAX_NBD_REQUESTS); > -request->handle = INDEX_TO_HANDLE(s, i); > +/* s->recv_coroutine[i] is set as soon as we get the send_lock. */ > } > > static void nbd_coroutine_end(NbdClientSession *s, > > > -- Bin Wu
Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue
On 2015/2/9 22:48, Stefan Hajnoczi wrote: > On Mon, Feb 09, 2015 at 02:50:39PM +0800, Bin Wu wrote: >> From: Bin Wu >> >> We tested VMs migration with their disk images by drive_mirror. With >> migration, two VMs copyed large files between each other. During the >> test, a segfault occured. The stack was as follow: >> >> (gdb) bt >> qemu-coroutine-lock.c:66 >> to=0x7fa5a1798648) at qemu-coroutine.c:97 >> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at >> block/nbd-client.c:165 >> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at >> block/nbd-client.c:262 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at >> block/nbd-client.c:296 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 >> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 >> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> coroutine-ucontext.c:121 > > This backtrace is incomplete. Where are the function names? The > parameter lists appear incomplete too. > I put the stack in the git log, so some lines are missed: (gdb) bt #0 0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at qemu-coroutine-lock.c:66 #1 0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160, to=0x7fa5a1798648) at qemu-coroutine.c:97 #2 0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140 #3 0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0, request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at block/nbd-client.c:165 #4 0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at block/nbd-client.c:262 #5 0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd-client.c:296 #6 0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 #7 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0, req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 #8 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 #9 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 #10 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 #11 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 #12 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110, req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 #13 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110, offset=4378984448, bytes=1048576, 
qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 #14 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 #15 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706 #16 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at coroutine-ucontext.c:121 #17 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6 #18 0x in ?? () >> After analyzing the stack and reviewing the code, we find the >> qemu_co_queue_run_restart should not be put in the coroutine_swap function >> which >> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only >> qemu_coroutine_enter needs to restart the co_queue. >> >> The error scenario is as follow: coroutine C1 enters C2, C2 yields >> back to C1, then C1 ternimates and the related coroutine memory >> becomes invalid. After a while, the C2 coroutine is entered again. >> At this point, C1 is used as a parameter passed to >> qemu_co_queue_run_restart. Therefore, qemu
Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue
On 2015/2/9 17:09, Paolo Bonzini wrote: > > > On 09/02/2015 07:50, Bin Wu wrote: >> From: Bin Wu >> >> We tested VMs migration with their disk images by drive_mirror. With >> migration, two VMs copyed large files between each other. During the >> test, a segfault occured. The stack was as follow: >> >> (gdb) bt >> qemu-coroutine-lock.c:66 >> to=0x7fa5a1798648) at qemu-coroutine.c:97 >> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at >> block/nbd-client.c:165 >> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at >> block/nbd-client.c:262 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at >> block/nbd-client.c:296 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 >> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 >> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> coroutine-ucontext.c:121 >> >> After analyzing the stack and reviewing the code, we find the >> qemu_co_queue_run_restart should not be put in the coroutine_swap function >> which >> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only >> qemu_coroutine_enter needs to restart the co_queue. >> >> The error scenario is as follow: coroutine C1 enters C2, C2 yields >> back to C1, then C1 ternimates and the related coroutine memory >> becomes invalid. After a while, the C2 coroutine is entered again. >> At this point, C1 is used as a parameter passed to >> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart >> accesses an invalid memory and a segfault error ocurrs. >> >> The qemu_co_queue_run_restart function re-enters coroutines waiting >> in the co_queue. However, this function should be only used int the >> qemu_coroutine_enter context. Only in this context, when the current >> coroutine gets execution control again(after the execution of >> qemu_coroutine_switch), we can restart the target coutine because the >> target coutine has yielded back to the current coroutine or it has >> terminated. > > qemu_coroutine_yield can be executed for other reasons than locks. In > those cases, it is correct to call qemu_co_queue_run_restart. I think > it's an NBD bug. > > Paolo > Maybe I didn't describe the error scenario clearly, but it's a normal coroutine using case, not a NBD bug. Please reference stefan's reply, thanks. >> First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter, >> but we find we can not access the target coroutine if it terminates. 
>> >> Signed-off-by: Bin Wu >> --- >> qemu-coroutine.c | 16 ++-- >> 1 file changed, 10 insertions(+), 6 deletions(-) >> >> diff --git a/qemu-coroutine.c b/qemu-coroutine.c >> index 525247b..cc0bdfa 100644 >> --- a/qemu-coroutine.c >> +++ b/qemu-coroutine.c >> @@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co) >> qemu_coroutine_delete(co); >> } >> >> -static void coroutine_swap(Coroutine *from, Coroutine *to) >> +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to) >> { >> CoroutineAction ret; >> >> ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD); >> >> -qemu_co_queue_run_restart(to); >> - >> switch (ret) { >> case COROUTINE_YIELD: >> -return; >> +break; >> case COROUTINE_TERMINATE: >> trace_qemu_coroutine_terminate(to); >> +qemu_co_queue_run_restart(to); >> coroutine_delete(to); >> -return; >> +break; >> default: >> abort(); >> } >> + >> + return ret; >> } >> >> void qemu_coroutine_enter(Coroutine *co, void *opaque) >> { >> Coroutine *self = qemu_coroutine_self(); >> +CoroutineAction ret; >> >> trace_qemu_coroutine_enter(self, co, opaque); >> >> @@ -132,7 +134,9 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque) >> >> co->caller = self; >> co->entry_arg = opaque; >> -coroutine_swap(self, co); >> +ret = coroutine_swap(self, co); >> +if (ret == COROUTINE_YIELD) >> +qemu_co_queue_run_restart(co); >> } >> >> void coroutine_fn qemu_coroutine_yield(void) >> > > > . > -- Bin Wu
Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding bug
On 2015/2/9 18:12, Kevin Wolf wrote: > Am 09.02.2015 um 10:36 hat Bin Wu geschrieben: >> On 2015/2/9 16:12, Fam Zheng wrote: >>> On Sat, 02/07 17:51, w00214312 wrote: >>>> From: Bin Wu >>>> >>>> When a coroutine holds a lock, other coroutines who want to get >>>> the lock must wait on a co_queue by adding themselves to the >>>> CoQueue. However, if a waiting coroutine is woken up with the >>>> lock still be holding by other coroutine, this waiting coroutine >>> >>> Could you explain who wakes up the waiting coroutine? Maybe the bug is that >>> it shouldn't be awaken in the first place. >>> >> >> During the mirror phase with nbd devices, if we send a cancel command or >> physical network breaks down, the source qemu process will receive a readable >> event and the main loop will invoke nbd_reply_ready to deal with it. This >> function finds the connection is down and then goes into >> nbd_teardown_connection. nbd_teardown_connection wakes up all working >> coroutines >> by nbd_recv_coroutines_enter_all. These coroutines include the one which >> holds >> the sending lock, the ones which wait for the lock, and the ones which wait >> for >> receiving messages. > > This is the bug. It's not allowed to reenter a coroutine if you don't > know its state. NBD needs a fix, not the the generic coroutine > infrastructure. > > If we want to change anything in the lock implementation, it should be > adding an assertion to catch such violations of the rule. (Untested, but > I think the assertion should hold true.) > > Kevin > > diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c > index e4860ae..25fc111 100644 > --- a/qemu-coroutine-lock.c > +++ b/qemu-coroutine-lock.c > @@ -123,9 +123,8 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex) > > trace_qemu_co_mutex_lock_entry(mutex, self); > > -while (mutex->locked) { > -qemu_co_queue_wait(&mutex->queue); > - } > +qemu_co_queue_wait(&mutex->queue); > +assert(!mutex->locked); > > mutex->locked = true; > > . > I see. paolo's patch in his first reply can fix the bug in NBD (not in coroutine). I will complete that patch to do some tests and send another version latter. -- Bin Wu
Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue
On 2015/2/10 11:16, Wen Congyang wrote: > On 02/09/2015 10:48 PM, Stefan Hajnoczi wrote: >> On Mon, Feb 09, 2015 at 02:50:39PM +0800, Bin Wu wrote: >>> From: Bin Wu >>> >>> We tested VMs migration with their disk images by drive_mirror. With >>> migration, two VMs copyed large files between each other. During the >>> test, a segfault occured. The stack was as follow: >>> >>> (gdb) bt >>> qemu-coroutine-lock.c:66 >>> to=0x7fa5a1798648) at qemu-coroutine.c:97 >>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at >>> block/nbd-client.c:165 >>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at >>> block/nbd-client.c:262 >>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at >>> block/nbd-client.c:296 >>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 >>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >>> flags=0) at block.c:3321 >>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) >>> at >>> block.c:3447 >>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >>> 0)) at >>> block.c:3471 >>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 >>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 >>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >>> flags=0) at block.c:3321 >>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) >>> at >>> block.c:3447 >>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >>> 0)) at >>> block.c:3471 >>> coroutine-ucontext.c:121 >> >> This backtrace is incomplete. Where are the function names? The >> parameter lists appear incomplete too. >> >>> After analyzing the stack and reviewing the code, we find the >>> qemu_co_queue_run_restart should not be put in the coroutine_swap function >>> which >>> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only >>> qemu_coroutine_enter needs to restart the co_queue. >>> >>> The error scenario is as follow: coroutine C1 enters C2, C2 yields >>> back to C1, then C1 ternimates and the related coroutine memory >>> becomes invalid. After a while, the C2 coroutine is entered again. >>> At this point, C1 is used as a parameter passed to >>> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart >>> accesses an invalid memory and a segfault error ocurrs. >>> >>> The qemu_co_queue_run_restart function re-enters coroutines waiting >>> in the co_queue. However, this function should be only used int the >>> qemu_coroutine_enter context. Only in this context, when the current >>> coroutine gets execution control again(after the execution of >>> qemu_coroutine_switch), we can restart the target coutine because the >>> target coutine has yielded back to the current coroutine or it has >>> terminated. >>> >>> First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter, >>> but we find we can not access the target coroutine if it terminates. >> >> This example captures the scenario you describe: >> >> diff --git a/qemu-coroutine.c b/qemu-coroutine.c >> index 525247b..883cbf5 100644 >> --- a/qemu-coroutine.c >> +++ b/qemu-coroutine.c >> @@ -103,7 +103,10 @@ static void coroutine_swap(Coroutine *from, Coroutine >> *to) >> { >> CoroutineAction ret; >> >> +fprintf(stderr, "> %s from %p to %p\n", __func__, from, to); >> ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD); >> +fprintf(stderr, "< %s from %p to %p switch %s\n", __func__, from, to, >> +ret == COROUTINE_YIELD ? 
"yield" : "terminate"); >> >> qemu_co_queue_run_restart(to); >> >> @@ -111,6 +114,7 @@ static void coroutine_swap(Coroutine *from, Coroutine >> *to) >> case COROUTINE_YIELD: >> return; >> case COROUTINE_TERMINATE: >> +fprintf(stderr, "coroutine_delete %p\n", to); >> trace_qemu_coroutine_terminate(to); >> coroutine_delete(to); >> return; >> diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c >> index 27d1b6f..d44c428 100644 >> --- a/tests/test-coroutine.c >> +++ b/tests/test-coroutine.c >> @@ -13,6 +13,7 @@ >> >&g
[Qemu-devel] [PATCH v3] qemu-coroutine: segfault when restarting co_queue
From: Bin Wu We tested VM migration with their disk images by drive_mirror. With migration, two VMs copied large files between each other. During the test, a segfault occurred. The stack was as follows: 00) 0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at qemu-coroutine-lock.c:66 01) 0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160, to=0x7fa5a1798648) at qemu-coroutine.c:97 02) 0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140 03) 0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0, request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at block/nbd-client.c:165 04) 0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at block/nbd-client.c:262 05) 0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd-client.c:296 06) 0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 07) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0, req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 08) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 09) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 10) 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 11) 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 12) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110, req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=0) at block.c:3321 13) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3447 14) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110, sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at block.c:3471 15) 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706 16) 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at coroutine-ucontext.c:121 17) 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6 18) 0x in ?? () After analyzing the stack and reviewing the code, we find that qemu_co_queue_run_restart should not be put in the coroutine_swap function, which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only qemu_coroutine_enter needs to restart the co_queue. The error scenario is as follows: coroutine C1 enters C2, C2 yields back to C1, then C1 terminates and the related coroutine memory becomes invalid. After a while, the C2 coroutine is entered again. At this point, C1 is used as a parameter passed to qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart accesses invalid memory and a segfault occurs. The qemu_co_queue_run_restart function re-enters coroutines waiting in the co_queue. However, this function should only be used in the qemu_coroutine_enter context. 
Only in this context, when the current coroutine gets execution control again (after the execution of qemu_coroutine_switch), we can restart the target coroutine because the target coroutine has yielded back to the current coroutine or it has terminated. At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter, but we found we cannot access the target coroutine if it terminates. Signed-off-by: Bin Wu --- v3: add a test case in coroutine test code provided by Stefan. v2: fix undefined variable. --- qemu-coroutine.c | 17 +++-- tests/test-coroutine.c | 31 +++ 2 files changed, 42 insertions(+), 6 deletions(-) diff --git a/qemu-coroutine.c b/qemu-coroutine.c index 525247b..c8e280d 100644 --- a/qemu-coroutine.c +++ b/qemu-coroutine.c @@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co) qemu_coroutine_delete(co); } -static void coroutine_swap(Coroutine *from, Coroutine *to) +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to) { CoroutineAction ret; ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD); -qemu_co_queue_run_restart(to); - switch (ret) { case COROUTINE_YIELD: -return; +break; case COROUTINE_TERMINATE: trace_qemu_coroutine_terminate(to); +qemu_co_queue_run_restart(to); coroutine_delete(to); -return; +break; default: abort(); } + +return ret; } void qemu_coroutine_enter(Coroutine *co, void *opaque
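[Editor's note] The patch is cut off in the archive right where qemu_coroutine_enter() starts, and the tests/test-coroutine.c hunk listed in the diffstat is missing entirely. As a reconstruction for illustration only (not the exact remainder of the patch), the caller side would consume the CoroutineAction now returned by coroutine_swap() roughly like this; the co->caller / co->entry_arg bookkeeping is the pre-existing qemu-coroutine.c code of that era:

/* Sketch of the caller side (reconstruction, not the actual hunk). */
void qemu_coroutine_enter(Coroutine *co, void *opaque)
{
    Coroutine *self = qemu_coroutine_self();

    trace_qemu_coroutine_enter(self, co, opaque);

    if (co->caller) {
        fprintf(stderr, "Co-routine re-entered recursively\n");
        abort();
    }

    co->caller = self;
    co->entry_arg = opaque;

    /* Restart queued waiters only when the target yielded and is therefore
     * still alive; the TERMINATE case is handled inside coroutine_swap()
     * before coroutine_delete() releases the target. */
    if (coroutine_swap(self, co) == COROUTINE_YIELD) {
        qemu_co_queue_run_restart(co);
    }
}

The missing test case can only be guessed at, but a minimal test of the scenario described in the commit message (C1 enters C2, C2 yields, C1 terminates, C2 is re-entered later) might look like the sketch below; on an unfixed tree the second qemu_coroutine_enter() would make the old coroutine_swap() call qemu_co_queue_run_restart() on the already-released C1. Function names here are invented for illustration.

#include "block/coroutine.h"   /* qemu/coroutine.h in newer trees */

static Coroutine *c2;

static void coroutine_fn c2_fn(void *opaque)
{
    qemu_coroutine_yield();          /* switch back to C1 */
    /* resumed here on re-entry; the buggy code then touches the dead C1 */
}

static void coroutine_fn c1_fn(void *opaque)
{
    c2 = qemu_coroutine_create(c2_fn);
    qemu_coroutine_enter(c2, NULL);  /* run C2 until it yields back */
    /* returning here terminates C1 and releases its memory */
}

static void test_co_queue_reenter(void)
{
    Coroutine *c1 = qemu_coroutine_create(c1_fn);

    qemu_coroutine_enter(c1, NULL);  /* C1 runs, enters C2, then terminates */
    qemu_coroutine_enter(c2, NULL);  /* re-enter C2: use-after-free on old code */
}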
Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug
On 2015/2/9 17:23, Paolo Bonzini wrote: > > > On 07/02/2015 10:51, w00214312 wrote: >> From: Bin Wu >> >> When we test the drive_mirror between different hosts by NBD devices, >> we find that, during the cancel phase, the qemu process sometimes crashes. >> By checking the crash core file, we find the stack as follows, which means >> a coroutine re-enter error occurs: > > This bug probably can be fixed simply by delaying the setting of > recv_coroutine. > > What are the symptoms if you only apply your "qemu-coroutine-lock: fix > co_queue multi-adding bug" patch but not "qemu-coroutine: fix > qemu_co_queue_run_restart error"? These two patches are used to solve two different problems: -"qemu-coroutine-lock: fix co_queue multi-adding bug" solves the coroutine re-enter problem, which occurs when we send a cancel command just after drive_mirror has started. -"qemu-coroutine: fix qemu_co_queue_run_restart error" solves the segfault problem during the drive_mirror phase of two VMs which copy large files between each other. > > Can you try the patch below? (Compile-tested only). > > diff --git a/block/nbd-client.c b/block/nbd-client.c > index 6e1c97c..23d6a71 100644 > --- a/block/nbd-client.c > +++ b/block/nbd-client.c > @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s, > QEMUIOVector *qiov, int offset) > { > AioContext *aio_context; > -int rc, ret; > +int rc, ret, i; > > qemu_co_mutex_lock(&s->send_mutex); > + > +for (i = 0; i < MAX_NBD_REQUESTS; i++) { > +if (s->recv_coroutine[i] == NULL) { > +s->recv_coroutine[i] = qemu_coroutine_self(); > +break; > +} > +} > + > +assert(i < MAX_NBD_REQUESTS); > +request->handle = INDEX_TO_HANDLE(s, i); > s->send_coroutine = qemu_coroutine_self(); > + > aio_context = bdrv_get_aio_context(s->bs); > aio_set_fd_handler(aio_context, s->sock, > nbd_reply_ready, nbd_restart_write, s); > @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s, > static void nbd_coroutine_start(NbdClientSession *s, > struct nbd_request *request) > { > -int i; > - > /* Poor man semaphore. The free_sema is locked when no other request > * can be accepted, and unlocked after receiving one reply. */ > if (s->in_flight >= MAX_NBD_REQUESTS - 1) { > @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s, > } > s->in_flight++; > > -for (i = 0; i < MAX_NBD_REQUESTS; i++) { > -if (s->recv_coroutine[i] == NULL) { > -s->recv_coroutine[i] = qemu_coroutine_self(); > -break; > -} > -} > - > -assert(i < MAX_NBD_REQUESTS); > -request->handle = INDEX_TO_HANDLE(s, i); > +/* s->recv_coroutine[i] is set as soon as we get the send_lock. */ > } > > static void nbd_coroutine_end(NbdClientSession *s, > > > -- Bin Wu
[Qemu-devel] [PATCH v2] nbd: fix the co_queue multi-adding bug
From: Bin Wu When we tested the VM migration between different hosts with NBD devices, we found that if we sent a cancel command after the drive_mirror was just started, a coroutine re-enter error would occur. The stack was as follows: (gdb) bt 00) 0x7fdfc744d885 in raise () from /lib64/libc.so.6 01) 0x7fdfc744ee61 in abort () from /lib64/libc.so.6 02) 0x7fdfca467cc5 in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0) at qemu-coroutine.c:118 03) 0x7fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedb400) at qemu-coroutine-lock.c:59 04) 0x7fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8, to=0x7fdfcaedb400) at qemu-coroutine.c:96 05) 0x7fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0) at qemu-coroutine.c:123 06) 0x7fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedbdc0) at qemu-coroutine-lock.c:59 07) 0x7fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8, to=0x7fdfcaedbdc0) at qemu-coroutine.c:96 08) 0x7fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedbdc0, opaque=0x0) at qemu-coroutine.c:123 09) 0x7fdfca4a1fa4 in nbd_recv_coroutines_enter_all (s=0x7fdfcaef7dd0) at block/nbd-client.c:41 10) 0x7fdfca4a1ff9 in nbd_teardown_connection (client=0x7fdfcaef7dd0) at block/nbd-client.c:50 11) 0x7fdfca4a20f0 in nbd_reply_ready (opaque=0x7fdfcaef7dd0) at block/nbd-client.c:92 12) 0x7fdfca45ed80 in aio_dispatch (ctx=0x7fdfcae15e90) at aio-posix.c:144 13) 0x7fdfca45ef1b in aio_poll (ctx=0x7fdfcae15e90, blocking=false) at aio-posix.c:222 14) 0x7fdfca448c34 in aio_ctx_dispatch (source=0x7fdfcae15e90, callback=0x0, user_data=0x0) at async.c:212 15) 0x7fdfc8f2f69a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 16) 0x7fdfca45c391 in glib_pollfds_poll () at main-loop.c:190 17) 0x7fdfca45c489 in os_host_main_loop_wait (timeout=1483677098) at main-loop.c:235 18) 0x7fdfca45c57b in main_loop_wait (nonblocking=0) at main-loop.c:484 19) 0x7fdfca25f403 in main_loop () at vl.c:2249 20) 0x7fdfca266fc2 in main (argc=42, argv=0x7517d638, envp=0x7517d790) at vl.c:4814 We find that the nbd_recv_coroutines_enter_all function (triggered by a cancel command or a network connection breaking down) will enter a coroutine which is waiting for the sending lock. If the lock is still held by another coroutine, the entering coroutine will be added into the co_queue again. Later, when the lock is released, a coroutine re-enter error will occur. This bug can be fixed simply by delaying the setting of recv_coroutine, as suggested by Paolo. After applying this patch, we tested the cancel operation in the mirror phase in a loop for more than 5 hours and everything was fine. Without this patch, a coroutine re-enter error occurs within 5 minutes. Signed-off-by: Bin Wu --- v2: fix the coroutine re-enter bug in NBD code, not in coroutine infrastructure, as suggested by Paolo and Kevin. 
--- block/nbd-client.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 28bfb62..1b0f7c7 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s, QEMUIOVector *qiov, int offset) { AioContext *aio_context; -int rc, ret; +int rc, ret, i; qemu_co_mutex_lock(&s->send_mutex); + +for (i = 0; i < MAX_NBD_REQUESTS; i++) { +if (s->recv_coroutine[i] == NULL) { +s->recv_coroutine[i] = qemu_coroutine_self(); +break; +} +} + +assert(i < MAX_NBD_REQUESTS); +request->handle = INDEX_TO_HANDLE(s, i); s->send_coroutine = qemu_coroutine_self(); + aio_context = bdrv_get_aio_context(s->bs); aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, nbd_restart_write, s); @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s, static void nbd_coroutine_start(NbdClientSession *s, struct nbd_request *request) { -int i; - /* Poor man semaphore. The free_sema is locked when no other request * can be accepted, and unlocked after receiving one reply. */ if (s->in_flight >= MAX_NBD_REQUESTS - 1) { @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s, } s->in_flight++; -for (i = 0; i < MAX_NBD_REQUESTS; i++) { -if (s->recv_coroutine[i] == NULL) { -s->recv_coroutine[i] = qemu_coroutine_self(); -break; -} -} - -assert(i < MAX_NBD_REQUESTS); -request->handle = INDEX_TO_HANDLE(s, i); +/* s->recv_coroutine[i] is set as soon as we get the send_lock. */ } static void nbd_coroutine_end(NbdClientSession *s, -- 1.7.12.4
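[Editor's note] For context, the function named in the analysis above, nbd_recv_coroutines_enter_all(), looked roughly as sketched below at the time (an approximation reconstructed from the backtrace and the structures used in the diff, not a verbatim copy of block/nbd-client.c): it unconditionally enters every registered recv_coroutine, including one that is actually parked on the send_mutex, which is what puts that coroutine back on the mutex's co_queue and later triggers the re-enter abort.

/* Approximate shape of the teardown helper (reconstruction).
 * Entering a coroutine that is really blocked on s->send_mutex makes it
 * re-queue itself on the mutex's co_queue; when the mutex is later
 * released it is entered a second time and qemu_coroutine_enter() aborts.
 */
static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
    int i;

    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
        if (s->recv_coroutine[i]) {
            qemu_coroutine_enter(s->recv_coroutine[i], NULL);
        }
    }
}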
Re: [Qemu-devel] [PATCH v3] qemu-coroutine: segfault when restarting co_queue
On 2015/2/10 18:32, Kevin Wolf wrote: > Am 10.02.2015 um 06:16 hat Bin Wu geschrieben: >> From: Bin Wu >> >> We tested VM migration with their disk images by drive_mirror. With >> migration, two VMs copied large files between each other. During the >> test, a segfault occurred. The stack was as follows: >> >> 00) 0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at >> qemu-coroutine-lock.c:66 >> 01) 0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160, >> to=0x7fa5a1798648) at qemu-coroutine.c:97 >> 02) 0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140 >> 03) 0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0, >> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at >> block/nbd-client.c:165 >> 04) 0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0, >> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at >> block/nbd-client.c:262 >> 05) 0x7fa5a0c9e9dd in nbd_client_session_co_writev >> (client=0x7fa5a1a3cfd0, >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at >> block/nbd-client.c:296 >> 06) 0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, >> sector_num=8552704, >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291 >> 07) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0, >> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> 08) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0, >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> 09) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0, >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> 10) 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, >> sector_num=8552704, >> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480 >> 11) 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, >> sector_num=8552704, >> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62 >> 12) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110, >> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, >> flags=0) at block.c:3321 >> 13) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110, >> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at >> block.c:3447 >> 14) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110, >> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: >> 0)) at >> block.c:3471 >> 15) 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at >> block.c:4706 >> 16) 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at >> coroutine-ucontext.c:121 >> 17) 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from >> /lib64/libc.so.6 >> 18) 0x in ?? () >> >> After analyzing the stack and reviewing the code, we find that >> qemu_co_queue_run_restart should not be put in the coroutine_swap function, >> which >> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only >> qemu_coroutine_enter needs to restart the co_queue. >> >> The error scenario is as follows: coroutine C1 enters C2, C2 yields >> back to C1, then C1 terminates and the related coroutine memory >> becomes invalid. After a while, the C2 coroutine is entered again. >> At this point, C1 is used as a parameter passed to >> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart >> accesses invalid memory and a segfault occurs. 
>> >> The qemu_co_queue_run_restart function re-enters coroutines waiting >> in the co_queue. However, this function should only be used in the >> qemu_coroutine_enter context. Only in this context, when the current >> coroutine gets execution control again (after the execution of >> qemu_coroutine_switch), we can restart the target coroutine because the >> target coroutine has yielded back to the current coroutine or it has >> terminated. >> >> At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter, >> but we found we cannot access the target coroutine if it terminates. >> >> Signed-off-by: Bin Wu > > As I replied for v2, I'll send a series with a simpler fix and some > cleanup that should result in a nicer design in the end. > > Kevin > > . > All right. -- Bin Wu
[Qemu-devel] [PATCH] mirror: hold aio_context before bdrv_drain
From: Bin Wu Signed-off-by: Bin Wu --- block/mirror.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/block/mirror.c b/block/mirror.c index 4056164..08372df 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -530,7 +530,9 @@ static void coroutine_fn mirror_run(void *opaque) * mirror_populate runs. */ trace_mirror_before_drain(s, cnt); +aio_context_acquire(bdrv_get_aio_context(bs)); bdrv_drain(bs); +aio_context_release(bdrv_get_aio_context(bs)); cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); } -- 1.7.12.4
Re: [Qemu-devel] [PATCH] mirror: hold aio_context before bdrv_drain
On 2015/4/1 16:19, Fam Zheng wrote: > On Wed, 04/01 12:42, Bin Wu wrote: >> From: Bin Wu > > What's the issue you are fixing? I think the coroutine is already running in > the AioContext of bs. > > Fam > In the current implementation, bdrv_drain should be called inside a critical section, as suggested in the comment above the function: "Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState AioContext". However, the mirror coroutine starting with mirror_run doesn't do this. I just found that qmp_drive_mirror protects the AioContext, but that is outside the scope of the mirror coroutine. Bin >> >> Signed-off-by: Bin Wu >> --- >> block/mirror.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/block/mirror.c b/block/mirror.c >> index 4056164..08372df 100644 >> --- a/block/mirror.c >> +++ b/block/mirror.c >> @@ -530,7 +530,9 @@ static void coroutine_fn mirror_run(void *opaque) >> * mirror_populate runs. >> */ >> trace_mirror_before_drain(s, cnt); >> +aio_context_acquire(bdrv_get_aio_context(bs)); >> bdrv_drain(bs); >> +aio_context_release(bdrv_get_aio_context(bs)); >> cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap); >> } >> >> -- >> 1.7.12.4 >> >> >> > > . >
Re: [Qemu-devel] [Qemu-block] [PATCH] mirror: hold aio_context before bdrv_drain
On 2015/4/1 19:59, Stefan Hajnoczi wrote: > On Wed, Apr 01, 2015 at 04:49:39PM +0800, Bin Wu wrote: >> >> On 2015/4/1 16:19, Fam Zheng wrote: >>> On Wed, 04/01 12:42, Bin Wu wrote: >>>> From: Bin Wu >>> >>> What's the issue you are fixing? I think the coroutine is already running in >>> the AioContext of bs. >>> >>> Fam >>> >> In the current implementation, bdrv_drain should be called inside a >> critical >> section, as suggested in the comment above the function: "Note that unlike >> bdrv_drain_all(), the caller must hold the BlockDriverState AioContext". >> >> However, the mirror coroutine starting with mirror_run doesn't do this. I >> just >> found that qmp_drive_mirror protects the AioContext, but that is outside the scope of >> the mirror coroutine. > > There are 3 possibilities: > > 1. qmp_drive_mirror() under QEMU main loop thread. AioContext is held. > > 2. IOThread aio_poll(). AioContext is held. > > 3. QEMU main loop thread when IOThread (dataplane) is not used. Here > the AioContext is the global qemu_aio_context. We don't need to > acquire it explicitly, we're protected by the global mutex. > > If #3 were a problem then virtio-blk.c's bdrv_aio_writev() would also be > a problem, for example. > > This patch is unnecessary. > > Stefan > OK, I see. Thanks -- Bin Wu
Re: [Qemu-devel] [PATCH] virtio-blk: correctly dirty guest memory
++) { -qemu_iovec_add(qiov, tmp_iov[i].iov_base, tmp_iov[i].iov_len); -} +/* use the 'in' field to judge whether the request is + a merged request */ +merged_request->in = NULL; + +qiov = &merged_request->qiov; +qemu_iovec_init(qiov, niov); -for (i = start + 1; i < start + num_reqs; i++) { +for (i = start; i < start + num_reqs; i++) { qemu_iovec_concat(qiov, &mrb->reqs[i]->qiov, 0, mrb->reqs[i]->qiov.size); -mrb->reqs[i - 1]->mr_next = mrb->reqs[i]; +if (i > start) { +mrb->reqs[i - 1]->mr_next = mrb->reqs[i]; +} else { +merged_request->mr_next = mrb->reqs[i]; +} + nb_sectors += mrb->reqs[i]->qiov.size / BDRV_SECTOR_SIZE; } assert(nb_sectors == qiov->size / BDRV_SECTOR_SIZE); @@ -345,14 +348,18 @@ static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb, block_acct_merge_done(blk_get_stats(blk), is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ, num_reqs - 1); +} else { +merged_request = mrb->reqs[start]; +qiov = &mrb->reqs[start]->qiov; +nb_sectors = mrb->reqs[start]->qiov.size / BDRV_SECTOR_SIZE; } if (is_write) { blk_aio_writev(blk, sector_num, qiov, nb_sectors, - virtio_blk_rw_complete, mrb->reqs[start]); + virtio_blk_rw_complete, merged_request); } else { blk_aio_readv(blk, sector_num, qiov, nb_sectors, - virtio_blk_rw_complete, mrb->reqs[start]); + virtio_blk_rw_complete, merged_request); } } -- Bin Wu
[Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?
Hi, When an IO error happens in the physical device, the qemu block layer supports error reporting, error ignoring and error stopping (for example, virtio-blk). Is there any way to resend the failed IO? Thanks -- Bin Wu
Re: [Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?
On 2014/12/25 10:42, Fam Zheng wrote: > On Thu, 12/25 09:57, Bin Wu wrote: >> Hi, >> >> When an IO error happens in the physical device, the qemu block layer supports error >> reporting, error ignoring and error stopping (for example, virtio-blk). Is there >> any way to resend the failed IO? > > With error stop, the request is retried after resume. > > Fam > > Thank you very much, Fam, I see. Another question: I think error stop is not the default error handling strategy; how can we configure error stop in the VM XML file? Could you show me an example? Thanks again :> -- Bin Wu
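[Editor's note] To make the "error stop" behaviour Fam describes concrete: at the qemu level it corresponds to the werror/rerror options of -drive, which accept values such as report, ignore and stop; with stop, the guest is paused on an I/O error and the failed request is retried when the VM is resumed (e.g. with the monitor "cont" command). The invocation below is only an illustration, and the image name is made up:

qemu-system-x86_64 ... -drive file=test.qcow2,if=virtio,werror=stop,rerror=stop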
Re: [Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?
On 2014/12/25 15:19, Fam Zheng wrote: > On Thu, 12/25 11:46, Bin Wu wrote: >> On 2014/12/25 10:42, Fam Zheng wrote: >>> On Thu, 12/25 09:57, Bin Wu wrote: >>>> Hi, >>>> >>>> When an IO error happens in the physical device, the qemu block layer supports error >>>> reporting, error ignoring and error stopping (for example, virtio-blk). Is >>>> there >>>> any way to resend the failed IO? >>> >>> With error stop, the request is retried after resume. >>> >>> Fam >>> >>> >> >> Thank you very much, Fam, I see. Another question: I think error stop is not >> the >> default error handling strategy; how can we configure error stop in the VM >> XML >> file? Could you show me an example? Thanks again :> > > This is a question for libvirt, look for "error_policy": > > https://libvirt.org/formatdomain.html > > Fam > > Yes, ... ... -- Bin Wu
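[Editor's note] In the libvirt domain XML that Fam points to, the error_policy (write errors) and rerror_policy (read errors) attributes sit on the disk's <driver> element. The snippet below is a hedged illustration only, with an invented image path; see formatdomain.html for the authoritative syntax:

<!-- Illustrative only: a disk whose read and write errors pause the guest
     so the request can be retried after the domain is resumed. -->
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' error_policy='stop' rerror_policy='stop'/>
  <source file='/var/lib/libvirt/images/vm1.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>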