[Qemu-devel] the whole virtual machine hangs when IO does not come back!

2014-08-11 Thread Bin Wu

Hi,

I tested the reliability of QEMU in an IPSAN environment as follows:
(1) create one VM on an x86 server connected to the IPSAN; the VM has only one
system volume, which resides on the IPSAN;
(2) disconnect the network between the server and the IPSAN. On the server I
have multipath software that can hold the I/O for a long (configurable) time
when the network is disconnected;
(3) about 30 seconds later, the whole VM hangs; nothing can be done with it.


Then I used the "gstack" tool to collect the stacks of all QEMU threads; they
looked like this:


Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
#0  0x7fd84253a4f6 in poll () from /lib64/libc.so.6
#1  0x7fd84410ceff in aio_poll ()
#2  0x7fd84429bb05 in qemu_aio_wait ()
#3  0x7fd844120f51 in bdrv_drain_all ()
#4  0x7fd8441f1a4a in bmdma_cmd_writeb ()
#5  0x7fd8441f216e in bmdma_write ()
#6  0x7fd8443a93cf in memory_region_write_accessor ()
#7  0x7fd8443a94a6 in access_with_adjusted_size ()
#8  0x7fd8443a9901 in memory_region_iorange_write ()
#9  0x7fd8443a19bd in ioport_writeb_thunk ()
#10 0x7fd8443a13a8 in ioport_write ()
#11 0x7fd8443a1f55 in cpu_outb ()
#12 0x7fd8443a5b12 in kvm_handle_io ()
#13 0x7fd8443a64a9 in kvm_cpu_exec ()
#14 0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#15 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#16 0x7fd8425439cd in clone () from /lib64/libc.so.6
#17 0x in ?? ()

Thread 7 (Thread 0x7fd8403b4700 (LWP 6672)):
#0  0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x7fd8444526bd in qemu_mutex_lock ()
#4  0x7fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x7fd8443a63b9 in kvm_cpu_exec ()
#6  0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x7fd8425439cd in clone () from /lib64/libc.so.6
#9  0x in ?? ()

Thread 6 (Thread 0x7fd83fbb3700 (LWP 6673)):
#0  0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x7fd8444526bd in qemu_mutex_lock ()
#4  0x7fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x7fd8443a63b9 in kvm_cpu_exec ()
#6  0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x7fd8425439cd in clone () from /lib64/libc.so.6
#9  0x in ?? ()

Thread 5 (Thread 0x7fd83f3b2700 (LWP 6674)):
#0  0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x7fd8444526bd in qemu_mutex_lock ()
#4  0x7fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x7fd8443a63b9 in kvm_cpu_exec ()
#6  0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x7fd8425439cd in clone () from /lib64/libc.so.6
#9  0x in ?? ()

Thread 4 (Thread 0x7fd83ebb1700 (LWP 6675)):
#0  0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x7fd8444526bd in qemu_mutex_lock ()
#4  0x7fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x7fd8443a63b9 in kvm_cpu_exec ()
#6  0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x7fd8425439cd in clone () from /lib64/libc.so.6
#9  0x in ?? ()

Thread 3 (Thread 0x7fd83e3b0700 (LWP 6676)):
#0  0x7fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x7fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x7fd8444526bd in qemu_mutex_lock ()
#4  0x7fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x7fd8443a63b9 in kvm_cpu_exec ()
#6  0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x7fd8425439cd in clone () from /lib64/libc.so.6
#9  0x in ?? ()

Thread 2 (Thread 0x7fd23b7ff700 (LWP 6679)):
#0  0x7fd8427eb61c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x7fd8444528f0 in qemu_cond_wait ()
#2  0x7fd844312d9d in vnc_worker_thread_loop ()
#3  0x7fd844313315 in vnc_worker_thread ()
#4  0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#5  0x7fd8425439cd in clone () from /lib64/libc.so.6
#6  0x i


Re: [Qemu-devel] the whole virtual machine hangs when IO does not come back!

2014-08-11 Thread Bin Wu


On 2014/8/11 22:21, Stefan Hajnoczi wrote:

On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote:

Hi,

I tested the reliability of qemu in the IPSAN environment as follows:
(1) create one VM on a X86 server which is connected to an IPSAN, and the VM
has only one system volume which is on the IPSAN;
(2) disconnect the network between the server and the IPSAN. On the server,
I have a "multipath" software which can hold the IO for a long time
(configurable) when the network is disconnected;
(3) about 30 seconds later, the whole VM hangs there, nothing can be done to
the VM!

Then, I used "gstack" tool to collect the stacks of all qemu threads, it
looked like:

Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
#0  0x7fd84253a4f6 in poll () from /lib64/libc.so.6
#1  0x7fd84410ceff in aio_poll ()
#2  0x7fd84429bb05 in qemu_aio_wait ()
#3  0x7fd844120f51 in bdrv_drain_all ()
#4  0x7fd8441f1a4a in bmdma_cmd_writeb ()
#5  0x7fd8441f216e in bmdma_write ()
#6  0x7fd8443a93cf in memory_region_write_accessor ()
#7  0x7fd8443a94a6 in access_with_adjusted_size ()
#8  0x7fd8443a9901 in memory_region_iorange_write ()
#9  0x7fd8443a19bd in ioport_writeb_thunk ()
#10 0x7fd8443a13a8 in ioport_write ()
#11 0x7fd8443a1f55 in cpu_outb ()
#12 0x7fd8443a5b12 in kvm_handle_io ()
#13 0x7fd8443a64a9 in kvm_cpu_exec ()
#14 0x7fd844330962 in qemu_kvm_cpu_thread_fn ()
#15 0x7fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#16 0x7fd8425439cd in clone () from /lib64/libc.so.6
#17 0x in ?? ()

Use virtio-blk.  Read, write, and flush are asynchronous in virtio-blk.

Note that the QEMU monitor commands are typically synchronous so they
will still block the VM.

Stefan
Thank you for your attention. I tested virtio-blk and it's true that the
VM doesn't hang.
Why does virtio-blk implement this in an asynchronous way, but virtio-scsi in a
synchronous way?




[Qemu-devel] Does the event idx mechanism in virtio work in the correct way?

2014-10-14 Thread Bin Wu

Hi,
The event idx in virtio is an effective way to reduce the number of interrupts
and exits of the guest. When the guest puts a request into the virtio ring, it
doesn't exit immediately to inform the backend. Instead, the guest checks the
"avail" event idx to decide whether to notify. For example, assume the guest has
already put five requests in the ring and the backend has taken two of them; the
"avail" event idx is then two. Now the guest puts another request into the ring.
It checks the "avail" event idx and finds that the event idx (equal to 2) is less
than the number of available requests (equal to 6), so it does not notify the
backend. The backend works in a loop, taking requests from the ring until the
ring is empty or all requests have been taken.
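
As a side note, the guest-side check defined by the virtio spec is slightly more
involved than a plain comparison; here is a minimal sketch modeled on the spec's
vring_need_event() (illustrative only, not QEMU code):

#include <stdint.h>

/* Guest-side notification check used with VIRTIO_RING_F_EVENT_IDX.
 * All indices are free-running 16-bit counters, so the comparison is
 * done with wrap-around-safe 16-bit arithmetic. */
static inline int need_notify(uint16_t event_idx, uint16_t new_idx,
                              uint16_t old_idx)
{
    /* notify only if event_idx lies in the window (old_idx, new_idx] */
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

For the example above (avail event idx 2, avail index going from 5 to 6),
need_notify(2, 6, 5) evaluates to false, so the guest does not notify the backend.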

However, I found some code that doesn't work the way I expected:

int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
{
    ...
    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
    if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
        vring_avail_event(vq, vring_avail_idx(vq));
    }
    ...
}

In the statement "vring_avail_event(vq, vring_avail_idx(vq));", I think the
"avail" event idx should equal the number of requests that have already been
taken (vq->last_avail_idx), not the number of all available requests
(vring_avail_idx(vq)). Is there any special consideration, or do I just
misunderstand the event idx?


thanks

Bin Wu






[Qemu-devel] [PATCH] hw/scsi/virtio-scsi.c: fix the "type" use error in virtio_scsi_handle_ctrl

2014-10-24 Thread Bin Wu
The local variable "type" in virtio_scsi_handle_ctrl represents the TMF
command type from the guest and has the same meaning as req->req.tmf.type.
However, before virtio_scsi_parse_req is invoked, req->req.tmf.type does not
yet have the correct value (it is still zero-initialized). Therefore, we need
to use the "type" variable to decide which case we are in.

Signed-off-by: Bin Wu 
---
 hw/scsi/virtio-scsi.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index a1725b8..5742d39 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -378,8 +378,8 @@ void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
 return;
 }
 
-virtio_tswap32s(vdev, &req->req.tmf.type);
-if (req->req.tmf.type == VIRTIO_SCSI_T_TMF) {
+virtio_tswap32s(vdev, &type);
+if (type == VIRTIO_SCSI_T_TMF) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlTMFReq),
 sizeof(VirtIOSCSICtrlTMFResp)) < 0) {
 virtio_scsi_bad_req();
@@ -387,8 +387,8 @@ void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
 r = virtio_scsi_do_tmf(s, req);
 }
 
-} else if (req->req.tmf.type == VIRTIO_SCSI_T_AN_QUERY ||
-req->req.tmf.type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
+} else if (type == VIRTIO_SCSI_T_AN_QUERY ||
+type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlANReq),
 sizeof(VirtIOSCSICtrlANResp)) < 0) {
 virtio_scsi_bad_req();
-- 
1.7.12.4
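
For context (not part of the patch): earlier in virtio_scsi_handle_ctrl_req, the
raw "type" field is read straight out of the guest buffer before the request is
fully parsed. A rough sketch of that preceding code, assuming the usual
iov_to_buf() pattern (the exact code may differ), looks like:

    uint32_t type;

    /* Copy only the leading 32-bit "type" field from the guest-supplied
     * control request; the rest of req->req is not filled in until
     * virtio_scsi_parse_req() runs, which is why req->req.tmf.type is
     * still zero at this point. */
    if (iov_to_buf(req->elem.out_sg, req->elem.out_num, 0,
                   &type, sizeof(type)) < sizeof(type)) {
        virtio_scsi_bad_req();
        return;
    }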




[Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error

2014-10-27 Thread Bin Wu
The event idx in virtio is an effective way to reduce the number of
interrupts and exits of the guest. When the guest puts a request
into the virtio ring, it doesn't exit immediately to inform the
backend. Instead, the guest checks the "avail" event idx to decide
whether to notify.

In virtqueue_pop, when a request is popped, the current avail event
idx should be set to vq->last_avail_idx.

Signed-off-by: Bin Wu 
---
 hw/virtio/virtio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2c236bf..013979a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
 i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
 if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-vring_avail_event(vq, vring_avail_idx(vq));
+vring_avail_event(vq, vq->last_avail_idx);
 }
 
 if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
-- 
1.7.12.4
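
To see why this reduces notifications, consider the guest-side check sketched
earlier in this digest, need_notify(event_idx, new_idx, old_idx), modeled on the
spec's vring_need_event(). Suppose the backend has popped requests up to index 3
while the guest has already published requests up to index 5, and the guest now
adds one more (old = 5, new = 6); the numbers are purely illustrative:

    /* old code: the published avail event is vring_avail_idx(vq) == 5 */
    need_notify(5, 6, 5);   /* (6-5-1)=0 < (6-5)=1 -> true, the guest kicks the host */

    /* with this patch: the published avail event is vq->last_avail_idx == 3 */
    need_notify(3, 6, 5);   /* (6-3-1)=2 < (6-5)=1 -> false, no kick while the host
                               is still draining the ring */

This is consistent with the notification counts reported in the follow-up
discussion below.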




Re: [Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error

2014-10-28 Thread Bin Wu
On 2014/10/28 13:32, Michael S. Tsirkin wrote:
> On Tue, Oct 28, 2014 at 02:13:02AM +0000, Bin Wu wrote:
>> The event idx in virtio is an effective way to reduce the number of
>> interrupts and exits of the guest. When the guest puts an request
>> into the virtio ring, it doesn't exit immediately to inform the
>> backend. Instead, the guest checks the "avail" event idx to determine
>> the notification.
>>
>> In virtqueue_pop, when a request is poped, the current avail event
>> idx should be set to the number of vq->last_avail_idx.
>>
>> Signed-off-by: Bin Wu 
> 
> Does this fix some observable bug? Improve efficiency for some workload?
>
It improves efficiency. I tested as follows:
(1) create a SUSE 11 SP3 VM with a virtio-blk test disk;
(2) use the libaio library to submit 32 I/Os concurrently (invoke io_submit 32
times) in the VM (a minimal sketch of such a test follows below);
(3) modify the virtio-blk driver in the guest to record the number of host
notifications.

Without this patch, virtio-blk notifies the host between 10 and 30 times;
with this patch, virtio-blk notifies the host only once during every test :)
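
A minimal sketch of such a test program, assuming libaio and O_DIRECT on the
virtio-blk test disk (the device path, request size and missing error handling
are all illustrative):

#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NR_REQS  32
#define REQ_SIZE 4096

int main(void)
{
    int fd = open("/dev/vdb", O_WRONLY | O_DIRECT);    /* hypothetical test disk */
    io_context_t ctx = 0;
    struct iocb iocbs[NR_REQS];
    struct io_event events[NR_REQS];
    void *buf;
    int i;

    posix_memalign(&buf, 4096, REQ_SIZE);              /* O_DIRECT needs aligned buffers */
    memset(buf, 0xab, REQ_SIZE);
    io_setup(NR_REQS, &ctx);

    /* queue 32 writes one by one, i.e. 32 separate io_submit() calls */
    for (i = 0; i < NR_REQS; i++) {
        struct iocb *iocbp = &iocbs[i];
        io_prep_pwrite(iocbp, fd, buf, REQ_SIZE, (long long)i * REQ_SIZE);
        io_submit(ctx, 1, &iocbp);
    }

    io_getevents(ctx, NR_REQS, NR_REQS, events, NULL); /* wait for all completions */
    io_destroy(ctx);
    close(fd);
    return 0;
}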

>> ---
>>  hw/virtio/virtio.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 2c236bf..013979a 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>>  
>>  i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
>>  if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
>> -vring_avail_event(vq, vring_avail_idx(vq));
>> +vring_avail_event(vq, vq->last_avail_idx);
>>  }
>>  
>>  if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
>> -- 
>> 1.7.12.4
> 
> .
> 

-- 
Bin Wu




[Qemu-devel] [PATCH v2] hw/virtio/vring/event_idx: fix the vring_avail_event error

2014-10-31 Thread Bin Wu
The event idx in virtio is an effective way to reduce the number of
interrupts and exits of the guest. When the guest puts a request
into the virtio ring, it doesn't exit immediately to inform the
backend. Instead, the guest checks the "avail" event idx to decide
whether to notify.

In virtqueue_pop, when a request is popped, the current avail event
idx should be set to vq->last_avail_idx.

Signed-off-by: Bin Wu 
---
v1 -> v2:
update the same code in hw/virtio/dataplane/vring.c (Stefan)
---
 hw/virtio/dataplane/vring.c | 8 ++++----
 hw/virtio/virtio.c          | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 372706a..61f6d83 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -352,10 +352,6 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 goto out;
 }
 
-if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-vring_avail_event(&vring->vr) = vring->vr.avail->idx;
-}
-
 i = head;
 do {
 if (unlikely(i >= num)) {
@@ -392,6 +388,10 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 
 /* On success, increment avail index. */
 vring->last_avail_idx++;
+if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
+vring_avail_event(&vring->vr) = vring->last_avail_idx;
+}
+
 return head;
 
 out:
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2c236bf..013979a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
 i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
 if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
-vring_avail_event(vq, vring_avail_idx(vq));
+vring_avail_event(vq, vq->last_avail_idx);
 }
 
 if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_INDIRECT) {
-- 
1.7.12.4




Re: [Qemu-devel] [PATCH] hw/virtio/virtio.c: fix the vring_avail_event error

2014-10-31 Thread Bin Wu
On 2014/10/31 0:48, Stefan Hajnoczi wrote:
> On Tue, Oct 28, 2014 at 02:13:02AM +0000, Bin Wu wrote:
>> The event idx in virtio is an effective way to reduce the number of
>> interrupts and exits of the guest. When the guest puts an request
>> into the virtio ring, it doesn't exit immediately to inform the
>> backend. Instead, the guest checks the "avail" event idx to determine
>> the notification.
>>
>> In virtqueue_pop, when a request is poped, the current avail event
>> idx should be set to the number of vq->last_avail_idx.
>>
>> Signed-off-by: Bin Wu 
>> ---
>>  hw/virtio/virtio.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 2c236bf..013979a 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -469,7 +469,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>>  
>>  i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
>>  if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
>> -vring_avail_event(vq, vring_avail_idx(vq));
>> +vring_avail_event(vq, vq->last_avail_idx);
>>  }
> 
> hw/virtio/dataplane/vring.c:
> 
>   if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
>   vring_avail_event(&vring->vr) = vring->vr.avail->idx;
>   }
> 
> This is equivalent to the virtio.c code you are replacing, so it should
> be updated too.
> 
> Stefan
> 
Thanks, Stefan. I'll send another patch later.
-- 
Bin Wu



Re: [Qemu-devel] [PATCH v5 00/22] block: Asynchronous request cancellation

2014-09-10 Thread Bin Wu

On 2014/9/10 13:59, Fam Zheng wrote:

v5: Fix IDE callback. (Paolo)
 Fix blkdebug. (Paolo)
 Drop the DMA fix which is independent of this series. (Paolo)
 Incorporate Yuan's patch on quorum_aio_cancel. (Benoît)
 Commit message wording fix. (Benoît)
 Rename qemu_aio_release to qemu_aio_unref. (Benoît)

v4: Drop AIOCBInfo.cancel.

This series adds a new block layer API:

   void bdrv_aio_cancel_async(BlockDriverAIOCB *acb);

And use it to emulate bdrv_aio_cancel.

The function is similar to bdrv_aio_cancel in that it cancels an AIO request,
but differs in that it doesn't block until the request is completely cancelled
or done.

More importantly, the completion callback, BlockDriverAIOCB.cb, is guaranteed
to be called, so that the cb can take care of resource releasing and status
reporting to guest, etc.

In the following work, scsi emulation code will be shifted to use the async
cancelling.

One major benefit is that when the guest tries to cancel a request that cannot
be cancelled easily (due to a throttled BlockDriverState, a lost connection, or
a large request queue), we don't need to block the whole VM with a busy loop,
which is how bdrv_aio_cancel is implemented now.
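
To make the behavioral difference concrete, a hypothetical device-side usage
could look like the sketch below (MyRequest and my_req_complete() are made-up
names, not part of the series):

typedef struct MyRequest {
    BlockDriverAIOCB *acb;
    /* ... guest-visible request state ... */
} MyRequest;

static void my_req_complete(MyRequest *req, int status);   /* hypothetical helper */

/* The normal AIO completion callback.  With bdrv_aio_cancel_async() it is
 * guaranteed to run eventually, either with the real result or with
 * -ECANCELED, so resource cleanup and guest status reporting live here. */
static void my_aio_cb(void *opaque, int ret)
{
    MyRequest *req = opaque;

    my_req_complete(req, ret);          /* ret == -ECANCELED if cancelled */
}

/* The cancel path: unlike bdrv_aio_cancel(), this does not busy-wait until
 * the request finishes; it just requests cancellation and returns. */
static void my_cancel(MyRequest *req)
{
    bdrv_aio_cancel_async(req->acb);
}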


First, I think this series is really useful. However, I tested the v4
series and found that a virtio-scsi disk (scsi-hd) was still blocked when the
I/O could not come back, because of
virtio_scsi_do_tmf->scsi_cancel_io->bdrv_aio_cancel. Can we just change
bdrv_aio_cancel to bdrv_aio_cancel_async to solve this problem?




A test case that is easy to reproduce is, throttle a scsi-disk to a very low
limit, for example 50 bps, then stress the guest block device with dd or fio.

Currently, the vm will quickly hang when it loses patience and send a tmf
command to cancel the request, at which point we will busy wait in
bdrv_aio_cancel, until the request is slowly spit out from throttled_reqs.

Later, we will change scsi device code to make this asynchronous, on top of
bdrv_aio_cancel_async.

Fam


Fam Zheng (21):
   ide/ahci: Check for -ECANCELED in aio callbacks
   block: Add refcnt in BlockDriverAIOCB
   block: Add bdrv_aio_cancel_async
   block: Drop bdrv_em_co_aiocb_info.cancel
   block: Convert bdrv_em_aiocb_info.cancel to .cancel_async
   thread-pool: Convert thread_pool_aiocb_info.cancel to cancel_async
   linux-aio: Convert laio_aiocb_info.cancel to .cancel_async
   dma: Convert dma_aiocb_info.cancel to .cancel_async
   iscsi: Convert iscsi_aiocb_info.cancel to .cancel_async
   archipelago: Drop archipelago_aiocb_info.cancel
   blkdebug: Drop blkdebug_aiocb_info.cancel
   blkverify: Drop blkverify_aiocb_info.cancel
   curl: Drop curl_aiocb_info.cancel
   qed: Drop qed_aiocb_info.cancel
   quorum: Convert quorum_aiocb_info.cancel to .cancel_async
   rbd: Drop rbd_aiocb_info.cancel
   sheepdog: Convert sd_aiocb_info.cancel to .cancel_async
   win32-aio: Drop win32_aiocb_info.cancel
   ide: Convert trim_aiocb_info.cancel to .cancel_async
   block: Drop AIOCBInfo.cancel
   block: Rename qemu_aio_release -> qemu_aio_unref

Liu Yuan (1):
   quorum: fix quorum_aio_cancel()

  block.c  | 69 
  block/archipelago.c  | 19 ++---
  block/blkdebug.c | 17 ++--
  block/blkverify.c| 21 +--
  block/curl.c | 16 ---
  block/iscsi.c| 23 
  block/linux-aio.c| 34 +++-
  block/qed.c  | 23 +---
  block/quorum.c   | 11 
  block/rbd.c  | 25 ++
  block/sheepdog.c | 54 -
  block/win32-aio.c| 18 ++---
  dma-helpers.c| 20 +++---
  hw/ide/ahci.c|  3 +++
  hw/ide/core.c| 26 --
  include/block/aio.h  |  7 +++--
  include/block/block.h|  1 +
  tests/test-thread-pool.c | 34 ++--
  thread-pool.c| 36 +++--
  19 files changed, 172 insertions(+), 285 deletions(-)






[Qemu-devel] [PATCH] qemu-coroutine: segfault when restarting co_queue

2015-02-08 Thread Bin Wu
From: Bin Wu 

We tested VM migration together with their disk images via drive_mirror. During
the migration, the two VMs copied large files between each other. During the
test, a segfault occurred. The stack was as follows:

(gdb) bt
#0  0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at
qemu-coroutine-lock.c:66
#1  0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160,
to=0x7fa5a1798648) at qemu-coroutine.c:97
#2  0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140
#3  0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0,
request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
block/nbd-client.c:165
#4  0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
block/nbd-client.c:262
#5  0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
block/nbd-client.c:296
#6  0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
#7  0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0,
req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
#8  0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
#9  0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
#10 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
#11 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
#12 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110,
req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
#13 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
#14 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
#15 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706
#16 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at
coroutine-ucontext.c:121
#17 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6
#18 0x in ?? ()

After analyzing the stack and reviewing the code, we find that
qemu_co_queue_run_restart should not be placed in the coroutine_swap function,
which can be invoked by both qemu_coroutine_enter and qemu_coroutine_yield. Only
qemu_coroutine_enter needs to restart the co_queue. The error scenario is
described in the following patch log.

Bin Wu (1):
  qemu-coroutine: fix qemu_co_queue_run_restart error

 qemu-coroutine.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

-- 
1.7.12.4





[Qemu-devel] [PATCH] qemu-coroutine: fix qemu_co_queue_run_restart error

2015-02-08 Thread Bin Wu
From: Bin Wu 

The error scenario is as follows: coroutine C1 enters C2, C2 yields
back to C1, then C1 terminates and the related coroutine memory
becomes invalid. After a while, the C2 coroutine is entered again.
At this point, C1 is passed as a parameter to
qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
accesses invalid memory and a segfault occurs.
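
A minimal sketch of that scenario, using the coroutine API as it appears in this
thread (the entry functions are illustrative):

static Coroutine *c2;

/* C2: yields back to its creator and is re-entered much later. */
static void coroutine_fn c2_entry(void *opaque)
{
    qemu_coroutine_yield();
    /* When some other code later re-enters C2, execution resumes inside C2's
     * earlier coroutine_swap() call, which (before this patch) then ran
     * qemu_co_queue_run_restart() on C1, whose memory was already freed. */
}

/* C1: creates and enters C2, then simply terminates. */
static void coroutine_fn c1_entry(void *opaque)
{
    c2 = qemu_coroutine_create(c2_entry);
    qemu_coroutine_enter(c2, NULL);
}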

The qemu_co_queue_run_restart function re-enters coroutines waiting
in the co_queue. However, this function should only be used in the
qemu_coroutine_enter context. Only in this context, when the current
coroutine gets execution control again (after the execution of
qemu_coroutine_switch), can we restart the target coroutine, because the
target coroutine has either yielded back to the current coroutine or has
terminated.

At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
but we found we cannot access the target coroutine there if it has terminated.

Signed-off-by: Bin Wu 
---
 qemu-coroutine.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 525247b..9a294c4 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -99,24 +99,25 @@ static void coroutine_delete(Coroutine *co)
 qemu_coroutine_delete(co);
 }
 
-static void coroutine_swap(Coroutine *from, Coroutine *to)
+static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
 {
 CoroutineAction ret;
 
 ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
 
-qemu_co_queue_run_restart(to);
-
 switch (ret) {
 case COROUTINE_YIELD:
-return;
+break;
 case COROUTINE_TERMINATE:
 trace_qemu_coroutine_terminate(to);
+qemu_co_queue_run_restart(to);
 coroutine_delete(to);
-return;
+break;
 default:
 abort();
 }
+
+return ret;
 }
 
 void qemu_coroutine_enter(Coroutine *co, void *opaque)
@@ -133,6 +134,8 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
 co->caller = self;
 co->entry_arg = opaque;
 coroutine_swap(self, co);
+if (ret == COROUTINE_YIELD)
+qemu_co_queue_run_restart(co);
 }
 
 void coroutine_fn qemu_coroutine_yield(void)
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH] qemu-coroutine: fix qemu_co_queue_run_restart error

2015-02-08 Thread Bin Wu
Sorry, there is a mistake in this patch: the "ret" variable is not
defined :<

I will send a new patch to fix this problem.

On 2015/2/9 12:09, Bin Wu wrote:
> From: Bin Wu 
> 
> The error scenario is as follow: coroutine C1 enters C2, C2 yields
> back to C1, then C1 ternimates and the related coroutine memory
> becomes invalid. After a while, the C2 coroutine is entered again.
> At this point, C1 is used as a parameter passed to
> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
> accesses an invalid memory and a segfault error ocurrs.
> 
> The qemu_co_queue_run_restart function re-enters coroutines waiting
> in the co_queue. However, this function should be only used int the
> qemu_coroutine_enter context. Only in this context, when the current
> coroutine gets execution control again(after the execution of
> qemu_coroutine_switch), we can restart the target coutine because the
> target coutine has yielded back to the current coroutine or it has
> terminated.
> 
> First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter,
> but we find we can not access the target coroutine if it terminates.
> 
> Signed-off-by: Bin Wu 
> ---
>  qemu-coroutine.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
> index 525247b..9a294c4 100644
> --- a/qemu-coroutine.c
> +++ b/qemu-coroutine.c
> @@ -99,24 +99,25 @@ static void coroutine_delete(Coroutine *co)
>  qemu_coroutine_delete(co);
>  }
>  
> -static void coroutine_swap(Coroutine *from, Coroutine *to)
> +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
>  {
>  CoroutineAction ret;
>  
>  ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
>  
> -qemu_co_queue_run_restart(to);
> -
>  switch (ret) {
>  case COROUTINE_YIELD:
> -return;
> +break;
>  case COROUTINE_TERMINATE:
>  trace_qemu_coroutine_terminate(to);
> +qemu_co_queue_run_restart(to);
>  coroutine_delete(to);
> -return;
> +break;
>  default:
>  abort();
>  }
> +
> +return ret;
>  }
>  
>  void qemu_coroutine_enter(Coroutine *co, void *opaque)
> @@ -133,6 +134,8 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
>  co->caller = self;
>      co->entry_arg = opaque;
>  coroutine_swap(self, co);
> +if (ret == COROUTINE_YIELD)
> +qemu_co_queue_run_restart(co);
>  }
>  
>  void coroutine_fn qemu_coroutine_yield(void)
> 

-- 
Bin Wu




[Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue

2015-02-08 Thread Bin Wu
From: Bin Wu 

We tested VM migration together with their disk images via drive_mirror. During
the migration, the two VMs copied large files between each other. During the
test, a segfault occurred. The stack was as follows:

(gdb) bt
qemu-coroutine-lock.c:66
to=0x7fa5a1798648) at qemu-coroutine.c:97
request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
block/nbd-client.c:165
sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
block/nbd-client.c:262
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
block/nbd-client.c:296
nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
coroutine-ucontext.c:121

After analyzing the stack and reviewing the code, we find that
qemu_co_queue_run_restart should not be placed in the coroutine_swap function,
which can be invoked by both qemu_coroutine_enter and qemu_coroutine_yield. Only
qemu_coroutine_enter needs to restart the co_queue.

The error scenario is as follows: coroutine C1 enters C2, C2 yields
back to C1, then C1 terminates and the related coroutine memory
becomes invalid. After a while, the C2 coroutine is entered again.
At this point, C1 is passed as a parameter to
qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
accesses invalid memory and a segfault occurs.

The qemu_co_queue_run_restart function re-enters coroutines waiting
in the co_queue. However, this function should only be used in the
qemu_coroutine_enter context. Only in this context, when the current
coroutine gets execution control again (after the execution of
qemu_coroutine_switch), can we restart the target coroutine, because the
target coroutine has either yielded back to the current coroutine or has
terminated.

At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
but we found we cannot access the target coroutine there if it has terminated.

Signed-off-by: Bin Wu 
---
 qemu-coroutine.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 525247b..cc0bdfa 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co)
 qemu_coroutine_delete(co);
 }
 
-static void coroutine_swap(Coroutine *from, Coroutine *to)
+static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
 {
 CoroutineAction ret;
 
 ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
 
-qemu_co_queue_run_restart(to);
-
 switch (ret) {
 case COROUTINE_YIELD:
-return;
+break;
 case COROUTINE_TERMINATE:
 trace_qemu_coroutine_terminate(to);
+qemu_co_queue_run_restart(to);
 coroutine_delete(to);
-return;
+break;
 default:
 abort();
 }
+
+return ret;
 }
 
 void qemu_coroutine_enter(Coroutine *co, void *opaque)
 {
 Coroutine *self = qemu_coroutine_self();
+CoroutineAction ret;
 
 trace_qemu_coroutine_enter(self, co, opaque);
 
@@ -132,7 +134,9 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
 
 co->caller = self;
 co->entry_arg = opaque;
-coroutine_swap(self, co);
+ret = coroutine_swap(self, co);
+if (ret == COROUTINE_YIELD)
+qemu_co_queue_run_restart(co);
 }
 
 void coroutine_fn qemu_coroutine_yield(void)
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding bug

2015-02-09 Thread Bin Wu
On 2015/2/9 16:12, Fam Zheng wrote:
> On Sat, 02/07 17:51, w00214312 wrote:
>> From: Bin Wu 
>>
>> When a coroutine holds a lock, other coroutines who want to get
>> the lock must wait on a co_queue by adding themselves to the
>> CoQueue. However, if a waiting coroutine is woken up with the
>> lock still be holding by other coroutine, this waiting coroutine
> 
> Could you explain who wakes up the waiting coroutine? Maybe the bug is that
> it shouldn't be awaken in the first place.
> 

During the mirror phase with nbd devices, if we send a cancel command or the
physical network breaks down, the source QEMU process will receive a readable
event and the main loop will invoke nbd_reply_ready to deal with it. This
function finds that the connection is down and then goes into
nbd_teardown_connection, which wakes up all working coroutines via
nbd_recv_coroutines_enter_all. These coroutines include the one that holds
the sending lock, the ones waiting for the lock, and the ones waiting to
receive messages.

I think the purpose of nbd_recv_coroutines_enter_all is to terminate all waiting
coroutines by waking all of them up. If a coroutine waiting for the lock is
allowed to be woken up, this implementation is OK. If not, we need to
distinguish the coroutines waiting to receive messages from the ones waiting
for the lock.

In my opinion, allowing the coroutines waiting for a lock to be woken up would
be more robust :>
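
For reference, nbd_recv_coroutines_enter_all presumably just walks the
per-request coroutine slots and re-enters each one, roughly like the sketch
below (the exact code in block/nbd-client.c may differ):

static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
    int i;

    /* Re-enter every in-flight request coroutine so it can observe the
     * torn-down connection and bail out with an error. */
    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
        if (s->recv_coroutine[i]) {
            qemu_coroutine_enter(s->recv_coroutine[i], NULL);
        }
    }
}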

>> will add itself to the co_queue again. Latter, when the lock
>> is released, a coroutine re-enter will occur.
>>
>> We need to determine whether a coroutine is alread in the co_queue
> 
> s/alread/already/
> 
> Fam
> 

Thanks, my mistake.

>> before adding it to the waiting queue.
>>
>> Signed-off-by: Bin Wu 
>> ---
>>  include/block/coroutine_int.h | 1 +
>>  qemu-coroutine-lock.c | 6 +-
>>  qemu-coroutine.c  | 1 +
>>  3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/block/coroutine_int.h b/include/block/coroutine_int.h
>> index f133d65..c524990 100644
>> --- a/include/block/coroutine_int.h
>> +++ b/include/block/coroutine_int.h
>> @@ -42,6 +42,7 @@ struct Coroutine {
>>  /* Coroutines that should be woken up when we yield or terminate */
>>  QTAILQ_HEAD(, Coroutine) co_queue_wakeup;
>>  QTAILQ_ENTRY(Coroutine) co_queue_next;
>> +bool in_co_queue;
>>  };
>>  
>>  Coroutine *qemu_coroutine_new(void);
>> diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
>> index e4860ae..d256f53 100644
>> --- a/qemu-coroutine-lock.c
>> +++ b/qemu-coroutine-lock.c
>> @@ -36,7 +36,10 @@ void qemu_co_queue_init(CoQueue *queue)
>>  void coroutine_fn qemu_co_queue_wait(CoQueue *queue)
>>  {
>>  Coroutine *self = qemu_coroutine_self();
>> -QTAILQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
>> +if (!self->in_co_queue) {
>> +QTAILQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
>> +self->in_co_queue = true;
>> +}
>>  qemu_coroutine_yield();
>>  assert(qemu_in_coroutine());
>>  }
>> @@ -71,6 +74,7 @@ static bool qemu_co_queue_do_restart(CoQueue *queue, bool 
>> single)
>>  
>>  while ((next = QTAILQ_FIRST(&queue->entries)) != NULL) {
>>  QTAILQ_REMOVE(&queue->entries, next, co_queue_next);
>> +next->in_co_queue = false;
>>  QTAILQ_INSERT_TAIL(&self->co_queue_wakeup, next, co_queue_next);
>>  trace_qemu_co_queue_next(next);
>>          if (single) {
>> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
>> index 525247b..a103721 100644
>> --- a/qemu-coroutine.c
>> +++ b/qemu-coroutine.c
>> @@ -75,6 +75,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
>>  }
>>  
>>  co->entry = entry;
>> +co->in_co_queue = false;
>>  QTAILQ_INIT(&co->co_queue_wakeup);
>>  return co;
>>  }
>> -- 
>> 1.7.12.4
>>
>>
> 
> .
> 

-- 
Bin Wu




Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug

2015-02-09 Thread Bin Wu
On 2015/2/9 17:23, Paolo Bonzini wrote:
> 
> 
> On 07/02/2015 10:51, w00214312 wrote:
>> From: Bin Wu 
>>
>> When we test the drive_mirror between different hosts by ndb devices, 
>> we find that, during the cancel phase the qemu process crashes sometimes.
>> By checking the crash core file, we find the stack as follows, which means
>> a coroutine re-enter error occurs:
> 
> This bug probably can be fixed simply by delaying the setting of
> recv_coroutine.
> 
> What are the symptoms if you only apply your "qemu-coroutine-lock: fix
> co_queue multi-adding bug" patch but not "qemu-coroutine: fix
> qemu_co_queue_run_restart error"?
> 
> Can you try the patch below?  (Compile-tested only).
> 

Yes, I think this patch can solve the problem too. I will try the patch later.

> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 6e1c97c..23d6a71 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
>  QEMUIOVector *qiov, int offset)
>  {
>  AioContext *aio_context;
> -int rc, ret;
> +int rc, ret, i;
>  
>  qemu_co_mutex_lock(&s->send_mutex);
> +
> +for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> +if (s->recv_coroutine[i] == NULL) {
> +s->recv_coroutine[i] = qemu_coroutine_self();
> +break;
> +}
> +}
> +
> +assert(i < MAX_NBD_REQUESTS);
> +request->handle = INDEX_TO_HANDLE(s, i);
>  s->send_coroutine = qemu_coroutine_self();
> +
>  aio_context = bdrv_get_aio_context(s->bs);
>  aio_set_fd_handler(aio_context, s->sock,
> nbd_reply_ready, nbd_restart_write, s);
> @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
>  static void nbd_coroutine_start(NbdClientSession *s,
> struct nbd_request *request)
>  {
> -int i;
> -
>  /* Poor man semaphore.  The free_sema is locked when no other request
>   * can be accepted, and unlocked after receiving one reply.  */
>  if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
> @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
>  }
>  s->in_flight++;
>  
> -for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> -if (s->recv_coroutine[i] == NULL) {
> -s->recv_coroutine[i] = qemu_coroutine_self();
> -        break;
> -}
> -}
> -
> -assert(i < MAX_NBD_REQUESTS);
> -request->handle = INDEX_TO_HANDLE(s, i);
> +/* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
>  }
>  
>  static void nbd_coroutine_end(NbdClientSession *s,
> 
> 
> 

-- 
Bin Wu




Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue

2015-02-09 Thread Bin Wu
On 2015/2/9 22:48, Stefan Hajnoczi wrote:
> On Mon, Feb 09, 2015 at 02:50:39PM +0800, Bin Wu wrote:
>> From: Bin Wu 
>>
>> We tested VMs migration with their disk images by drive_mirror. With
>> migration, two VMs copyed large files between each other. During the
>> test, a segfault occured. The stack was as follow:
>>
>> (gdb) bt
>> qemu-coroutine-lock.c:66
>> to=0x7fa5a1798648) at qemu-coroutine.c:97
>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
>> block/nbd-client.c:165
>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
>> block/nbd-client.c:262
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
>> block/nbd-client.c:296
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> coroutine-ucontext.c:121
> 
> This backtrace is incomplete.  Where are the function names?  The
> parameter lists appear incomplete too.
> 

I put the stack in the git log, so some lines were lost. Here is the complete
backtrace:
(gdb) bt
#0  0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at
qemu-coroutine-lock.c:66
#1  0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160,
to=0x7fa5a1798648) at qemu-coroutine.c:97
#2  0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140
#3  0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0,
request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
block/nbd-client.c:165
#4  0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
block/nbd-client.c:262
#5  0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
block/nbd-client.c:296
#6  0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
#7  0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0,
req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
#8  0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
#9  0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
#10 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
#11 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
#12 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110,
req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
#13 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
#14 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
#15 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706
#16 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at
coroutine-ucontext.c:121
#17 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6
#18 0x in ?? ()


>> After analyzing the stack and reviewing the code, we find the
>> qemu_co_queue_run_restart should not be put in the coroutine_swap function 
>> which
>> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only
>> qemu_coroutine_enter needs to restart the co_queue.
>>
>> The error scenario is as follow: coroutine C1 enters C2, C2 yields
>> back to C1, then C1 ternimates and the related coroutine memory
>> becomes invalid. After a while, the C2 coroutine is entered again.
>> At this point, C1 is used as a parameter passed to
>> qemu_co_queue_run_restart. Therefore, qemu

Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue

2015-02-09 Thread Bin Wu
On 2015/2/9 17:09, Paolo Bonzini wrote:
> 
> 
> On 09/02/2015 07:50, Bin Wu wrote:
>> From: Bin Wu 
>>
>> We tested VMs migration with their disk images by drive_mirror. With
>> migration, two VMs copyed large files between each other. During the
>> test, a segfault occured. The stack was as follow:
>>
>> (gdb) bt
>> qemu-coroutine-lock.c:66
>> to=0x7fa5a1798648) at qemu-coroutine.c:97
>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
>> block/nbd-client.c:165
>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
>> block/nbd-client.c:262
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
>> block/nbd-client.c:296
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> coroutine-ucontext.c:121
>>
>> After analyzing the stack and reviewing the code, we find the
>> qemu_co_queue_run_restart should not be put in the coroutine_swap function 
>> which
>> can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only
>> qemu_coroutine_enter needs to restart the co_queue.
>>
>> The error scenario is as follow: coroutine C1 enters C2, C2 yields
>> back to C1, then C1 ternimates and the related coroutine memory
>> becomes invalid. After a while, the C2 coroutine is entered again.
>> At this point, C1 is used as a parameter passed to
>> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
>> accesses an invalid memory and a segfault error ocurrs.
>>
>> The qemu_co_queue_run_restart function re-enters coroutines waiting
>> in the co_queue. However, this function should be only used int the
>> qemu_coroutine_enter context. Only in this context, when the current
>> coroutine gets execution control again(after the execution of
>> qemu_coroutine_switch), we can restart the target coutine because the
>> target coutine has yielded back to the current coroutine or it has
>> terminated.
> 
> qemu_coroutine_yield can be executed for other reasons than locks.  In
> those cases, it is correct to call qemu_co_queue_run_restart.  I think
> it's an NBD bug.
> 
> Paolo
> 

Maybe I didn't describe the error scenario clearly, but it's a normal coroutine
use case, not an NBD bug. Please see Stefan's reply, thanks.

>> First we want to put qemu_co_queue_run_restart in qemu_coroutine_enter,
>> but we find we can not access the target coroutine if it terminates.
>>
>> Signed-off-by: Bin Wu 
>> ---
>>  qemu-coroutine.c | 16 ++--
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
>> index 525247b..cc0bdfa 100644
>> --- a/qemu-coroutine.c
>> +++ b/qemu-coroutine.c
>> @@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co)
>>  qemu_coroutine_delete(co);
>>  }
>>  
>> -static void coroutine_swap(Coroutine *from, Coroutine *to)
>> +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
>>  {
>>  CoroutineAction ret;
>>  
>>  ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
>>  
>> -qemu_co_queue_run_restart(to);
>> -
>>  switch (ret) {
>>  case COROUTINE_YIELD:
>> -return;
>> +break;
>>  case COROUTINE_TERMINATE:
>>  trace_qemu_coroutine_terminate(to);
>> +qemu_co_queue_run_restart(to);
>>  coroutine_delete(to);
>> -return;
>> +break;
>>  default:
>>  abort();
>>  }
>> +
>> +    return ret;
>>  }
>>  
>>  void qemu_coroutine_enter(Coroutine *co, void *opaque)
>>  {
>>  Coroutine *self = qemu_coroutine_self();
>> +CoroutineAction ret;
>>  
>>  trace_qemu_coroutine_enter(self, co, opaque);
>>  
>> @@ -132,7 +134,9 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
>>  
>>  co->caller = self;
>>  co->entry_arg = opaque;
>> -coroutine_swap(self, co);
>> +ret = coroutine_swap(self, co);
>> +if (ret == COROUTINE_YIELD)
>> +qemu_co_queue_run_restart(co);
>>  }
>>  
>>  void coroutine_fn qemu_coroutine_yield(void)
>>
> 
> 
> .
> 

-- 
Bin Wu




Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding bug

2015-02-09 Thread Bin Wu
On 2015/2/9 18:12, Kevin Wolf wrote:
> Am 09.02.2015 um 10:36 hat Bin Wu geschrieben:
>> On 2015/2/9 16:12, Fam Zheng wrote:
>>> On Sat, 02/07 17:51, w00214312 wrote:
>>>> From: Bin Wu 
>>>>
>>>> When a coroutine holds a lock, other coroutines who want to get
>>>> the lock must wait on a co_queue by adding themselves to the
>>>> CoQueue. However, if a waiting coroutine is woken up with the
>>>> lock still be holding by other coroutine, this waiting coroutine
>>>
>>> Could you explain who wakes up the waiting coroutine? Maybe the bug is that
>>> it shouldn't be awaken in the first place.
>>>
>>
>> During the mirror phase with nbd devices, if we send a cancel command or
>> physical network breaks down, the source qemu process will receive a readable
>> event and the main loop will invoke nbd_reply_ready to deal with it. This
>> function finds the connection is down and then goes into
>> nbd_teardown_connection. nbd_teardown_connection wakes up all working 
>> coroutines
>> by nbd_recv_coroutines_enter_all. These coroutines include the one which 
>> holds
>> the sending lock, the ones which wait for the lock, and the ones which wait 
>> for
>> receiving messages.
> 
> This is the bug. It's not allowed to reenter a coroutine if you don't
> know its state. NBD needs a fix, not the the generic coroutine
> infrastructure.
> 
> If we want to change anything in the lock implementation, it should be
> adding an assertion to catch such violations of the rule. (Untested, but
> I think the assertion should hold true.)
> 
> Kevin
> 
> diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
> index e4860ae..25fc111 100644
> --- a/qemu-coroutine-lock.c
> +++ b/qemu-coroutine-lock.c
> @@ -123,9 +123,8 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex)
>  
>  trace_qemu_co_mutex_lock_entry(mutex, self);
>  
> -while (mutex->locked) {
> -qemu_co_queue_wait(&mutex->queue);
> -    }
> +qemu_co_queue_wait(&mutex->queue);
> +assert(!mutex->locked);
>  
>  mutex->locked = true;
> 
> .
> 

I see. Paolo's patch in his first reply can fix the bug in NBD (not in the
coroutine code). I will complete that patch, do some tests, and send another
version later.

-- 
Bin Wu




Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue

2015-02-09 Thread Bin Wu
On 2015/2/10 11:16, Wen Congyang wrote:
> On 02/09/2015 10:48 PM, Stefan Hajnoczi wrote:
>> On Mon, Feb 09, 2015 at 02:50:39PM +0800, Bin Wu wrote:
>>> From: Bin Wu 
>>>
>>> We tested VMs migration with their disk images by drive_mirror. With
>>> migration, two VMs copyed large files between each other. During the
>>> test, a segfault occured. The stack was as follow:
>>>
>>> (gdb) bt
>>> qemu-coroutine-lock.c:66
>>> to=0x7fa5a1798648) at qemu-coroutine.c:97
>>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
>>> block/nbd-client.c:165
>>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
>>> block/nbd-client.c:262
>>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
>>> block/nbd-client.c:296
>>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
>>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>>> flags=0) at block.c:3321
>>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) 
>>> at
>>> block.c:3447
>>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>>> 0)) at
>>> block.c:3471
>>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
>>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
>>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>>> flags=0) at block.c:3321
>>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) 
>>> at
>>> block.c:3447
>>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>>> 0)) at
>>> block.c:3471
>>> coroutine-ucontext.c:121
>>
>> This backtrace is incomplete.  Where are the function names?  The
>> parameter lists appear incomplete too.
>>
>>> After analyzing the stack and reviewing the code, we find that
>>> qemu_co_queue_run_restart should not be put in the coroutine_swap function,
>>> which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only
>>> qemu_coroutine_enter needs to restart the co_queue.
>>>
>>> The error scenario is as follows: coroutine C1 enters C2, C2 yields
>>> back to C1, then C1 terminates and the related coroutine memory
>>> becomes invalid. After a while, the C2 coroutine is entered again.
>>> At this point, C1 is used as a parameter passed to
>>> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
>>> accesses invalid memory and a segfault occurs.
>>>
>>> The qemu_co_queue_run_restart function re-enters coroutines waiting
>>> in the co_queue. However, this function should only be used in the
>>> qemu_coroutine_enter context. Only in this context, when the current
>>> coroutine gets execution control again (after the execution of
>>> qemu_coroutine_switch), can we restart the target coroutine, because the
>>> target coroutine has either yielded back to the current coroutine or
>>> terminated.
>>>
>>> At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
>>> but we found we cannot access the target coroutine there if it has terminated.
>>
>> This example captures the scenario you describe:
>>
>> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
>> index 525247b..883cbf5 100644
>> --- a/qemu-coroutine.c
>> +++ b/qemu-coroutine.c
>> @@ -103,7 +103,10 @@ static void coroutine_swap(Coroutine *from, Coroutine *to)
>>  {
>>  CoroutineAction ret;
>>  
>> +fprintf(stderr, "> %s from %p to %p\n", __func__, from, to);
>>  ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
>> +fprintf(stderr, "< %s from %p to %p switch %s\n", __func__, from, to,
>> +ret == COROUTINE_YIELD ? "yield" : "terminate");
>>  
>>  qemu_co_queue_run_restart(to);
>>  
>> @@ -111,6 +114,7 @@ static void coroutine_swap(Coroutine *from, Coroutine *to)
>>  case COROUTINE_YIELD:
>>  return;
>>  case COROUTINE_TERMINATE:
>> +fprintf(stderr, "coroutine_delete %p\n", to);
>>  trace_qemu_coroutine_terminate(to);
>>  coroutine_delete(to);
>>  return;
>> diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
>> index 27d1b6f..d44c428 100644
>> --- a/tests/test-coroutine.c
>> +++ b/tests/test-coroutine.c
>> @@ -13,6 +13,7 @@
>>  
>>

[Qemu-devel] [PATCH v3] qemu-coroutine: segfault when restarting co_queue

2015-02-09 Thread Bin Wu
From: Bin Wu 

We tested VM migration, with the disk images mirrored by drive_mirror.
During the migration, two VMs copied large files between each other. During
the test, a segfault occurred. The stack was as follows:

00)  0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at
qemu-coroutine-lock.c:66
01)  0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160,
to=0x7fa5a1798648) at qemu-coroutine.c:97
02)  0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140
03)  0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0,
request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
block/nbd-client.c:165
04)  0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
block/nbd-client.c:262
05)  0x7fa5a0c9e9dd in nbd_client_session_co_writev (client=0x7fa5a1a3cfd0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
block/nbd-client.c:296
06)  0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
07)  0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0,
req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
08)  0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
09)  0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
10) 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
11) 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, sector_num=8552704,
nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
12) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110,
req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
flags=0) at block.c:3321
13) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110,
offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3447
14) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110,
sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
block.c:3471
15) 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at block.c:4706
16) 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at
coroutine-ucontext.c:121
17) 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from /lib64/libc.so.6
18) 0x in ?? ()

After analyzing the stack and reviewing the code, we find that
qemu_co_queue_run_restart should not be put in the coroutine_swap function,
which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only
qemu_coroutine_enter needs to restart the co_queue.

The error scenario is as follows: coroutine C1 enters C2, C2 yields
back to C1, then C1 terminates and the related coroutine memory
becomes invalid. After a while, the C2 coroutine is entered again.
At this point, C1 is used as a parameter passed to
qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
accesses invalid memory and a segfault occurs.

The qemu_co_queue_run_restart function re-enters coroutines waiting
in the co_queue. However, this function should only be used in the
qemu_coroutine_enter context. Only in this context, when the current
coroutine gets execution control again (after the execution of
qemu_coroutine_switch), can we restart the target coroutine, because the
target coroutine has either yielded back to the current coroutine or
terminated.

At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
but we found we cannot access the target coroutine there if it has terminated.

Signed-off-by: Bin Wu 
---
v3: add a test case (provided by Stefan) to the coroutine test code.

v2: fix undefined variable.
---
 qemu-coroutine.c   | 17 +++--
 tests/test-coroutine.c | 31 +++
 2 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 525247b..c8e280d 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co)
 qemu_coroutine_delete(co);
 }
 
-static void coroutine_swap(Coroutine *from, Coroutine *to)
+static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
 {
 CoroutineAction ret;
 
 ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
 
-qemu_co_queue_run_restart(to);
-
 switch (ret) {
 case COROUTINE_YIELD:
-return;
+break;
 case COROUTINE_TERMINATE:
 trace_qemu_coroutine_terminate(to);
+qemu_co_queue_run_restart(to);
 coroutine_delete(to);
-return;
+break;
 default:
 abort();
 }
+
+return ret;
 }
 
 void qemu_coroutine_enter(Coroutine *co, void *opaque

Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug

2015-02-09 Thread Bin Wu
On 2015/2/9 17:23, Paolo Bonzini wrote:
> 
> 
> On 07/02/2015 10:51, w00214312 wrote:
>> From: Bin Wu 
>>
>> When we test drive_mirror between different hosts with NBD devices,
>> we find that during the cancel phase the qemu process sometimes crashes.
>> By checking the crash core file, we find the following stack, which indicates
>> a coroutine re-enter error:
> 
> This bug probably can be fixed simply by delaying the setting of
> recv_coroutine.
> 
> What are the symptoms if you only apply your "qemu-coroutine-lock: fix
> co_queue multi-adding bug" patch but not "qemu-coroutine: fix
> qemu_co_queue_run_restart error"?

These two patches solve two different problems:
- "qemu-coroutine-lock: fix co_queue multi-adding bug" solves the coroutine
re-enter problem, which shows up when we send a cancel command right after
drive_mirror has started.
- "qemu-coroutine: fix qemu_co_queue_run_restart error" solves the segfault
that occurs during the drive_mirror phase when two VMs copy large files between
each other.
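
To illustrate the multi-adding mechanism: a coroutine that wants the send lock
parks itself with qemu_co_queue_wait(), which at the time looked roughly like
this (a paraphrased sketch, not the exact upstream code):

void coroutine_fn qemu_co_queue_wait(CoQueue *queue)
{
    Coroutine *self = qemu_coroutine_self();

    /* Queue ourselves and give up the CPU; we expect to be restarted
     * by whoever releases the lock. */
    QTAILQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
    qemu_coroutine_yield();
}

If the parked coroutine is entered from outside (by
nbd_recv_coroutines_enter_all) while the mutex is still held, the while loop in
qemu_co_mutex_lock() calls qemu_co_queue_wait() again, so the same coroutine is
queued a second time; when the lock is eventually released, it is re-entered
while it is still active and qemu aborts with the re-enter error.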

> 
> Can you try the patch below?  (Compile-tested only).
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 6e1c97c..23d6a71 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
>  QEMUIOVector *qiov, int offset)
>  {
>  AioContext *aio_context;
> -int rc, ret;
> +int rc, ret, i;
>  
>  qemu_co_mutex_lock(&s->send_mutex);
> +
> +for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> +if (s->recv_coroutine[i] == NULL) {
> +s->recv_coroutine[i] = qemu_coroutine_self();
> +break;
> +}
> +}
> +
> +assert(i < MAX_NBD_REQUESTS);
> +request->handle = INDEX_TO_HANDLE(s, i);
>  s->send_coroutine = qemu_coroutine_self();
> +
>  aio_context = bdrv_get_aio_context(s->bs);
>  aio_set_fd_handler(aio_context, s->sock,
> nbd_reply_ready, nbd_restart_write, s);
> @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
>  static void nbd_coroutine_start(NbdClientSession *s,
> struct nbd_request *request)
>  {
> -int i;
> -
>  /* Poor man semaphore.  The free_sema is locked when no other request
>   * can be accepted, and unlocked after receiving one reply.  */
>  if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
> @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
>  }
>  s->in_flight++;
>  
> -for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> -if (s->recv_coroutine[i] == NULL) {
> -s->recv_coroutine[i] = qemu_coroutine_self();
> -break;
> -}
> -}
> -
> -assert(i < MAX_NBD_REQUESTS);
> -request->handle = INDEX_TO_HANDLE(s, i);
> +/* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
>  }
>  
>  static void nbd_coroutine_end(NbdClientSession *s,
> 
> 
> 

-- 
Bin Wu




[Qemu-devel] [PATCH v2] nbd: fix the co_queue multi-adding bug

2015-02-09 Thread Bin Wu
From: Bin Wu 

When we tested VM migration between different hosts with NBD
devices, we found that if we sent a cancel command right after the drive_mirror
had started, a coroutine re-enter error would occur. The stack
was as follows:

(gdb) bt
00)  0x7fdfc744d885 in raise () from /lib64/libc.so.6
01)  0x7fdfc744ee61 in abort () from /lib64/libc.so.6
02)  0x7fdfca467cc5 in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0)
at qemu-coroutine.c:118
03)  0x7fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedb400) at
qemu-coroutine-lock.c:59
04)  0x7fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8,
to=0x7fdfcaedb400) at qemu-coroutine.c:96
05)  0x7fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0)
at qemu-coroutine.c:123
06)  0x7fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedbdc0) at
qemu-coroutine-lock.c:59
07)  0x7fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8,
to=0x7fdfcaedbdc0) at qemu-coroutine.c:96
08)  0x7fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedbdc0, opaque=0x0)
at qemu-coroutine.c:123
09)  0x7fdfca4a1fa4 in nbd_recv_coroutines_enter_all (s=0x7fdfcaef7dd0) at
block/nbd-client.c:41
10) 0x7fdfca4a1ff9 in nbd_teardown_connection (client=0x7fdfcaef7dd0) at
block/nbd-client.c:50
11) 0x7fdfca4a20f0 in nbd_reply_ready (opaque=0x7fdfcaef7dd0) at
block/nbd-client.c:92
12) 0x7fdfca45ed80 in aio_dispatch (ctx=0x7fdfcae15e90) at aio-posix.c:144
13) 0x7fdfca45ef1b in aio_poll (ctx=0x7fdfcae15e90, blocking=false) at
aio-posix.c:222
14) 0x7fdfca448c34 in aio_ctx_dispatch (source=0x7fdfcae15e90, callback=0x0,
user_data=0x0) at async.c:212
15) 0x7fdfc8f2f69a in g_main_context_dispatch () from
/usr/lib64/libglib-2.0.so.0
16) 0x7fdfca45c391 in glib_pollfds_poll () at main-loop.c:190
17) 0x7fdfca45c489 in os_host_main_loop_wait (timeout=1483677098) at
main-loop.c:235
18) 0x7fdfca45c57b in main_loop_wait (nonblocking=0) at main-loop.c:484
19) 0x7fdfca25f403 in main_loop () at vl.c:2249
20) 0x7fdfca266fc2 in main (argc=42, argv=0x7517d638,
envp=0x7517d790) at vl.c:4814

We find that the nbd_recv_coroutines_enter_all function (triggered by a cancel
command or by the network connection breaking down) will enter a coroutine which
is waiting for the sending lock. If the lock is still held by another coroutine,
the entered coroutine will be added to the co_queue again. Later, when the
lock is released, a coroutine re-enter error will occur.

This bug can be fixed simply by delaying the setting of recv_coroutine, as
suggested by Paolo. After applying this patch, we tested the cancel
operation during the mirror phase in a loop for more than 5 hours and everything
was fine. Without this patch, a coroutine re-enter error occurs within 5 minutes.

Signed-off-by: Bin Wu 
---
v2: fix the coroutine re-enter bug in the NBD code, not in the coroutine
infrastructure, as suggested by Paolo and Kevin.
---
 block/nbd-client.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 28bfb62..1b0f7c7 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
 QEMUIOVector *qiov, int offset)
 {
 AioContext *aio_context;
-int rc, ret;
+int rc, ret, i;
 
 qemu_co_mutex_lock(&s->send_mutex);
+
+for (i = 0; i < MAX_NBD_REQUESTS; i++) {
+if (s->recv_coroutine[i] == NULL) {
+s->recv_coroutine[i] = qemu_coroutine_self();
+break;
+}
+}
+
+assert(i < MAX_NBD_REQUESTS);
+request->handle = INDEX_TO_HANDLE(s, i);
 s->send_coroutine = qemu_coroutine_self();
+
 aio_context = bdrv_get_aio_context(s->bs);
 aio_set_fd_handler(aio_context, s->sock,
nbd_reply_ready, nbd_restart_write, s);
@@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
 static void nbd_coroutine_start(NbdClientSession *s,
struct nbd_request *request)
 {
-int i;
-
 /* Poor man semaphore.  The free_sema is locked when no other request
  * can be accepted, and unlocked after receiving one reply.  */
 if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
@@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
 }
 s->in_flight++;
 
-for (i = 0; i < MAX_NBD_REQUESTS; i++) {
-if (s->recv_coroutine[i] == NULL) {
-s->recv_coroutine[i] = qemu_coroutine_self();
-break;
-}
-}
-
-assert(i < MAX_NBD_REQUESTS);
-request->handle = INDEX_TO_HANDLE(s, i);
+/* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
 }
 
 static void nbd_coroutine_end(NbdClientSession *s,
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH v3] qemu-coroutine: segfault when restarting co_queue

2015-02-10 Thread Bin Wu

On 2015/2/10 18:32, Kevin Wolf wrote:
> On 10.02.2015 at 06:16, Bin Wu wrote:
>> From: Bin Wu 
>>
>> We tested VM migration, with the disk images mirrored by drive_mirror.
>> During the migration, two VMs copied large files between each other. During
>> the test, a segfault occurred. The stack was as follows:
>>
>> 00)  0x7fa5a0c63fc5 in qemu_co_queue_run_restart (co=0x7fa5a1798648) at
>> qemu-coroutine-lock.c:66
>> 01)  0x7fa5a0c63bed in coroutine_swap (from=0x7fa5a178f160,
>> to=0x7fa5a1798648) at qemu-coroutine.c:97
>> 02)  0x7fa5a0c63dbf in qemu_coroutine_yield () at qemu-coroutine.c:140
>> 03)  0x7fa5a0c9e474 in nbd_co_receive_reply (s=0x7fa5a1a3cfd0,
>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
>> block/nbd-client.c:165
>> 04)  0x7fa5a0c9e8b5 in nbd_co_writev_1 (client=0x7fa5a1a3cfd0,
>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
>> block/nbd-client.c:262
>> 05)  0x7fa5a0c9e9dd in nbd_client_session_co_writev 
>> (client=0x7fa5a1a3cfd0,
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
>> block/nbd-client.c:296
>> 06)  0x7fa5a0c9dda1 in nbd_co_writev (bs=0x7fa5a198fcb0, 
>> sector_num=8552704,
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
>> 07)  0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198fcb0,
>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> 08)  0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198fcb0,
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> 09)  0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198fcb0,
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> 10) 0x7fa5a0c51074 in bdrv_co_writev (bs=0x7fa5a198fcb0, 
>> sector_num=8552704,
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
>> 11) 0x7fa5a0c652ec in raw_co_writev (bs=0x7fa5a198c110, 
>> sector_num=8552704,
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
>> 12) 0x7fa5a0c509a4 in bdrv_aligned_pwritev (bs=0x7fa5a198c110,
>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> 13) 0x7fa5a0c50f3f in bdrv_co_do_pwritev (bs=0x7fa5a198c110,
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> 14) 0x7fa5a0c51007 in bdrv_co_do_writev (bs=0x7fa5a198c110,
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 
>> 0)) at
>> block.c:3471
>> 15) 0x7fa5a0c542b3 in bdrv_co_do_rw (opaque=0x7fa5a17a) at 
>> block.c:4706
>> 16) 0x7fa5a0c64e6e in coroutine_trampoline (i0=-1585909408, i1=32677) at
>> coroutine-ucontext.c:121
>> 17) 0x7fa59dc5aa50 in __correctly_grouped_prefixwc () from 
>> /lib64/libc.so.6
>> 18) 0x in ?? ()
>>
>> After analyzing the stack and reviewing the code, we find that
>> qemu_co_queue_run_restart should not be put in the coroutine_swap function,
>> which can be invoked by qemu_coroutine_enter or qemu_coroutine_yield. Only
>> qemu_coroutine_enter needs to restart the co_queue.
>>
>> The error scenario is as follows: coroutine C1 enters C2, C2 yields
>> back to C1, then C1 terminates and the related coroutine memory
>> becomes invalid. After a while, the C2 coroutine is entered again.
>> At this point, C1 is used as a parameter passed to
>> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
>> accesses invalid memory and a segfault occurs.
>>
>> The qemu_co_queue_run_restart function re-enters coroutines waiting
>> in the co_queue. However, this function should only be used in the
>> qemu_coroutine_enter context. Only in this context, when the current
>> coroutine gets execution control again (after the execution of
>> qemu_coroutine_switch), can we restart the target coroutine, because the
>> target coroutine has either yielded back to the current coroutine or
>> terminated.
>>
>> At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
>> but we found we cannot access the target coroutine there if it has terminated.
>>
>> Signed-off-by: Bin Wu 
> 
> As I replied for v2, I'll send a series with a simpler fix and some
> cleanup that should result in a nicer design in the end.
> 
> Kevin
> 
> .
> 

All right.

-- 
Bin Wu




[Qemu-devel] [PATCH] mirror: hold aio_context before bdrv_drain

2015-03-31 Thread Bin Wu
From: Bin Wu 

Signed-off-by: Bin Wu 
---
 block/mirror.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 4056164..08372df 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -530,7 +530,9 @@ static void coroutine_fn mirror_run(void *opaque)
  * mirror_populate runs.
  */
 trace_mirror_before_drain(s, cnt);
+aio_context_acquire(bdrv_get_aio_context(bs));
 bdrv_drain(bs);
+aio_context_release(bdrv_get_aio_context(bs));
 cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
 }
 
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH] mirror: hold aio_context before bdrv_drain

2015-04-01 Thread Bin Wu

On 2015/4/1 16:19, Fam Zheng wrote:
> On Wed, 04/01 12:42, Bin Wu wrote:
>> From: Bin Wu 
> 
> What's the issue you are fixing? I think the coroutine is already running in
> the AioContext of bs.
> 
> Fam
> 
In the current implementation, bdrv_drain should be called inside a critical
section, as suggested by the comment above the function: "Note that unlike
bdrv_drain_all(), the caller must hold the BlockDriverState AioContext".
However, the mirror coroutine that starts with mirror_run doesn't do this. I
only found that qmp_drive_mirror protects the AioContext, but that is outside
the scope of the mirror coroutine.

Bin

>>
>> Signed-off-by: Bin Wu 
>> ---
>>  block/mirror.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 4056164..08372df 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -530,7 +530,9 @@ static void coroutine_fn mirror_run(void *opaque)
>>   * mirror_populate runs.
>>   */
>>  trace_mirror_before_drain(s, cnt);
>> +aio_context_acquire(bdrv_get_aio_context(bs));
>>  bdrv_drain(bs);
>> +aio_context_release(bdrv_get_aio_context(bs));
>>  cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
>>  }
>>  
>> -- 
>> 1.7.12.4
>>
>>
>>
> 
> .
> 




Re: [Qemu-devel] [Qemu-block] [PATCH] mirror: hold aio_context before bdrv_drain

2015-04-01 Thread Bin Wu
On 2015/4/1 19:59, Stefan Hajnoczi wrote:
> On Wed, Apr 01, 2015 at 04:49:39PM +0800, Bin Wu wrote:
>>
>> On 2015/4/1 16:19, Fam Zheng wrote:
>>> On Wed, 04/01 12:42, Bin Wu wrote:
>>>> From: Bin Wu 
>>>
>>> What's the issue you are fixing? I think the coroutine is already running in
>>> the AioContext of bs.
>>>
>>> Fam
>>>
>> In the current implementation, bdrv_drain should be called inside a critical
>> section, as suggested by the comment above the function: "Note that unlike
>> bdrv_drain_all(), the caller must hold the BlockDriverState AioContext".
>>
>> However, the mirror coroutine that starts with mirror_run doesn't do this. I
>> only found that qmp_drive_mirror protects the AioContext, but that is outside
>> the scope of the mirror coroutine.
> 
> There are 3 possibilities:
> 
> 1. qmp_drive_mirror() under QEMU main loop thread.  AioContext is held.
> 
> 2. IOThread aio_poll().  AioContext is held.
> 
> 3. QEMU main loop thread when IOThread (dataplane) is not used.  Here
>the AioContext is the global qemu_aio_context.  We don't need to
>acquire it explicitly, we're protected by the global mutex.
> 
> If #3 was a problem then virtio-blk.c's bdrv_aio_writev() would also be
> a problem, for example.
> 
> This patch is unnecessary.
> 
> Stefan
> 
OK, I see. Thanks
-- 
Bin Wu




Re: [Qemu-devel] [PATCH] virtio-blk: correctly dirty guest memory

2015-04-02 Thread Bin Wu
++) {
-qemu_iovec_add(qiov, tmp_iov[i].iov_base, tmp_iov[i].iov_len);
-}
+/* use the 'in' field to judge whether the request is
+   a merged request */
+merged_request->in = NULL;
+
+qiov = &merged_request->qiov;
+qemu_iovec_init(qiov, niov);

-for (i = start + 1; i < start + num_reqs; i++) {
+for (i = start; i < start + num_reqs; i++) {
 qemu_iovec_concat(qiov, &mrb->reqs[i]->qiov, 0,
   mrb->reqs[i]->qiov.size);
-mrb->reqs[i - 1]->mr_next = mrb->reqs[i];
+if (i > start) {
+mrb->reqs[i - 1]->mr_next = mrb->reqs[i];
+} else {
+merged_request->mr_next = mrb->reqs[i];
+}
+
 nb_sectors += mrb->reqs[i]->qiov.size / BDRV_SECTOR_SIZE;
 }
 assert(nb_sectors == qiov->size / BDRV_SECTOR_SIZE);
@@ -345,14 +348,18 @@ static inline void submit_requests(BlockBackend *blk,
MultiReqBuffer *mrb,
 block_acct_merge_done(blk_get_stats(blk),
   is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ,
   num_reqs - 1);
+} else {
+merged_request = mrb->reqs[start];
+qiov = &mrb->reqs[start]->qiov;
+nb_sectors = mrb->reqs[start]->qiov.size / BDRV_SECTOR_SIZE;
 }

 if (is_write) {
 blk_aio_writev(blk, sector_num, qiov, nb_sectors,
-   virtio_blk_rw_complete, mrb->reqs[start]);
+   virtio_blk_rw_complete, merged_request);
 } else {
 blk_aio_readv(blk, sector_num, qiov, nb_sectors,
-  virtio_blk_rw_complete, mrb->reqs[start]);
+  virtio_blk_rw_complete, merged_request);
 }
 }


-- 
Bin Wu




[Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?

2014-12-24 Thread Bin Wu
Hi,

When an IO error happens in the physical device, the qemu block layer supports
error reporting, error ignoring and error stopping (for example, in virtio-blk).
Do we have any way to resend the failed IO?

thanks
-- 
Bin Wu




Re: [Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?

2014-12-24 Thread Bin Wu
On 2014/12/25 10:42, Fam Zheng wrote:
> On Thu, 12/25 09:57, Bin Wu wrote:
>> Hi,
>>
>> When an IO error happens in the physical device, the qemu block layer supports
>> error reporting, error ignoring and error stopping (for example, in
>> virtio-blk). Do we have any way to resend the failed IO?
> 
> With error stop, the request is retried after resume.
> 
> Fam
> 
> 

Thank you very much, Fam, I see. Another question: I think error stop is not the
default error handling strategy; how can we configure error stop in the VM XML
file? Could you show me an example? Thanks again :>

-- 
Bin Wu




Re: [Qemu-devel] [question] block: do we have any consideration about adding retry support in error handling?

2014-12-25 Thread Bin Wu
On 2014/12/25 15:19, Fam Zheng wrote:
> On Thu, 12/25 11:46, Bin Wu wrote:
>> On 2014/12/25 10:42, Fam Zheng wrote:
>>> On Thu, 12/25 09:57, Bin Wu wrote:
>>>> Hi,
>>>>
>>>> When an IO error happens in the physical device, the qemu block layer
>>>> supports error reporting, error ignoring and error stopping (for example,
>>>> in virtio-blk). Do we have any way to resend the failed IO?
>>>
>>> With error stop, the request is retried after resume.
>>>
>>> Fam
>>>
>>>
>>
>> Thank you very much, Fam, I see. Another question: I think error stop is not
>> the default error handling strategy; how can we configure error stop in the VM
>> XML file? Could you show me an example? Thanks again :>
> 
> This is a question for libvirt, look for "error_policy":
> 
> https://libvirt.org/formatdomain.html
> 
> Fam
> 
> 

yes,
...

...
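
For completeness, a disk element with error_policy set might look something like
this (an illustrative sketch based on the libvirt documentation Fam linked; the
file path and device names are placeholders):

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' error_policy='stop' rerror_policy='stop'/>
  <source file='/path/to/disk.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>

With error_policy='stop' (and rerror_policy='stop' for read errors), the guest
is paused on an I/O error and the failed request is retried after the VM is
resumed, which matches the "error stop" behaviour Fam described.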

-- 
Bin Wu