On 2017-11-16 13:53, Longpeng (Mike) wrote:
On 2017/11/15 23:54, Longpeng(Mike) wrote:
2017-11-15 23:05 GMT+08:00 Jason Wang<jasow...@redhat.com>:
On 2017-11-15 22:55, Longpeng(Mike) wrote:
Hi guys,
We got a bug report from our testers yesterday. The test scenario was
migrating a VM (Windows guest, *4 vcpus*, 4GB, vhost-user net: *7
queues*).
We found the root cause, and we'll report the bug or send a fix patch
upstream if necessary (we haven't tested upstream yet, sorry...).
Could you explain this a little bit more?
We want to know why vhost_net_start() must start the *total queues* (in
our VM there are 7 queues) rather than *the queues currently in use* (in
our VM, the guest only uses the first 4 queues because it's limited by
the number of vcpus)?
Looking forward to your help, thx:)
The code has been there for years and works well for the kernel
datapath, so you should really explain what's wrong.
OK.:)
In our scenario, the Windows virtio-net driver only uses the first 4
queues and it *only sets the desc/avail/used tables for the first 4
queues*, so in QEMU the desc/avail/used of the last 3 queues are ZERO,
but unfortunately...
'''
vhost_net_start
  for (i = 0; i < total_queues; i++)
    vhost_net_start_one
      vhost_dev_start
        vhost_virtqueue_start
'''
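To make the effect of that loop concrete, here is a minimal sketch in C.
It is not QEMU code: struct sketch_queue, its fields and
sketch_start_all() are hypothetical names made up for illustration, and
it treats each queue as a single ring. It only shows that a loop bounded
by total_queues also visits queues whose ring address the guest never
set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified queue state; not QEMU's real data structure. */
struct sketch_queue {
    uint64_t desc_gpa; /* 0 means the guest never programmed this ring */
    int started;       /* set to 1 once the queue has been started */
};

/*
 * Same shape as the loop above: every queue below total_queues is
 * started, whether or not the guest configured it.
 * Returns how many started queues still have a zero ring address.
 */
static size_t sketch_start_all(struct sketch_queue *q, size_t total_queues)
{
    size_t unconfigured = 0;

    assert(q != NULL);
    for (size_t i = 0; i < total_queues; i++) {
        q[i].started = 1;
        if (q[i].desc_gpa == 0) {
            unconfigured++;
        }
    }
    return unconfigured;
}
```

With 7 queues of which only the first 4 have a non-zero desc_gpa, the
function reports 3 started-but-unconfigured queues, which matches the
situation described here.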
vhost_virtqueue_start() calculates the HVA of the desc/avail/used
tables, so for the last 3 queues it uses ZERO as the GPA to calculate
the HVA, and then sends the results to the user-mode backend (we use
*vhost-user*) by vhost_virtqueue_set_addr().
When the EVS gets these addresses, it updates an *idx* which will be
treated as the vq's last_avail_idx when virtio-net stops (pls see
vhost_virtqueue_stop()).
So we get the following result after virtio-net stops:
the desc/avail/used of the vqs of the last 3 queues are all ZERO, but
the last_avail_idx of these vqs is NOT ZERO.
Finally, virtio_load() reports an error:
'''
if (!vdev->vq[i].vring.desc && vdev->vq[i].last_avail_idx) { // <-- will be TRUE
    error_report("VQ %d address 0x0 "
                 "inconsistent with Host index 0x%x",
                 i, vdev->vq[i].last_avail_idx);
    return -1;
}
'''
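The condition can be tried out in isolation with the small sketch
below. It is a simplified model, not the QEMU implementation: struct
sketch_vq and sketch_find_inconsistent_vq() are hypothetical names, and
only the two fields involved in the check are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vq state used by the check. */
struct sketch_vq {
    uint64_t desc;           /* descriptor table address, 0 if never set */
    uint16_t last_avail_idx; /* index taken from the backend at stop time */
};

/*
 * Applies the same condition as the snippet above to each vq.
 * Returns the index of the first vq with a zero desc address but a
 * non-zero last_avail_idx, or -1 if no vq is inconsistent.
 */
static int sketch_find_inconsistent_vq(const struct sketch_vq *vqs, size_t n)
{
    assert(vqs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!vqs[i].desc && vqs[i].last_avail_idx) {
            return (int)i;
        }
    }
    return -1;
}
```

A vq with desc == 0 and last_avail_idx == 0 passes the check, so an
unused queue is only a problem once its index has been changed.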
BTW, the problem won't appear with a Linux guest, because the Linux
virtio-net driver sets the desc/avail/used tables for all 7 queues. And
the problem won't appear if the VM uses vhost-net, because vhost-net
won't update *idx* in the SET_ADDR ioctl.
Just to make sure I understand here, I thought Windows guest + vhost_net
hit this issue?
Thanks
Sorry for my poor English. Maybe I could describe the problem in Chinese
for you in private if necessary.
Thanks
-- Regards, Longpeng(Mike)