2017-11-15 23:05 GMT+08:00 Jason Wang <jasow...@redhat.com>:
>
>
> On 2017年11月15日 22:55, Longpeng(Mike) wrote:
>>
>> Hi guys,
>>
>> We got a BUG report from our testers yesterday, the testing scenario was
>> migrating a VM (Windows guest, *4 vcpus*, 4GB, vhost-user net: *7
>> queues*).
>>
>> We found the cause reason, and we'll report the BUG or send a fix patch
>> to upstream if necessary( we haven't test the upstream yet, sorry... ).
>
>
> Could you explain this a little bit more?
>
>>
>> We want to know why the vhost_net_start() must start *total queues* ( in
>> our VM there're 7 queues ) but not *the queues that current used* ( in our
>> VM, guest only uses the first 4 queues because it's limited by the number
>> of vcpus) ?
>>
>> Looking forward to your help, thx :)
>
>
> Since the codes have been there for years and works well for kernel
> datapath. You should really explain what's wrong.
>

OK. :)

In our scenario, the Windows virtio-net driver uses only the first 4 queues,
and it *only sets the desc/avail/used tables for those 4 queues*. In QEMU the
desc/avail/used addresses of the last 3 queues are therefore ZERO, but
unfortunately...

'''
vhost_net_start
  for (i = 0; i < total_queues; i++)
    vhost_net_start_one
      vhost_dev_start
        vhost_virtqueue_start
'''

vhost_virtqueue_start() calculates the HVA of the desc/avail/used tables, so
for the last 3 queues it uses ZERO as the GPA to calculate the HVA, and then
sends the results to the user-mode backend ( we use *vhost-user* ) via
vhost_virtqueue_set_addr().

When the EVS gets these addresses, it updates an *idx*, which is treated as
the vq's last_avail_idx when virtio-net stops ( pls see
vhost_virtqueue_stop() ).

So we get the following result after virtio-net stops: the desc/avail/used
of the last 3 queues' vqs are all ZERO, but the last_avail_idx of these vqs
is NOT ZERO.

Finally, virtio_load() reports an error:
'''
    if (!vdev->vq[i].vring.desc && vdev->vq[i].last_avail_idx) { // <-- will be TRUE
        error_report("VQ %d address 0x0 "
                     "inconsistent with Host index 0x%x",
                     i, vdev->vq[i].last_avail_idx);
        return -1;
    }
'''

BTW, the problem doesn't appear with a Linux guest, because the Linux
virtio-net driver sets the desc/avail/used tables for all 7 queues. It also
doesn't appear if the VM uses vhost-net, because vhost-net doesn't update
*idx* in the SET_ADDR ioctl.

Sorry for my poor English. Maybe I could describe the problem in Chinese for
you in private if necessary.

> Thanks

--
Regards,
Longpeng
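
P.S. To make the sequence above easier to follow, here is a minimal standalone
sketch of it. This is NOT QEMU code: struct toy_vq, toy_first_inconsistent()
and toy_migrate() are hypothetical names invented for illustration, and the
value 0x1234 is an arbitrary stand-in for whatever index the backend ends up
reporting. Only the condition inside toy_first_inconsistent() is taken from
the virtio_load() check quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_QUEUES 16

/* Hypothetical, heavily simplified stand-in for per-virtqueue state. */
struct toy_vq {
    uint64_t desc_gpa;        /* GPA of the desc table; 0 if the guest never set it */
    uint16_t last_avail_idx;  /* index taken from the backend when the device stops */
};

/*
 * Same condition as the virtio_load() check quoted above.
 * Returns the first inconsistent queue, or -1 if all queues are consistent.
 */
int toy_first_inconsistent(const struct toy_vq *vqs, int total)
{
    for (int i = 0; i < total; i++) {
        if (!vqs[i].desc_gpa && vqs[i].last_avail_idx) {
            return i;
        }
    }
    return -1;
}

/*
 * Models start -> stop -> load for `total` queues of which the guest driver
 * set up only the first `used`. `backend_sets_idx` models a backend that
 * updates its idx when it receives the vring addresses (assumption: it does
 * so for every started queue, including the ones with zero addresses).
 */
int toy_migrate(int total, int used, int backend_sets_idx)
{
    struct toy_vq vqs[TOY_MAX_QUEUES] = {0};

    assert(used >= 0 && used <= total && total <= TOY_MAX_QUEUES);

    for (int i = 0; i < total; i++) {
        /* Guest driver: only the used queues get a non-zero table address. */
        if (i < used) {
            vqs[i].desc_gpa = 0x10000u * (uint64_t)(i + 1);
        }

        /* Start: every queue is started, so the backend sees every address. */
        uint16_t backend_idx = backend_sets_idx ? 0x1234 : 0;

        /* Stop: the backend's idx is stored as last_avail_idx. */
        vqs[i].last_avail_idx = backend_idx;
    }

    /* Load on the destination: run the consistency check. */
    return toy_first_inconsistent(vqs, total);
}
```

With these assumptions, toy_migrate(7, 4, 1) reports queue 4 (our Windows
guest with vhost-user), while toy_migrate(7, 7, 1) (all 7 queues set up, as
with the Linux guest) and toy_migrate(7, 4, 0) (a backend that doesn't update
idx, as with vhost-net) both report no inconsistency.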