On Mon, Aug 9, 2021 at 3:49 AM Stefano Garzarella <sgarz...@redhat.com> wrote: > > On Thu, Aug 05, 2021 at 12:00:05PM -0700, Jiang Wang . wrote: > >On Tue, Aug 3, 2021 at 11:49 PM Stefano Garzarella <sgarz...@redhat.com> > >wrote: > >> > >> On Wed, Aug 4, 2021 at 8:41 AM Stefano Garzarella <sgarz...@redhat.com> > >> wrote: > >> > > >> > On Tue, Aug 03, 2021 at 11:58:27AM -0700, Jiang Wang . wrote: > >> > >On Wed, Jul 7, 2021 at 10:27 AM Stefano Garzarella > >> > ><sgarz...@redhat.com> wrote: > >> > >> On Wed, Jul 07, 2021 at 09:52:46AM -0700, Jiang Wang . wrote: > >> > >> >On Wed, Jul 7, 2021 at 1:33 AM Stefano Garzarella > >> > >> ><sgarz...@redhat.com> wrote: > >> > >> >> On Tue, Jul 06, 2021 at 10:26:07PM +0000, Jiang Wang wrote: > >> > > >> > [...] > >> > > >> > >> >> >+ > >> > >> >> >+ if (nvqs < 0) > >> > >> >> >+ nvqs = MAX_VQS_WITHOUT_DGRAM; > >> > >> >> >+ > >> > >> >> >+ if (nvqs == MAX_VQS_WITH_DGRAM) { > >> > >> >> >+ vvc->dgram_recv_vq = virtio_add_queue(vdev, > >> > >> >> >VHOST_VSOCK_QUEUE_SIZE, > >> > >> >> >+ > >> > >> >> >vhost_vsock_common_handle_output); > >> > >> >> >+ vvc->dgram_trans_vq = virtio_add_queue(vdev, > >> > >> >> >VHOST_VSOCK_QUEUE_SIZE, > >> > >> >> >+ > >> > >> >> >vhost_vsock_common_handle_output); > >> > >> >> >+ } > >> > >> >> >+ > >> > >> >> > /* The event queue belongs to QEMU */ > >> > >> >> > vvc->event_vq = virtio_add_queue(vdev, > >> > >> >> > VHOST_VSOCK_QUEUE_SIZE, > >> > >> >> > > >> > >> >> > vhost_vsock_common_handle_output); > >> > >> >> > >> > >> >> Did you do a test with a guest that doesn't support datagram with > >> > >> >> QEMU > >> > >> >> and hosts that do? > >> > >> >> > >> > >> >Yes, and it works. > >> > >> > > >> > >> >> I repost my thoughts that I had on v2: > >> > >> >> > >> > >> >> What happen if the guest doesn't support dgram? > >> > >> >> > >> > >> >> I think we should dynamically use the 3rd queue or the 5th > >> > >> >> queue for > >> > >> >> the events at runtime after the guest acked the features. > >> > >> >> > >> > >> >> Maybe better to switch to an array of VirtQueue. > >> > >> >> > >> > >> >I think in current V3, it already dynamically use 3rd or 5th queue > >> > >> >depending > >> > >> >on the feature bit. > >> > >> > >> > >> I'm not sure. IIUC when vhost_vsock_common_realize() is called, we > >> > >> don't > >> > >> know the features acked by the guest, so how can it be dynamic? > >> > >> > >> > >> Here we should know only if the host kernel supports it. > >> > >> > >> > >> Maybe it works, because in QEMU we use the event queue only after a > >> > >> migration to send a reset event, so you can try to migrate a guest to > >> > >> check this path. > >> > >> > >> > >I tried VM migration and didn't see any problems. The migration looks > >> > >fine > >> > >and vsock dgram still works after migration. Is there any more specific > >> > >test > >> > >you want to run to check for this code path? > >> > > > >> > > >> > I meant a migration of a guest from QEMU without this patch to a QEMU > >> > with this patch. Of course in that case testing a socket stream. > >> > > >> > >> Sorry, I meant the opposite. > >> > >> You should try to migrate a guest that does not support dgrams starting > >> from a QEMU with this patch (and kernel that supports dgram, so qemu > >> uses the 5th queue for event), to a QEMU without this patch. > >> > >Got it. I tried what you said and saw errors on the destination qemu. Then > >I moved event queue up to be number 3 (before adding dgram vqs), > >as the following: > > > >+ /* The event queue belongs to QEMU */ > >+ vvc->event_vq = virtio_add_queue(vdev, VHOST_VSOCK_QUEUE_SIZE, > >+ vhost_vsock_common_handle_output); > >+ > > > > nvqs = vhost_vsock_get_max_qps(enable_dgram); > > > >@@ -253,10 +257,6 @@ void vhost_vsock_common_realize(VirtIODevice > >*vdev, const char *name, bool enabl > > > >vhost_vsock_common_handle_output); > > } > > > >- /* The event queue belongs to QEMU */ > >- vvc->event_vq = virtio_add_queue(vdev, VHOST_VSOCK_QUEUE_SIZE, > >- vhost_vsock_common_handle_output); > >- > > > >But I still see following errors on the destination qemu: > >qemu-system-x86_64: Error starting vhost vsock: 14 > > > >Any idea if my above code change is wrong or missing something? > > No, sorry. > Do you have some kernel messages in the host? >
I checked dmesg but did not find any errors. I will debug more. > > > >Take one step back, what should be the host kernel version? With or > >without dgram support? I tried both. The new dest kernel shows the above > >error. > >The old dest kernel shows a msr error probably not related to vsock. > > I think the best is to try the host kernel with DGRAM support, so QEMU > will allocate all the queues. > > > > >To explain the above qemu 14 error, I think the issue is that the > >source host kernel > >supports dgram by always setting the DGRAM feature bit(in my > >implementation). Then the source > >qemu query the source host kernel, and use 5 for event vq. Even if the source > >guest kernel does not support dgram, it currently has no way to tell the > >source > >host or the source qemu. > > Initially I think is better to try the migration on the same host, so we > can exclude other issues. When it works properly, you can try to migrate > to a different host kernel. > Got it. > > > >On the source machine, when we start qemu process, and the guest VM > >is still in BIOS or early boot process ( before vsock is initialized), I > >think > >at this time, the qemu vhost_vsock_common_realize() is already called. > >So qemu can only check if the host kernel supports dgram or not, but has > >no knowledge about the guest kernel. After the guest kernel is fully boot up, > >it can tell qemu or the host if it supports dgram or not ( or the host or > >qemu > >detect for that). But I don't think we will change the vq numbers at that > >time. > > > >Maybe we should fix that too and change vq numbers ( delete vq and add > >back?) at a later > >time after the guest kernel is fully boot-up? > > IIRC vhost-net allocates the maximum number of queues, and then it uses > only the queues supported/requested, so I think we can do something > similar. > > Allocates 5 queues and, at runtime, we can decide which queue to use. > I see. Will dig more. thanks. > Thanks, > Stefano >