On Wed, 19 Apr 2023 at 06:45, Hanna Czenczek <hre...@redhat.com> wrote:
>
> On 14.04.23 17:17, Eugenio Perez Martin wrote:
> > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek <hre...@redhat.com> wrote:
>
> [...]
>
> >> Basically, what I’m hearing is that I need to implement a different
> >> feature that has no practical impact right now, and also fix bugs around
> >> it along the way...
> >>
> > To fix this properly requires iterative device migration in qemu as
> > far as I know, instead of using VMStates [1]. This way the state is
> > requested to virtiofsd before the device reset.
> >
> > What does virtiofsd do when the state is totally sent? Does it keep
> > processing requests and generating new state or is only a one shot
> > that will suspend the daemon? If it is the second I think it still can
> > be done in one shot at the end, always indicating "no more state" at
> > save_live_pending and sending all the state at
> > save_live_complete_precopy.
>
> This sounds to me as if we should reset all devices during migration,
> and I don’t understand that. virtiofsd will not immediately process
> requests when the state is sent, because the device is still stopped,
> but when it is re-enabled (e.g. because of a failed migration), it will
> have retained its state and continue processing requests as if nothing
> happened. A reset would break this and other stateful back-ends, as I
> think Stefan has mentioned somewhere else.
>
> It seems to me as if there are devices that need a reset, and so need
> suspend+resume around it, but I also think there are back-ends that
> don’t, where this would only unnecessarily complicate the back-end
> implementation.
Existing vhost-user backends must continue working, so I think having
two code paths is (almost) unavoidable.

One approach is to add SUSPEND/RESUME to the vhost-user protocol with a
corresponding VHOST_USER_PROTOCOL_F_SUSPEND feature bit. vhost-user
frontends can then identify backends that support SUSPEND/RESUME and use
it instead of device reset. Old vhost-user backends will continue to use
device reset.

I said having two code paths is almost unavoidable. It may be possible
to rely on the existing semantics of VHOST_USER_GET_VRING_BASE (it stops
a single virtqueue) instead of SUSPEND. RESUME would be replaced by
VHOST_USER_SET_VRING_*, which gets the device going again. However, I'm
not 100% sure this will work (even for all existing devices). It would
require carefully studying both the spec and various implementations to
see if it's viable. There's a chance of losing the performance
optimization that VHOST_USER_SET_STATUS provided to DPDK if the device
is not reset.

In my opinion SUSPEND/RESUME is the cleanest way to do this.

Stefan
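
For illustration, the two code paths described above could be sketched
roughly as follows. This is only a sketch in Python for brevity, not
QEMU code. VHOST_USER_PROTOCOL_F_SUSPEND is just the name proposed in
this thread; its bit number below is made up, and StopMethod and
choose_stop_method() are hypothetical helpers.

```python
from enum import Enum

# Hypothetical bit number: the thread only proposes the feature name,
# no bit has been assigned for it in the vhost-user protocol.
VHOST_USER_PROTOCOL_F_SUSPEND = 62


class StopMethod(Enum):
    SUSPEND = "suspend"  # new path: backend keeps its state while stopped
    RESET = "reset"      # legacy path: existing backends keep working


def choose_stop_method(protocol_features: int) -> StopMethod:
    """Pick how the frontend stops the device, given the backend's
    negotiated protocol feature bitmask."""
    if protocol_features & (1 << VHOST_USER_PROTOCOL_F_SUSPEND):
        return StopMethod.SUSPEND
    return StopMethod.RESET
```

A backend that never advertises the bit is handled exactly as today,
which is the point of gating the new messages on a protocol feature.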