On Tue, Feb 20, 2024 at 12:26:49PM +0100, Maxime Coquelin wrote:
> 
> 
> On 2/13/24 11:05, Michael S. Tsirkin wrote:
> > On Fri, Jan 26, 2024 at 06:07:37PM +0800, Hao Chen wrote:
> > > I run "dpdk-vdpa" and "qemu-L2" inside "qemu-L1".
> > > 
> > > In a nested virtualization environment, "qemu-L2"'s vhost-user socket
> > > sends a "VHOST_USER_IOTLB_MSG" message to "dpdk-vdpa" and blocks,
> > > waiting for "dpdk-vdpa" to process the message.
> > > If "dpdk-vdpa" has not yet finished processing "VHOST_USER_IOTLB_MSG"
> > > and, from another thread, sends a message that itself needs a reply,
> > > such as "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", then "dpdk-vdpa"
> > > also blocks, waiting for "qemu-L2" to process that message. However,
> > > "qemu-L2"'s vhost-user socket is still blocked waiting for
> > > "dpdk-vdpa"'s reply to "VHOST_USER_IOTLB_MSG", so
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" is never processed.
> > > Both "dpdk-vdpa" and "qemu-L2" thus end up blocked in a
> > > vhost read, resulting in a deadlock.
> > > 
> > > This issue can be fixed by making either
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" or "VHOST_USER_IOTLB_MSG"
> > > not require a reply. DPDK has too many messages similar to
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", so I would prefer the latter.
> > > 
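For reference, the deadlock described above boils down to both peers
doing a blocking send-then-recv at the same time. A minimal sketch of
the pattern (hypothetical function names, not the actual QEMU/DPDK
code):

    #include <stddef.h>
    #include <sys/socket.h>

    /* Frontend thread (qemu-L2): sends an IOTLB update that needs a
     * reply, then blocks reading the reply. */
    void frontend_iotlb_update(int fd, const void *msg, size_t len)
    {
        char reply[64];
        send(fd, msg, len, 0);             /* VHOST_USER_IOTLB_MSG */
        recv(fd, reply, sizeof(reply), 0); /* blocks forever, see below */
    }

    /* Backend thread (dpdk-vdpa), running concurrently: sends a
     * host-notifier request that also needs a reply. */
    void backend_host_notifier(int fd, const void *msg, size_t len)
    {
        char reply[64];
        send(fd, msg, len, 0);             /* ..._VRING_HOST_NOTIFIER_MSG */
        recv(fd, reply, sizeof(reply), 0); /* blocks forever */
    }

    /* Neither side ever reads the other's request, so neither reply
     * is ever sent: both recv() calls block indefinitely. */
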
> > > Fixes: 24e34754eb78 ("vhost-user: factor out msg head and payload")
> > > 
> > > Signed-off-by: Hao Chen <ch...@yusur.tech>
> > 
> > I would be very worried that the IOTLB becomes stale and
> > guest memory gets corrupted if we just proceed without waiting.
> > 
> > Maxime what do you think? How would you address the issue?
> 
> I agree with you, this is not possible.
> For example, in the case of an IOTLB invalidate, the frontend relies on
> the backend's reply to ensure it is no longer accessing the memory
> before proceeding.
> 
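To make that ordering requirement concrete, a minimal sketch (the helper
names are hypothetical, not the actual QEMU API):

    #include <stdint.h>

    /* Hypothetical helpers standing in for the real message plumbing. */
    void send_iotlb_invalidate(int fd, uint64_t iova, uint64_t size);
    int  wait_for_reply_ack(int fd);
    void release_guest_memory(uint64_t iova, uint64_t size);

    /* The frontend must not release (or reuse) the memory behind a
     * translation until the backend has acknowledged the invalidate,
     * i.e. confirmed it is no longer accessing that memory. */
    void frontend_invalidate(int fd, uint64_t iova, uint64_t size)
    {
        send_iotlb_invalidate(fd, iova, size); /* sent with NEED_REPLY */
        if (wait_for_reply_ack(fd) == 0) {
            release_guest_memory(iova, size);  /* only now safe */
        }
    }

Dropping the reply here would let the frontend release memory that the
backend may still be reading or writing.
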
> The reply-ack for the VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG request
> is less important: if it fails, host notifications won't work, but there
> is no risk of corruption. Maybe on the QEMU side we could fail init if
> processing the request fails, since I think that if the feature is
> negotiated, we can expect it to succeed.
> 
> What do you think about this proposal?
> 
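Something along these lines, I guess (a rough sketch with hypothetical
names, not actual QEMU code):

    struct vhost_dev;                                   /* QEMU's device struct */
    int send_host_notifier_request(struct vhost_dev *dev); /* hypothetical;
                                                              sent with NEED_REPLY */

    /* If the protocol features needed for host-notifier requests were
     * negotiated, a failure to process the request is unexpected;
     * treat it as fatal during init instead of silently continuing
     * without host notifications. */
    static int setup_host_notifier(struct vhost_dev *dev)
    {
        int ret = send_host_notifier_request(dev);
        if (ret < 0) {
            /* feature was negotiated, so this should not fail: abort
             * device initialization rather than degrade silently */
            return ret;
        }
        return 0;
    }
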
> Regards,
> Maxime

Fundamentally, I think that if qemu blocks the guest while waiting for a
reply, that is OK, but it really has to keep processing incoming messages
in the meantime. The same should apply to the backend, I think ...
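
Roughly, instead of a bare blocking read for the reply, something like
this (a sketch with hypothetical names; in the real protocol the
backend-to-frontend requests arrive on a separate channel, but the idea
is the same):

    #include <stdint.h>

    struct msg {
        uint32_t request;  /* message type */
        uint32_t seq;      /* hypothetical: real vhost-user matches
                              replies by request type, not sequence */
        /* ... payload ... */
    };

    int  read_msg(int fd, struct msg *m);                 /* hypothetical */
    int  is_reply_to(const struct msg *m, uint32_t seq);  /* hypothetical */
    void handle_backend_request(int fd, struct msg *m);   /* hypothetical */

    /* Wait for the reply to request 'seq', but dispatch any other
     * incoming message (e.g. a host-notifier request) instead of
     * leaving it unread in the socket buffer. */
    int wait_for_reply(int fd, uint32_t seq, struct msg *reply)
    {
        struct msg m;
        while (read_msg(fd, &m) == 0) {
            if (is_reply_to(&m, seq)) {
                *reply = m;
                return 0;                    /* got the awaited reply */
            }
            handle_backend_request(fd, &m);  /* may send its own reply-ack */
        }
        return -1;                           /* socket error or EOF */
    }

With that, VHOST_USER_IOTLB_MSG could keep its NEED_REPLY semantics and
the deadlock would still go away.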


> > 
> > 
> > > ---
> > >   hw/virtio/vhost-user.c | 10 ++--------
> > >   1 file changed, 2 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > index f214df804b..02caa94b6c 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -2371,20 +2371,14 @@ static int vhost_user_net_set_mtu(struct vhost_dev *dev, uint16_t mtu)
> > >   static int vhost_user_send_device_iotlb_msg(struct vhost_dev *dev,
> > >                                               struct vhost_iotlb_msg *imsg)
> > >   {
> > > -    int ret;
> > >       VhostUserMsg msg = {
> > >           .hdr.request = VHOST_USER_IOTLB_MSG,
> > >           .hdr.size = sizeof(msg.payload.iotlb),
> > > -        .hdr.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY_MASK,
> > > +        .hdr.flags = VHOST_USER_VERSION,
> > >           .payload.iotlb = *imsg,
> > >       };
> > > -    ret = vhost_user_write(dev, &msg, NULL, 0);
> > > -    if (ret < 0) {
> > > -        return ret;
> > > -    }
> > > -
> > > -    return process_message_reply(dev, &msg);
> > > +    return vhost_user_write(dev, &msg, NULL, 0);
> > >   }
> > > -- 
> > > 2.27.0
> > 

