Hi Matan,

> -----Original Message-----
> From: Matan Azrad <ma...@mellanox.com>
> Sent: Friday, June 12, 2020 7:06 PM
> To: Xia, Chenbo <chenbo....@intel.com>; Maxime Coquelin <maxime.coque...@redhat.com>
> Cc: dev@dpdk.org; Wang, Xiao W <xiao.w.w...@intel.com>; Tiwei Bie <tiwei....@intel.com>; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] vhost: fix host notifier configuration error flow
>
> Hi Xia
> Thanks for the fast review.
> Please see inline.
>
> From: Xia, Chenbo:
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Matan Azrad
> > > Sent: Thursday, June 11, 2020 11:48 PM
> > > To: Maxime Coquelin <maxime.coque...@redhat.com>
> > > Cc: dev@dpdk.org; Wang, Xiao W <xiao.w.w...@intel.com>; Tiwei Bie <tiwei....@intel.com>; sta...@dpdk.org
> > > Subject: [dpdk-dev] [PATCH] vhost: fix host notifier configuration error flow
> > >
> > > A vDPA driver can configure its device FD to be notified directly by
> > > the guest memory mapping using the `rte_vhost_host_notifier_ctrl` API.
> > >
> > > The driver request is managed by the DPDK vhost management layer and is
> > > forwarded to QEMU; the vhost message includes a reply request in
> > > order to make sure that the memory mapping was done correctly by QEMU.
> > >
> > > When QEMU finishes the configuration, it marks that its reply is
> > > valid in the slave FD using the VHOST_USER_REPLY_MASK flag.
> > > The flag is set only on success and when the slave FD includes the
> > > reply data.
> > >
> > > The vhost library didn't validate the above flag before accessing
> > > the slave FD, which led the thread to be blocked on the recvmsg call
> > > forever in case QEMU had a problem in the notifier configuration.
> > >
> > > Handle the VHOST_USER_REPLY_MASK flag to validate that the slave FD
> > > includes reply data.
> > >
> > > Fixes: d90cf7d111ac ("vhost: support host notifier")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Matan Azrad <ma...@mellanox.com>
> > > ---
> > >  lib/librte_vhost/vhost_user.c | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> > > index 84bebad..aa19d15 100644
> > > --- a/lib/librte_vhost/vhost_user.c
> > > +++ b/lib/librte_vhost/vhost_user.c
> > > @@ -2833,8 +2833,14 @@ static int process_slave_message_reply(struct virtio_net *dev,
> > >  	struct VhostUserMsg msg_reply;
> > >  	int ret;
> > >
> > > -	if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
> > > -		return 0;
> > > +	if (!(msg->flags & VHOST_USER_REPLY_MASK)) {
> > > +		if (msg->flags & VHOST_USER_NEED_REPLY) {
> > > +			ret = -1;
> > > +			goto out;
> > > +		} else {
> > > +			return 0;
> > > +		}
> > > +	}
> >
> > Based on your commit log, I think you want to check the reply message
> > sent from QEMU, but msg is the request sent from vhost-user.
>
> Yes, it looks like I missed something here.
>
> > Also, could you clarify the problem based on that? Because I see in
> > QEMU v5.0.0 that if the vhost request has VHOST_USER_NEED_REPLY_MASK,
> > QEMU will set the reply mask with VHOST_USER_REPLY_MASK and without
> > VHOST_USER_NEED_REPLY_MASK no matter whether the handling succeeds.
> > Am I missing something? Please correct me.
>
> I got stuck in this function while receiving a message back from QEMU (4.2.0).
> It looks like QEMU printed "Failed to read from slave."
> In this case it looks like QEMU doesn't reply, and therefore the DPDK
> thread was stuck.
> Any idea?
I see in the QEMU code that the error is caused by a recv size mismatch.
Could you check the send/recv on the slave channel? I checked in the DPDK
code that normally all the slave messages should have a size larger than
VHOST_USER_HDR_SIZE. There is one possibility that DPDK closes its socket
for some reason so that recvmsg returns 0. But I am not sure. It's better
to check the send/recv of the slave channel :)

BRs,
Chenbo

> >
> > Thanks!
> > Chenbo
> >
> > >
> > >  	ret = read_vhost_message(dev->slave_req_fd, &msg_reply);
> > >  	if (ret <= 0) {
> > > --
> > > 1.8.3.1