> -----Original Message----- > From: Maxime Coquelin <maxime.coque...@redhat.com> > Sent: Wednesday, April 14, 2021 6:09 PM > To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org > Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan > <yinan.w...@intel.com>; Pai G, Sunil <sunil.pa...@intel.com>; Jiang, Cheng1 > <cheng1.ji...@intel.com> > Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register > > > > On 4/14/21 3:40 AM, Hu, Jiayu wrote: > > Hi Maxime, > > > >> -----Original Message----- > >> From: Maxime Coquelin <maxime.coque...@redhat.com> > >> Sent: Tuesday, April 13, 2021 7:33 PM > >> To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org > >> Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan > >> <yinan.w...@intel.com>; Pai G, Sunil <sunil.pa...@intel.com>; Jiang, > Cheng1 > >> <cheng1.ji...@intel.com> > >> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register > >> > >> > >> > >> On 4/2/21 3:04 PM, Jiayu Hu wrote: > >>> Users can register async copy device in vring_state_changed(), > >>> when vhost queue is enabled. However, a deadlock occurs inside > >>> rte_vhost_async_channel_register(), if > >> VHOST_USER_F_PROTOCOL_FEATURES > >>> is not supported, as vhost_user_msg_handler() takes vq->access_lock > >>> before calling vhost_user_set_vring_kick(). > >>> > >>> This patch avoids async register deadlock by removing calling > >>> vring_state_changed() in vhost_user_set_vring_kick(). It's safe > >>> as vhost_user_msg_handler() will call vring_state_changed() anyway. > >>> > >>> Signed-off-by: Jiayu Hu <jiayu...@intel.com> > >>> --- > >>> lib/librte_vhost/vhost_user.c | 3 --- > >>> 1 file changed, 3 deletions(-) > >>> > >>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c > >>> index 44c0452..8f0eba6 100644 > >>> --- a/lib/librte_vhost/vhost_user.c > >>> +++ b/lib/librte_vhost/vhost_user.c > >>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct virtio_net > >> **pdev, struct VhostUserMsg *msg, > >>> */ > >>> if (!(dev->features & (1ULL << > >> VHOST_USER_F_PROTOCOL_FEATURES))) { > >>> vq->enabled = true; > >>> - if (dev->notify_ops->vring_state_changed) > >>> - dev->notify_ops->vring_state_changed( > >>> - dev->vid, file.index, 1); > >>> } > >>> > >>> if (vq->ready) { > >>> > >> > >> As replied earlier on v1, I agree the call to vring_state_changed here > >> is not needed. But it might not be enough, there are other cases where > >> you could have issues. > > > > vhost_user_notify_queue_state() can be called in three cases: > > 1. when vq ready status changes, vhost_user_msg_handler() calls it to > notify > > backend. But vhost_user_msg_handler() doesn't take lock before calling it. > > So in this case, no deadlock occurs in async register. > > > > 2. if vq->ready is true, vhost_user_set_vring_call() calls it to notify > > backend > > vq is not enabled. Although vhost_user_set_vring_call() is protected by > > lock, > > async register is called only if vq is enabled, so async register will not > > be > called > > in this case. > > > > 3. If vq->ready is true, vhost_user_set_vring_kick() calls it to notify > > backend > > vq is not enabled. Same as #2, async register is called only when vq is > enabled. > > Even if vhost_user_set_vring_kick() is protected by lock, there is no > deadlock in > > async register, as it will not be called in this case. > > > > In summary, I think there is no deadlock issue in async register if we > > can remove calling vring_state_change() in vhost_user_set_vring_kick(). > > > But unregister one could be called in theory no? Otherwise it would look > unbalanced. At least on disabled notification, the app should make sure > the DMA transfers to and from the vring are stopped before it returns > from the callabck. Otherwise it could lead to undefined behavior.
Right, users need to call unregister, but we cannot remove calling vhost_user_notify_queue_state() in case #2 and #3, IMHO. So to avoid deadlock, we recommended users to call async unregister in destroy_device(), instead of on vring disabled notification. Does it make sense to you? > > >> > >> Please add stable and Fixes tag. > > > > Do you suggest to make the patch as a fix for 8639d54563a > > ("vhost: introduce async enqueue registration API")? But the > > thing is that code removed in this patch is not introduced > > by this commit. > > The commit you need to point to is the one introducing the > .vring_state_changed() call. So this patch is still a fix for deadlock on async register? Or it is a fix for unnecessary .vring_state_changed() call? Thanks, Jiayu > > > Thanks, > > Jiayu > >> > >> Thanks, > >> Maxime > >