From: Maxime Coquelin
> Hi,
>
> On 6/21/20 8:20 AM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin:
> >> Hi Matan,
> >>
> >> On 6/19/20 3:11 PM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> Thanks for the fast review.
> >>> This is the first version, let's review it carefully to be sure it is
> >>> correct.
> >>> @Xiao Wang, it will be good to hear your idea too.
> >>> We also need to understand the effect on the IFC driver/device...
> >>> Just to update that I checked this code with the mlx5 adjustments I
> >>> sent in this series.
> >>> It works well with the vDPA example application.
> >>
> >> OK.
> >>
> >>> From: Maxime Coquelin:
> >>>> On 6/18/20 6:28 PM, Matan Azrad wrote:
> >>>>> Some guest drivers may not configure disabled virtio queues.
> >>>>>
> >>>>> In this case, the vhost management never triggers the vDPA device
> >>>>> configuration because it waits for the device to be ready.
> >>>>
> >>>> This is not vDPA-only: even with the SW datapath, the application's
> >>>> new_device callback never gets called.
> >>>>
> >>> Yes, I wrote it below; I can be more specific here too in the next
> >>> version.
> >>>
> >>>>> The current ready state means that all the virtio queues should be
> >>>>> configured regardless of the enablement status.
> >>>>>
> >>>>> In order to support this case, this patch changes the ready state:
> >>>>> the device is ready when at least 1 queue pair is configured and
> >>>>> enabled.
> >>>>>
> >>>>> So, now, the vDPA driver will be configured when the first queue
> >>>>> pair is configured and enabled.
> >>>>>
> >>>>> Also, the queue state operation is changed to the following rules:
> >>>>> 1. queue becomes ready (enabled and fully configured) -
> >>>>>    set_vring_state(enabled).
> >>>>> 2. queue becomes not ready - set_vring_state(disabled).
> >>>>> 3. queue stays ready and the VHOST_USER_SET_VRING_ENABLE message
> >>>>>    was handled - set_vring_state(enabled).
> >>>>>
> >>>>> The parallel operations for the application are adjusted too.
> >>>>>
> >>>>> Signed-off-by: Matan Azrad <ma...@mellanox.com>
> >>>>> ---
> >>>>>  lib/librte_vhost/vhost_user.c | 51 ++++++++++++++++++++++++++++---------------
> >>>>>  1 file changed, 33 insertions(+), 18 deletions(-)
> >>>>>
> >>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >>>>> index b0849b9..cfd5f27 100644
> >>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>> @@ -1295,7 +1295,7 @@
> >>>>>  {
> >>>>>  	bool rings_ok;
> >>>>>
> >>>>> -	if (!vq)
> >>>>> +	if (!vq || !vq->enabled)
> >>>>>  		return false;
> >>>>>
> >>>>>  	if (vq_is_packed(dev))
> >>>>> @@ -1309,24 +1309,27 @@
> >>>>>  		vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
> >>>>>  }
> >>>>>
> >>>>> +#define VIRTIO_DEV_NUM_VQS_TO_BE_READY 2u
> >>>>> +
> >>>>>  static int
> >>>>>  virtio_is_ready(struct virtio_net *dev)
> >>>>>  {
> >>>>>  	struct vhost_virtqueue *vq;
> >>>>>  	uint32_t i;
> >>>>>
> >>>>> -	if (dev->nr_vring == 0)
> >>>>> +	if (dev->nr_vring < VIRTIO_DEV_NUM_VQS_TO_BE_READY)
> >>>>>  		return 0;
> >>>>>
> >>>>> -	for (i = 0; i < dev->nr_vring; i++) {
> >>>>> +	for (i = 0; i < VIRTIO_DEV_NUM_VQS_TO_BE_READY; i++) {
> >>>>>  		vq = dev->virtqueue[i];
> >>>>>
> >>>>>  		if (!vq_is_ready(dev, vq))
> >>>>>  			return 0;
> >>>>>  	}
> >>>>>
> >>>>> -	VHOST_LOG_CONFIG(INFO,
> >>>>> -		"virtio is now ready for processing.\n");
> >>>>> +	if (!(dev->flags & VIRTIO_DEV_READY))
> >>>>> +		VHOST_LOG_CONFIG(INFO,
> >>>>> +			"virtio is now ready for processing.\n");
> >>>>>  	return 1;
> >>>>>  }
> >>>>>
> >>>>> @@ -1970,8 +1973,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
> >>>>>  	struct virtio_net *dev = *pdev;
> >>>>>  	int enable = (int)msg->payload.state.num;
> >>>>>  	int index = (int)msg->payload.state.index;
> >>>>> -	struct rte_vdpa_device *vdpa_dev;
> >>>>> -	int did = -1;
> >>>>>
> >>>>>  	if (validate_msg_fds(msg, 0) != 0)
> >>>>>  		return RTE_VHOST_MSG_RESULT_ERR;
> >>>>> @@ -1980,15 +1981,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
> >>>>>  		"set queue enable: %d to qp idx: %d\n",
> >>>>>  		enable, index);
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> -	if (vdpa_dev && vdpa_dev->ops->set_vring_state)
> >>>>> -		vdpa_dev->ops->set_vring_state(dev->vid, index, enable);
> >>>>> -
> >>>>> -	if (dev->notify_ops->vring_state_changed)
> >>>>> -		dev->notify_ops->vring_state_changed(dev->vid,
> >>>>> -				index, enable);
> >>>>> -
> >>>>>  	/* On disable, rings have to be stopped being processed. */
> >>>>>  	if (!enable && dev->dequeue_zero_copy)
> >>>>>  		drain_zmbuf_list(dev->virtqueue[index]);
> >>>>> @@ -2622,11 +2614,13 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  	struct virtio_net *dev;
> >>>>>  	struct VhostUserMsg msg;
> >>>>>  	struct rte_vdpa_device *vdpa_dev;
> >>>>> +	bool ready[VHOST_MAX_VRING];
> >>>>>  	int did = -1;
> >>>>>  	int ret;
> >>>>>  	int unlock_required = 0;
> >>>>>  	bool handled;
> >>>>>  	int request;
> >>>>> +	uint32_t i;
> >>>>>
> >>>>>  	dev = get_device(vid);
> >>>>>  	if (dev == NULL)
> >>>>> @@ -2668,6 +2662,10 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		VHOST_LOG_CONFIG(DEBUG, "External request %d\n", request);
> >>>>>  	}
> >>>>>
> >>>>> +	/* Save ready status for all the VQs before message handle. */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++)
> >>>>> +		ready[i] = vq_is_ready(dev, dev->virtqueue[i]);
> >>>>> +
> >>>>
> >>>> This big array can be avoided if you save the ready status in the
> >>>> virtqueue once the message has been handled.
> >>>
> >>> You mean you prefer to save it in the virtqueue structure? Isn't it
> >>> the same memory?
> >>> In any case I don't think 0x100 is so big 😊
> >>
> >> I mean in the stack.
> >
> > Do you think that 256B is too much for the stack?
> >
> >> And one advantage of saving it in the vq structure shows up, for
> >> example, with memory hotplug. The vq is in ready state at the beginning
> >> and at the end, but during the handling the ring host virtual addresses
> >> get changed because of the munmap/mmap, and we need to notify the
> >> driver, otherwise it will miss it.
> >
> > Do you mean a VHOST_USER_SET_MEM_TABLE call after the first
> > configuration?
> >
> > I don't understand what the issue is with saving it in the stack here....
>
> The issue is that if you only check the ready state before and after the
> message affecting the ring is handled, it can be ready at both stages,
> while the rings have changed and the state change callback should have
> been called.
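For illustration, here is a minimal sketch of the suggestion above: remember
the last notified state in the virtqueue itself and notify only on
transitions. The "ready" field in struct vhost_virtqueue and the helper name
are hypothetical, they are not part of the patch under review:

	/*
	 * Sketch only: assumes a new hypothetical "bool ready" field is
	 * added to struct vhost_virtqueue to remember the last state that
	 * was notified to the vDPA driver and the application.
	 */
	static void
	vhost_user_notify_vring_state(struct virtio_net *dev, uint32_t index)
	{
		struct vhost_virtqueue *vq = dev->virtqueue[index];
		bool cur_ready = vq_is_ready(dev, vq);
		struct rte_vdpa_device *vdpa_dev;

		/* Notify only on a readiness transition. */
		if (cur_ready == vq->ready)
			return;

		vq->ready = cur_ready;

		vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
		if (vdpa_dev && vdpa_dev->ops->set_vring_state)
			vdpa_dev->ops->set_vring_state(dev->vid, index,
					(int)cur_ready);

		if (dev->notify_ops->vring_state_changed)
			dev->notify_ops->vring_state_changed(dev->vid, index,
					(int)cur_ready);
	}

With such a helper, a handler that invalidates the rings (e.g. on memory
hotplug) can clear vq->ready before remapping, so the disable/enable pair of
notifications is not lost even when the queue is ready both before and after
the message.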
But in this version I checked twice, before the message handler and after
the message handler, so it should catch any update.
In any case, as I said, I will move the ready memory to the virtqueue
structure in order to save the check before the message handler.

> Please check the example patch I sent on Friday, it takes care of
> invalidating the ring state and calling the state change callback.
>
> > But one advantage of saving it in the virtqueue structure is that the
> > message handler would not have to check the ready state before each
> > message.
> >
> > I will change it in the next version.
> >
> >>>
> >>>>>  	ret = vhost_user_check_and_alloc_queue_pair(dev, &msg);
> >>>>>  	if (ret < 0) {
> >>>>>  		VHOST_LOG_CONFIG(ERR,
> >>>>> @@ -2802,6 +2800,25 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		return -1;
> >>>>>  	}
> >>>>>
> >>>>> +	did = dev->vdpa_dev_id;
> >>>>> +	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> +	/* Update ready status. */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++) {
> >>>>> +		bool cur_ready = vq_is_ready(dev, dev->virtqueue[i]);
> >>>>> +
> >>>>> +		if ((cur_ready && request == VHOST_USER_SET_VRING_ENABLE &&
> >>>>> +				i == msg.payload.state.index) ||
> >>>>
> >>>> Couldn't we remove the above condition? Aren't the callbacks already
> >>>> called in the set_vring_enable handler?
> >>>
> >>> As we agreed in the design discussion:
> >>>
> >>> "3. Same handling of the requests, except that we won't notify the
> >>> vdpa driver and the application of vring state changes in the
> >>> VHOST_USER_SET_VRING_ENABLE handler."
> >>>
> >>> So, I removed it from the set_vring_enable handler.
> >>
> >> My bad, the patch context where it is removed made me think it was in
> >> vhost_user_set_vring_err(), so I missed it.
> >>
> >> Thinking about it again since last time we discussed it, we have to send
> >> the notification from the handler in the case
> >>
> >>> Now, the ready state doesn't depend only on the
> >>> VHOST_USER_SET_VRING_ENABLE message.
> >>>
> >>>>> +				cur_ready != ready[i]) {
> >>>>> +			if (vdpa_dev && vdpa_dev->ops->set_vring_state)
> >>>>> +				vdpa_dev->ops->set_vring_state(dev->vid, i,
> >>>>> +						(int)cur_ready);
> >>>>> +
> >>>>> +			if (dev->notify_ops->vring_state_changed)
> >>>>> +				dev->notify_ops->vring_state_changed(dev->vid,
> >>>>> +						i, (int)cur_ready);
> >>>>> +		}
> >>>>> +	}
> >>>>
> >>>> I think we should move this into a dedicated function, which we
> >>>> would call in every message handler that can modify the ready state.
> >>>>
> >>>> Doing so, we would not have to assume the master sent us a disable
> >>>> request for the queue before, and we would also have proper
> >>>> synchronization if the request uses the reply-ack feature, as the
> >>>> master could assume the backend is no longer processing the ring once
> >>>> the reply-ack is received.
> >>>
> >>> Makes sense to do it before the reply-ack and to create a dedicated
> >>> function for it.
> >>>
> >>> Shouldn't the vDPA conf be called before the reply-ack too, to be
> >>> sure the queues are ready before the reply?
> >>
> >> I don't think so, because the backend can start processing the ring
> >> after. What we don't want is that the backend continues to process the
> >> rings when the guest asked to stop doing it.
> >
> > But "doing configuration after reply" may cause a guest to kick a queue
> > while the app \ vDPA driver is being configured.
> > It may lead to some order dependencies in configuration....
>
> I get your point, we can try to move the configuration before the reply.
>
> But looking at the qemu source code, neither SET_VRING_KICK nor
> SET_VRING_CALL nor SET_VRING_ENABLE requests reply-ack, so it won't
> have any effect.
>
> > In addition, now, the device ready state becomes on only at the same
> > time that a queue becomes on, so we can do the device ready check (for
> > new_device \ dev_conf calls) only when a queue becomes ready in the
> > same function.
>
> If you want, we can try that too.
>
> >>> If so, we should also move the device ready code below (maybe also
> >>> the vdpa conf) to this function too.
> >>
> >> So I don't think it is needed.
> >>
> >>> But maybe calling it directly from this function, and not from the
> >>> specific message handlers, is better, something like the
> >>> vhost_user_check_and_alloc_queue_pair function style.
> >>>
> >>> What do you think?
> >
> > Any answer here?
>
> To move the .new_device and .dev_conf callbacks into the same function
> that sends the vring change notifications? Yes, we can do that I think.
>
> >>>
> >>>>>  	if (!(dev->flags & VIRTIO_DEV_RUNNING) && virtio_is_ready(dev)) {
> >>>>>  		dev->flags |= VIRTIO_DEV_READY;
> >>>>>
> >>>>> @@ -2816,8 +2833,6 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		}
> >>>>>  	}
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>>  	if (vdpa_dev && virtio_is_ready(dev) &&
> >>>>>  			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >>>>>  			msg.request.master == VHOST_USER_SET_VRING_CALL) {
> >>>>
> >>>> Shouldn't the check on SET_VRING_CALL above be removed?
> >>>
> >>> Isn't it a workaround for something?
> >>>
> >>
> >> Normally, we should not need it anymore, as the state change
> >> notification will be sent if the callfd comes to change.
> >
> > Ok, will remove it.
> >
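Putting the agreements of this thread together, the dedicated function
discussed above (send the vring notifications, then trigger .new_device and
the vDPA dev_conf before the reply-ack) could look roughly like the sketch
below. It builds on the hypothetical per-vq "ready" flag and helper from the
earlier sketch; the function name and placement are illustrative, not the
submitted patch:

	/*
	 * Sketch only: called at the end of every message handler that can
	 * change ring or device readiness, before the reply-ack is sent.
	 */
	static void
	vhost_user_update_ready_state(struct virtio_net *dev)
	{
		struct rte_vdpa_device *vdpa_dev;
		uint32_t i;

		/* Rules 1/2: notify on every per-ring readiness transition. */
		for (i = 0; i < dev->nr_vring; i++)
			vhost_user_notify_vring_state(dev, i);

		if (!virtio_is_ready(dev))
			return;

		/* Device readiness: first queue pair configured and enabled. */
		if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
			dev->flags |= VIRTIO_DEV_READY;
			if (dev->notify_ops->new_device(dev->vid) == 0)
				dev->flags |= VIRTIO_DEV_RUNNING;
		}

		vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
		if (vdpa_dev && !(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
				vdpa_dev->ops->dev_conf) {
			if (vdpa_dev->ops->dev_conf(dev->vid) == 0)
				dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
		}
	}

Rule 3 from the commit log (re-notifying when VHOST_USER_SET_VRING_ENABLE is
handled while the queue stays ready) is not covered by a transition-only
helper, so it would still need the request check from the patch above.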
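And once the SET_VRING_CALL workaround is dropped as agreed at the end of
the thread, the vDPA configuration trigger would presumably reduce to
something like this (illustrative diff only, not a submitted change):

	 	if (vdpa_dev && virtio_is_ready(dev) &&
	-			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
	-			msg.request.master == VHOST_USER_SET_VRING_CALL) {
	+			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {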