From: Maxime Coquelin
> Hi,
>
> On 6/21/20 8:20 AM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin:
> >> Hi Matan,
> >>
> >> On 6/19/20 3:11 PM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> Thanks for the fast review.
> >>> This is the first version, let's review it carefully to be sure it is
> >>> correct.
> >>> @Xiao Wang, it will be good to hear your idea too.
> >>> We also need to understand the effect on the IFC driver/device...
> >>> Just to update that I checked this code with the mlx5 adjustments I
> >>> sent in this series.
> >>> It works well with the vDPA example application.
> >>
> >> OK.
> >>
> >>> From: Maxime Coquelin:
> >>>> On 6/18/20 6:28 PM, Matan Azrad wrote:
> >>>>> Some guest drivers may not configure disabled virtio queues.
> >>>>>
> >>>>> In this case, the vhost management never triggers the vDPA device
> >>>>> configuration because it waits for the device to be ready.
> >>>>
> >>>> This is not vDPA-only: even with the SW datapath, the application's
> >>>> new_device callback never gets called.
> >>>>
> >>> Yes, I wrote it below; I can be more specific here too in the next
> >>> version.
> >>>
> >>>>> The current ready state means that all the virtio queues should be
> >>>>> configured regardless of the enablement status.
> >>>>>
> >>>>> In order to support this case, this patch changes the ready state:
> >>>>> the device is ready when at least 1 queue pair is configured and
> >>>>> enabled.
> >>>>>
> >>>>> So, now, the vDPA driver will be configured when the first queue
> >>>>> pair is configured and enabled.
> >>>>>
> >>>>> Also, the queue state operation is changed to the following rules:
> >>>>> 1. queue becomes ready (enabled and fully configured) -
> >>>>>    set_vring_state(enabled).
> >>>>> 2. queue becomes not ready - set_vring_state(disabled).
> >>>>> 3. queue stays ready and the VHOST_USER_SET_VRING_ENABLE message
> >>>>>    was handled - set_vring_state(enabled).
> >>>>>
> >>>>> The parallel operations for the application are adjusted too.
> >>>>>
> >>>>> Signed-off-by: Matan Azrad <ma...@mellanox.com>
> >>>>> ---
> >>>>>  lib/librte_vhost/vhost_user.c | 51 ++++++++++++++++++++++++++++---------------
> >>>>>  1 file changed, 33 insertions(+), 18 deletions(-)
> >>>>>
> >>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >>>>> index b0849b9..cfd5f27 100644
> >>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>> @@ -1295,7 +1295,7 @@
> >>>>>  {
> >>>>>  	bool rings_ok;
> >>>>>
> >>>>> -	if (!vq)
> >>>>> +	if (!vq || !vq->enabled)
> >>>>>  		return false;
> >>>>>
> >>>>>  	if (vq_is_packed(dev))
> >>>>> @@ -1309,24 +1309,27 @@
> >>>>>  		vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
> >>>>>  }
> >>>>>
> >>>>> +#define VIRTIO_DEV_NUM_VQS_TO_BE_READY 2u
> >>>>> +
> >>>>>  static int
> >>>>>  virtio_is_ready(struct virtio_net *dev)
> >>>>>  {
> >>>>>  	struct vhost_virtqueue *vq;
> >>>>>  	uint32_t i;
> >>>>>
> >>>>> -	if (dev->nr_vring == 0)
> >>>>> +	if (dev->nr_vring < VIRTIO_DEV_NUM_VQS_TO_BE_READY)
> >>>>>  		return 0;
> >>>>>
> >>>>> -	for (i = 0; i < dev->nr_vring; i++) {
> >>>>> +	for (i = 0; i < VIRTIO_DEV_NUM_VQS_TO_BE_READY; i++) {
> >>>>>  		vq = dev->virtqueue[i];
> >>>>>
> >>>>>  		if (!vq_is_ready(dev, vq))
> >>>>>  			return 0;
> >>>>>  	}
> >>>>>
> >>>>> -	VHOST_LOG_CONFIG(INFO,
> >>>>> -		"virtio is now ready for processing.\n");
> >>>>> +	if (!(dev->flags & VIRTIO_DEV_READY))
> >>>>> +		VHOST_LOG_CONFIG(INFO,
> >>>>> +			"virtio is now ready for processing.\n");
> >>>>>  	return 1;
> >>>>>  }
> >>>>>
> >>>>> @@ -1970,8 +1973,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
> >>>>>  	struct virtio_net *dev = *pdev;
> >>>>>  	int enable = (int)msg->payload.state.num;
> >>>>>  	int index = (int)msg->payload.state.index;
> >>>>> -	struct rte_vdpa_device *vdpa_dev;
> >>>>> -	int did = -1;
> >>>>>
> >>>>>  	if (validate_msg_fds(msg, 0) != 0)
> >>>>>  		return RTE_VHOST_MSG_RESULT_ERR;
> >>>>> @@ -1980,15 +1981,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
> >>>>>  		"set queue enable: %d to qp idx: %d\n",
> >>>>>  		enable, index);
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> -	if (vdpa_dev && vdpa_dev->ops->set_vring_state)
> >>>>> -		vdpa_dev->ops->set_vring_state(dev->vid, index, enable);
> >>>>> -
> >>>>> -	if (dev->notify_ops->vring_state_changed)
> >>>>> -		dev->notify_ops->vring_state_changed(dev->vid,
> >>>>> -				index, enable);
> >>>>> -
> >>>>>  	/* On disable, rings have to be stopped being processed. */
> >>>>>  	if (!enable && dev->dequeue_zero_copy)
> >>>>>  		drain_zmbuf_list(dev->virtqueue[index]);
> >>>>> @@ -2622,11 +2614,13 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  	struct virtio_net *dev;
> >>>>>  	struct VhostUserMsg msg;
> >>>>>  	struct rte_vdpa_device *vdpa_dev;
> >>>>> +	bool ready[VHOST_MAX_VRING];
> >>>>>  	int did = -1;
> >>>>>  	int ret;
> >>>>>  	int unlock_required = 0;
> >>>>>  	bool handled;
> >>>>>  	int request;
> >>>>> +	uint32_t i;
> >>>>>
> >>>>>  	dev = get_device(vid);
> >>>>>  	if (dev == NULL)
> >>>>> @@ -2668,6 +2662,10 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		VHOST_LOG_CONFIG(DEBUG, "External request %d\n", request);
> >>>>>  	}
> >>>>>
> >>>>> +	/* Save ready status for all the VQs before message handle. */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++)
> >>>>> +		ready[i] = vq_is_ready(dev, dev->virtqueue[i]);
> >>>>> +
> >>>>
> >>>> This big array can be avoided if you save the ready status in the
> >>>> virtqueue once the message has been handled.
> >>>
> >>> You mean you prefer to save it in the virtqueue structure? Isn't it
> >>> the same memory?
> >>> In any case I don't think 0x100 is so big 😊
> >>
> >> I mean in the stack.
> >
> > Do you think that 256B is too much for the stack?
> >
> >> And one advantage of saving it in the vq structure shows up, for
> >> example, with memory hotplug. The vq is in ready state at the beginning
> >> and at the end, but during the handling the ring host virtual addresses
> >> get changed because of the munmap/mmap, and we need to notify the
> >> driver, otherwise it will miss it.
> >
> > Do you mean a VHOST_USER_SET_MEM_TABLE call after the first
> > configuration?
> >
> > I don't understand what the issue is with saving it in the stack here....
>
> The issue is that if you only check the ready state before and after the
> message affecting the ring is handled, it can be ready at both stages,
> while the rings have changed and the state change callback should have
> been called.
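For illustration, here is a minimal sketch of the suggestion above: remember
the last notified state in the virtqueue itself and notify only on
transitions. The "ready" field in struct vhost_virtqueue and the helper name
are hypothetical, they are not part of the patch under review:

	/*
	 * Sketch only: assumes a new hypothetical "bool ready" field is
	 * added to struct vhost_virtqueue to remember the last state that
	 * was notified to the vDPA driver and the application.
	 */
	static void
	vhost_user_notify_vring_state(struct virtio_net *dev, uint32_t index)
	{
		struct vhost_virtqueue *vq = dev->virtqueue[index];
		bool cur_ready = vq_is_ready(dev, vq);
		struct rte_vdpa_device *vdpa_dev;

		/* Notify only on a readiness transition. */
		if (cur_ready == vq->ready)
			return;

		vq->ready = cur_ready;

		vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
		if (vdpa_dev && vdpa_dev->ops->set_vring_state)
			vdpa_dev->ops->set_vring_state(dev->vid, index,
					(int)cur_ready);

		if (dev->notify_ops->vring_state_changed)
			dev->notify_ops->vring_state_changed(dev->vid, index,
					(int)cur_ready);
	}

With such a helper, a handler that invalidates the rings (e.g. on memory
hotplug) can clear vq->ready before remapping, so the disable/enable pair of
notifications is not lost even when the queue is ready both before and after
the message.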
But in this version I checked twice, before the message handler and after
the message handler, so it should catch any update.
In any case, as I said, I will move the ready memory to the virtqueue
structure in order to save the check before the message handler.

> Please check the example patch I sent on Friday, it takes care of
> invalidating the ring state and calling the state change callback.
>
> > But one advantage of saving it in the virtqueue structure is that the
> > message handler would not have to check the ready state before each
> > message.
> >
> > I will change it in the next version.
> >
> >>>
> >>>>>  	ret = vhost_user_check_and_alloc_queue_pair(dev, &msg);
> >>>>>  	if (ret < 0) {
> >>>>>  		VHOST_LOG_CONFIG(ERR,
> >>>>> @@ -2802,6 +2800,25 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		return -1;
> >>>>>  	}
> >>>>>
> >>>>> +	did = dev->vdpa_dev_id;
> >>>>> +	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> +	/* Update ready status. */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++) {
> >>>>> +		bool cur_ready = vq_is_ready(dev, dev->virtqueue[i]);
> >>>>> +
> >>>>> +		if ((cur_ready && request == VHOST_USER_SET_VRING_ENABLE &&
> >>>>> +				i == msg.payload.state.index) ||
> >>>>
> >>>> Couldn't we remove the above condition? Aren't the callbacks already
> >>>> called in the set_vring_enable handler?
> >>>
> >>> As we agreed in the design discussion:
> >>>
> >>> "3. Same handling of the requests, except that we won't notify the
> >>> vdpa driver and the application of vring state changes in the
> >>> VHOST_USER_SET_VRING_ENABLE handler."
> >>>
> >>> So, I removed it from the set_vring_enable handler.
> >>
> >> My bad, the patch context where it is removed made me think it was in
> >> vhost_user_set_vring_err(), so I missed it.
> >>
> >> Thinking about it again since last time we discussed it, we have to send
> >> the notification from the handler in the case
> >>
> >>> Now, the ready state doesn't depend only on the
> >>> VHOST_USER_SET_VRING_ENABLE message.
> >>>
> >>>>> +				cur_ready != ready[i]) {
> >>>>> +			if (vdpa_dev && vdpa_dev->ops->set_vring_state)
> >>>>> +				vdpa_dev->ops->set_vring_state(dev->vid, i,
> >>>>> +						(int)cur_ready);
> >>>>> +
> >>>>> +			if (dev->notify_ops->vring_state_changed)
> >>>>> +				dev->notify_ops->vring_state_changed(dev->vid,
> >>>>> +						i, (int)cur_ready);
> >>>>> +		}
> >>>>> +	}
> >>>>
> >>>> I think we should move this into a dedicated function, which we
> >>>> would call in every message handler that can modify the ready state.
> >>>>
> >>>> Doing so, we would not have to assume the master sent us a disable
> >>>> request for the queue before, and we would also have proper
> >>>> synchronization if the request uses the reply-ack feature, as the
> >>>> master could assume the backend is no longer processing the ring once
> >>>> the reply-ack is received.
> >>>
> >>> Makes sense to do it before the reply-ack and to create a dedicated
> >>> function for it.
> >>>
> >>> Shouldn't the vDPA conf be called before the reply-ack too, to be
> >>> sure the queues are ready before the reply?
> >>
> >> I don't think so, because the backend can start processing the ring
> >> after. What we don't want is that the backend continues to process the
> >> rings when the guest asked to stop doing it.
> >
> > But "doing configuration after reply" may cause a guest to kick a queue
> > while the app \ vDPA driver is being configured.
> > It may lead to some order dependencies in configuration....
>
> I get your point, we can try to move the configuration before the reply.
>
> But looking at the qemu source code, neither SET_VRING_KICK nor
> SET_VRING_CALL nor SET_VRING_ENABLE requests reply-ack, so it won't
> have any effect.
>
> > In addition, now, the device ready state becomes on only at the same
> > time that a queue becomes on, so we can do the device ready check (for
> > new_device \ dev_conf calls) only when a queue becomes ready in the
> > same function.
>
> If you want, we can try that too.
>
> >>> If so, we should also move the device ready code below (maybe also
> >>> the vdpa conf) to this function too.
> >>
> >> So I don't think it is needed.
> >>
> >>> But maybe calling it directly from this function, and not from the
> >>> specific message handlers, is better, something like the
> >>> vhost_user_check_and_alloc_queue_pair function style.
> >>>
> >>> What do you think?
> >
> > Any answer here?
>
> To move the .new_device and .dev_conf callbacks into the same function
> that sends the vring change notifications? Yes, we can do that I think.
>
> >>>
> >>>>>  	if (!(dev->flags & VIRTIO_DEV_RUNNING) && virtio_is_ready(dev)) {
> >>>>>  		dev->flags |= VIRTIO_DEV_READY;
> >>>>>
> >>>>> @@ -2816,8 +2833,6 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  		}
> >>>>>  	}
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>>  	if (vdpa_dev && virtio_is_ready(dev) &&
> >>>>>  			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >>>>>  			msg.request.master == VHOST_USER_SET_VRING_CALL) {
> >>>>
> >>>> Shouldn't the check on SET_VRING_CALL above be removed?
> >>>
> >>> Isn't it a workaround for something?
> >>>
> >>
> >> Normally, we should not need it anymore, as the state change
> >> notification will be sent if the callfd comes to change.
> >
> > Ok, will remove it.
> >
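Putting the agreements of this thread together, the dedicated function
discussed above (send the vring notifications, then trigger .new_device and
the vDPA dev_conf before the reply-ack) could look roughly like the sketch
below. It builds on the hypothetical per-vq "ready" flag and helper from the
earlier sketch; the function name and placement are illustrative, not the
submitted patch:

	/*
	 * Sketch only: called at the end of every message handler that can
	 * change ring or device readiness, before the reply-ack is sent.
	 */
	static void
	vhost_user_update_ready_state(struct virtio_net *dev)
	{
		struct rte_vdpa_device *vdpa_dev;
		uint32_t i;

		/* Rules 1/2: notify on every per-ring readiness transition. */
		for (i = 0; i < dev->nr_vring; i++)
			vhost_user_notify_vring_state(dev, i);

		if (!virtio_is_ready(dev))
			return;

		/* Device readiness: first queue pair configured and enabled. */
		if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
			dev->flags |= VIRTIO_DEV_READY;
			if (dev->notify_ops->new_device(dev->vid) == 0)
				dev->flags |= VIRTIO_DEV_RUNNING;
		}

		vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
		if (vdpa_dev && !(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
				vdpa_dev->ops->dev_conf) {
			if (vdpa_dev->ops->dev_conf(dev->vid) == 0)
				dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
		}
	}

Rule 3 from the commit log (re-notifying when VHOST_USER_SET_VRING_ENABLE is
handled while the queue stays ready) is not covered by a transition-only
helper, so it would still need the request check from the patch above.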
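And once the SET_VRING_CALL workaround is dropped as agreed at the end of
the thread, the vDPA configuration trigger would presumably reduce to
something like this (illustrative diff only, not a submitted change):

	 	if (vdpa_dev && virtio_is_ready(dev) &&
	-			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
	-			msg.request.master == VHOST_USER_SET_VRING_CALL) {
	+			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {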