On 5/13/2021 3:38 PM, Kevin Traynor wrote: > On 13/05/2021 15:11, David Marchand wrote: >> On Thu, May 13, 2021 at 2:38 PM Chenbo Xia <chenbo....@intel.com> wrote: >>> >>> This patch fixes an issue of application crash because of vhost iotlb >>> not initialized when virtio has multiqueue enabled. >>> >>> iotlb messages will be sent when some queues are not enabled. If we >>> initialize iotlb in vhost_user_set_vring_num, it could happen that >>> iotlb update comes when iotlb pool of disabled queues are not >>> initialized. >> >> This makes the problem I reproduced disappear at init, but I noticed >> the segfault after restarting testpmd once. >> And a little bit after this, my vm crashed. >> >> This is not systematic, so I guess there is some condition with how >> the virtio device is initialised in the vm. >> > > Ok, no point in Red Hat QA testing RC3 yet, if it is still faulty. > > fyi - if you want to fix with a new patch it will likely delay Red Hat > QA testing RC3 (maybe others?) and probably they will only have cycles > for one RC3 test run. > > If you choose to revert, we can ask Red Hat QA to test RC3 without > further delay. Please let us know when you consider the options. >
If the patch is not good to go as it is I suggest reverting it, as far as I know Chenbo will be off for Friday & Monday, so it doesn't leave much time to update/test a new version. >> >> One question below. >> >> >> Bugzilla ID: 703 >> >>> Fixes: 968bbc7e2e50 ("vhost: avoid IOTLB mempool allocation while IOMMU >>> disabled") >>> >> >> Reported-by: Pei Zhang <pezh...@redhat.com> >> >>> Signed-off-by: Chenbo Xia <chenbo....@intel.com> >>> --- >>> lib/vhost/vhost_user.c | 13 +++++++++---- >>> 1 file changed, 9 insertions(+), 4 deletions(-) >>> >>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c >>> index 611ff209e3..ae4df8eb69 100644 >>> --- a/lib/vhost/vhost_user.c >>> +++ b/lib/vhost/vhost_user.c >>> @@ -311,6 +311,7 @@ vhost_user_set_features(struct virtio_net **pdev, >>> struct VhostUserMsg *msg, >>> uint64_t features = msg->payload.u64; >>> uint64_t vhost_features = 0; >>> struct rte_vdpa_device *vdpa_dev; >>> + uint32_t i; >>> >>> if (validate_msg_fds(msg, 0) != 0) >>> return RTE_VHOST_MSG_RESULT_ERR; >>> @@ -389,6 +390,14 @@ vhost_user_set_features(struct virtio_net **pdev, >>> struct VhostUserMsg *msg, >>> vdpa_dev->ops->set_features(dev->vid); >>> >>> dev->flags &= ~VIRTIO_DEV_FEATURES_FAILED; >>> + >>> + if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) { >>> + for (i = 0; i < dev->nr_vring; i++) { >> >> I don't know the vhost-user protocol. >> At this point of the device init/life, are we sure nr_vring is set to >> the max number of vring? >> The logs I have tend to say it is the case, but is there a guarantee >> in the protocol? >> >> >> Another way to fix would be to allocate on the first >> VHOST_USER_IOTLB_MSG message received for a vring. >> >> >>> + if (vhost_user_iotlb_init(dev, i)) >>> + return RTE_VHOST_MSG_RESULT_ERR; >>> + } >>> + } >>> + >>> return RTE_VHOST_MSG_RESULT_OK; >>> } >>> >> >> >