On Tue, Aug 31, 2021 at 4:02 AM Xia, Chenbo <chenbo....@intel.com> wrote:
>
> Hi Eugenio,
>
> > -----Original Message-----
> > From: Eugenio Perez Martin <epere...@redhat.com>
> > Sent: Tuesday, August 31, 2021 2:10 AM
> > To: Xia, Chenbo <chenbo....@intel.com>
> > Cc: Maxime Coquelin <maxime.coque...@redhat.com>; dev@dpdk.org; Pei Zhang
> > <pezh...@redhat.com>; Jason Wang <jasow...@redhat.com>
> > Subject: Re: [PATCH] vhost: Clean iotlb cache on vring stop
> >
> > On Mon, Aug 30, 2021 at 1:58 PM Xia, Chenbo <chenbo....@intel.com> wrote:
> > >
> > > Hi Eugenio,
> > >
> >
> > Hi Chenbo,
> >
> > > > -----Original Message-----
> > > > From: Eugenio Pérez <epere...@redhat.com>
> > > > Sent: Saturday, August 28, 2021 12:13 AM
> > > > To: Maxime Coquelin <maxime.coque...@redhat.com>; Xia, Chenbo
> > > > <chenbo....@intel.com>
> > > > Cc: dev@dpdk.org; Pei Zhang <pezh...@redhat.com>; Jason Wang
> > > > <jasow...@redhat.com>
> > > > Subject: [PATCH] vhost: Clean iotlb cache on vring stop
> > >
> > > Clean -> clean
> > >
> >
> > Is that something I need to send a new revision for, or is it OK to fix
> > it on the maintainer side when applying?
> >
> > >
> > > > Old IOVA cache entries are left behind when the virtio driver changes
> > > > inside the VM. If all of these old entries have IOVAs lower than the
> > > > new entries, the vhost code has to iterate through the whole cache to
> > > > find the new ones. If only a single new entry is needed for the new
> > > > translations, this situation persists indefinitely.
> > > >
> > > > This has been observed in the transition from the virtio-net kernel
> > > > driver to testpmd's vfio-pci driver, which reduced performance from
> > > > more than 10 Mpps to less than 0.07 Mpps when the hugepage address was
> > > > higher than the networking buffers. Since all new buffers are contained
> > > > in this new gigantic page, vhost needs to scan IOTLB_CACHE_SIZE - 1
> > > > entries for each translation in the worst case.
> > >
> > > I'm curious why QEMU does not invalidate the iotlb when the virtio-net
> > > driver is removed (the DMA region should be unmapped).
> > >
> >
> > I'm going to investigate this further, but the QEMU iommu notifier
> > callback (vhost_iommu_unmap_notify) is never called at any point during
> > the test. Also, the guest kernel code calls dma_unmap_page for each
> > buffer and for the vqs, but that never generates an iotlb flush.
> >
> > Or do you mean that QEMU should also flush all iotlb entries on vhost
> > device stop?
>
> As you said, the driver unmapped all the entries, so theoretically the
> iotlb entries should all be invalidated; the iotlb invalidation should be
> triggered by the iommu entry unmapping. Does that make sense to you?
>

It totally makes sense. I just wanted to make sure at which level you meant
the unmap should happen: if you meant QEMU, then we need more code. If you
meant the virtio-net/virtio-ring kernel driver, then there is a bug or a
misunderstanding about what is happening.

I think the iotlb unmapping is not happening because the entries are still
valid, just no longer needed once the guest's testpmd takes over with vfio.
If that were not true, we would see far more errors with virtio-net + IOMMU
because of invalid translations:

* virtio-net adds a buffer to the virtio ring. The virtio-ring DMA map does
  not generate an iotlb map: QEMU waits for each device to request the
  missing pages.
* The device misses the translation: it requests it and saves it in its
  list for future use.
* virtio-ring is done with the buffer and unmaps it. After a while, the
  kernel needs to dma_map that page again. As far as I can see, this new
  mapping does not generate a page miss.
* Since this new mapping is already in the device's iotlb cache, the device
  will not issue a page miss. That means the page must be mapped to the
  same GPA as before: if not, the network would fail.

With this assumption, it is useless to iotlb-unmap the entries: the device
will simply have to request them again if they are cleared from its cache.

But all of this is my guess and I could be wrong :).
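To make the cost side concrete, here is a minimal sketch of the lookup
pattern the commit message describes. It is a hypothetical simplification,
not the actual lib/vhost/iotlb.c code (the real cache is a TAILQ with
locking and permission checks, and these names are illustrative only); the
one property that matters is the walk over a list kept sorted by iova:

/*
 * Minimal model of a sorted-list IOTLB cache, showing why stale
 * low-IOVA entries are costly.  Hypothetical and simplified.
 */
#include <stdint.h>
#include <stddef.h>

struct iotlb_entry {
	uint64_t iova;  /* guest I/O virtual address */
	uint64_t uaddr; /* host user-space address */
	uint64_t size;
	struct iotlb_entry *next; /* singly linked, sorted by iova */
};

/*
 * The walk starts at the head, so every stale entry with a lower iova
 * than the address being translated is skipped first.  With the cache
 * full of old low-IOVA entries and all new buffers in one high-IOVA
 * hugepage, each translation scans almost the entire cache: the
 * IOTLB_CACHE_SIZE - 1 worst case from the commit message.
 */
static uint64_t
iotlb_cache_find(const struct iotlb_entry *head, uint64_t iova)
{
	const struct iotlb_entry *node;

	for (node = head; node != NULL; node = node->next) {
		if (iova < node->iova)
			break; /* list is sorted: no later entry can match */
		if (iova < node->iova + node->size)
			return node->uaddr + (iova - node->iova);
	}

	return 0; /* miss: vhost must request the translation */
}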
> >
> > > And since the perf drop is huge, why not cc stable and add a Fixes tag?
> > >
> >
> > I was not sure whether it was worth backporting, but I would say the
> > issue can be reproduced with enough bad luck. Since the translations
> > have always been saved in a linked list:
> >
> > Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
> >
> > Same question as before: if no changes to the code are needed for the
> > patch, do I need to send a second revision?
>
> It sounds like a bug to me, so no matter how often it can be reproduced,
> it is worth backporting. I suggest sending a v2 with the title fixed and
> the Fixes and Cc: stable tags added, so that it shows up on the stable
> mailing list and saves Maxime some effort when applying.
>

Sure, I will do it. Thanks!

> Thanks,
> Chenbo
>
> >
> > Thanks!
> >
> > >
> > > Thanks,
> > > Chenbo
> > >
> > > >
> > > > Signed-off-by: Eugenio Pérez <epere...@redhat.com>
> > > > Reported-by: Pei Zhang <pezh...@redhat.com>
> > > > ---
> > > >  lib/vhost/vhost_user.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> > > > index 29a4c9af60..7de48f5333 100644
> > > > --- a/lib/vhost/vhost_user.c
> > > > +++ b/lib/vhost/vhost_user.c
> > > > @@ -2113,6 +2113,8 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
> > > >  	msg->size = sizeof(msg->payload.state);
> > > >  	msg->fd_num = 0;
> > > >
> > > > +	vhost_user_iotlb_flush_all(vq);
> > > > +
> > > >  	vring_invalidate(dev, vq);
> > > >
> > > >  	return RTE_VHOST_MSG_RESULT_REPLY;
> > > > --
> > > > 2.27.0
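Continuing the simplified model from the earlier sketch (it reuses that
struct iotlb_entry), here is roughly what the added
vhost_user_iotlb_flush_all() call achieves conceptually. Again hypothetical:
the real function works on the per-virtqueue TAILQ under its lock, not on a
bare pointer like this:

#include <stdlib.h>

/*
 * Drop every cached translation when the vring is stopped.  The old
 * driver's mappings may be stale after GET_VRING_BASE; flushing forces
 * fresh iotlb misses, so the cache repopulates with only the entries
 * the new driver actually uses, instead of keeping old low-IOVA
 * entries that every later lookup would have to walk past.
 */
static void
iotlb_cache_flush_all(struct iotlb_entry **head)
{
	struct iotlb_entry *node, *next;

	for (node = *head; node != NULL; node = next) {
		next = node->next;
		free(node);
	}
	*head = NULL;
}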