On Thu, Jan 25, 2018 at 04:06:53PM +0100, Maxime Coquelin wrote:
> In the unlikely case the IOTLB memory pool runs out of memory,
> an issue may happen if all entries are used by the IOTLB cache,
> and an IOTLB miss happens. If the IOTLB pending list is empty,
> then no memory is freed and the allocation fails a second time.
>
> This patch fixes this by doing an IOTLB cache random evict if
> the IOTLB pending list is empty, ensuring the second allocation
> try will succeed.
>
> In the same spirit, the opposite is done when inserting an
> IOTLB entry in the IOTLB cache fails due to out of memory. In
> this case, the IOTLB pending list is flushed if the IOTLB cache
> is empty to ensure the new entry can be inserted.
>
> Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
> Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")
> [...]
> @@ -95,9 +98,11 @@ vhost_user_iotlb_pending_insert(struct vhost_virtqueue *vq,
>
> 	ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
> 	if (ret) {
> -		RTE_LOG(INFO, VHOST_CONFIG,
> -				"IOTLB pool empty, clear pending misses\n");
> -		vhost_user_iotlb_pending_remove_all(vq);
> +		RTE_LOG(INFO, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
> +		if (!TAILQ_EMPTY(&vq->iotlb_pending_list))
> +			vhost_user_iotlb_pending_remove_all(vq);
> +		else
> +			vhost_user_iotlb_cache_random_evict(vq);
> 		ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
> 		if (ret) {
> 			RTE_LOG(ERR, VHOST_CONFIG, "IOTLB pool still empty, failure\n");
> @@ -186,8 +191,11 @@ vhost_user_iotlb_cache_insert(struct vhost_virtqueue *vq, uint64_t iova,
>
> 	ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
> 	if (ret) {
> -		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, evict one entry\n");
> -		vhost_user_iotlb_cache_random_evict(vq);
> +		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");

Maybe we should use the same log level for both cases? Currently,
the first one is INFO, and this one is DEBUG.

> +		if (!TAILQ_EMPTY(&vq->iotlb_list))
> +			vhost_user_iotlb_cache_random_evict(vq);
> +		else
> +			vhost_user_iotlb_pending_remove_all(vq);

The IOTLB entry will be invalidated explicitly if it's unmapped in
the frontend, so all the entries in the IOTLB cache are supposed to
be valid. For that reason, I think it's better to always prefer
freeing memory from the IOTLB pending list. Something like:

	if (TAILQ_EMPTY(&vq->iotlb_list))
		vhost_user_iotlb_pending_remove_all(vq);
	else if (TAILQ_EMPTY(&vq->iotlb_pending_list))
		vhost_user_iotlb_cache_random_evict(vq);
	else
		vhost_user_iotlb_pending_random_evict(vq);

Besides, in __vhost_iova_to_vva():

uint64_t
__vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
		    uint64_t iova, uint64_t size, uint8_t perm)
{
	uint64_t vva, tmp_size;

	if (unlikely(!size))
		return 0;

	tmp_size = size;

	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
	if (tmp_size == size)
		return vva;

	if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
		/*
		 * iotlb_lock is read-locked for a full burst,
		 * but it only protects the iotlb cache.
		 * In case of IOTLB miss, we might block on the socket,
		 * which could cause a deadlock with QEMU if an IOTLB update
		 * is being handled. We can safely unlock here to avoid it.
		 */
		vhost_user_iotlb_rd_unlock(vq);

		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
		vhost_user_iotlb_miss(dev, iova + tmp_size, perm);

vhost_user_iotlb_miss() may fail. If it fails, the inserted pending
entry should be removed from the pending list.
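Something like this, perhaps (just a rough, untested sketch; it
assumes the existing vhost_user_iotlb_pending_remove() helper from
iotlb.c is suitable for dropping that single entry):

		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
		if (vhost_user_iotlb_miss(dev, iova + tmp_size, perm)) {
			/* The miss request could not be sent, so no IOTLB
			 * update will ever answer it. Drop the pending
			 * entry we have just inserted to avoid leaking it.
			 */
			RTE_LOG(ERR, VHOST_CONFIG,
				"IOTLB miss req failed for IOVA 0x%" PRIx64 "\n",
				iova + tmp_size);
			vhost_user_iotlb_pending_remove(vq, iova + tmp_size,
					1, perm);
		}

Otherwise, a failed miss request leaves a stale entry in the pending
list forever.

Best regards,
Tiwei Bie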