-----Original Message-----
From: Nipun Gupta <nipun.gu...@amd.com>
Sent: Friday, December 30, 2022 5:59 PM
To: dev@dpdk.org; tho...@monjalon.net; Burakov, Anatoly
<anatoly.bura...@intel.com>; ferruh.yi...@amd.com
Cc: nikhil.agar...@amd.com; Nipun Gupta <nipun.gu...@amd.com>
Subject: [PATCH] vfio: do not coalesce DMA mappings
At cleanup time, when the DMA unmap is done, the Linux kernel does not
allow unmapping individual segments which were coalesced together while
creating the DMA map for type1 IOMMU mappings. So, this change updates
the mapping of the memory segments (hugepages) to be done on a
per-page basis.
Signed-off-by: Nipun Gupta <nipun.gu...@amd.com>
---
When device hotplug is used, multiple pages get coalesced and a single
mapping gets created for these pages (using the APIs
rte_memseg_contig_walk() and type1_map_contig()). At cleanup time, when
the memory is released, VFIO does not clean up that mapping, and the
following error is observed in the EAL for 2MB hugepages:

EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152
This is because VFIO does not clear the DMA mapping (refer to the API
vfio_dma_do_unmap() -
https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1330),
where it looks up the tracked DMA mapping for the IOVA to free:
https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1418.
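
As an illustration of the failure mode (a sketch only, not part of the
patch; container_fd, iova and len are assumed to come from the usual
VFIO container setup), VFIO_IOMMU_UNMAP_DMA reports in its size field
how many bytes it actually unmapped; when the requested range only
covers part of a coalesced mapping, fewer bytes than requested come
back (0 in the EAL log above), which is what the EAL then reports as
an error:

#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Sketch only: container_fd is an open VFIO container with the type1
 * IOMMU enabled; iova/len describe one 2MB page that was originally
 * mapped as part of a larger, coalesced mapping.
 */
static int
unmap_one_page(int container_fd, uint64_t iova, uint64_t len)
{
	struct vfio_iommu_type1_dma_unmap dma_unmap = {
		.argsz = sizeof(dma_unmap),
		.iova = iova,
		.size = len,
	};

	if (ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap) != 0)
		return -1;

	/* the kernel reports how many bytes it actually unmapped; a
	 * partial range of a coalesced mapping is not torn down, so
	 * this comes back smaller than requested (0 in the EAL log).
	 */
	if (dma_unmap.size != len) {
		fprintf(stderr, "cleared %llu bytes instead of %llu\n",
				(unsigned long long)dma_unmap.size,
				(unsigned long long)len);
		return -1;
	}
	return 0;
}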
Thus, this change creates the mappings individually instead of
coalescing them.
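
For reference, a rough usage-level sketch of the per-segment pattern
(not the patch code itself; it uses the public
rte_vfio_container_dma_map() API and hypothetical map_seg() /
map_all_segments() helpers rather than the internal type1 callbacks):
each memseg gets its own DMA mapping, so the same (vaddr, iova, len)
triple can later be unmapped on its own.

#include <rte_memory.h>
#include <rte_vfio.h>

/* Sketch: create one DMA mapping per internal memseg (per hugepage),
 * so each page can later be released individually with
 * rte_vfio_container_dma_unmap() using the same vaddr/iova/len.
 */
static int
map_seg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
		void *arg)
{
	int *container_fd = arg;

	/* skip external memory and segments without a valid IOVA */
	if (msl->external || ms->iova == RTE_BAD_IOVA)
		return 0;

	return rte_vfio_container_dma_map(*container_fd, ms->addr_64,
			ms->iova, ms->len);
}

static int
map_all_segments(int container_fd)
{
	/* walk every memseg instead of coalescing contiguous ones */
	return rte_memseg_walk(map_seg, &container_fd);
}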
lib/eal/linux/eal_vfio.c | 29 -----------------------------
1 file changed, 29 deletions(-)
diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
index 549b86ae1d..56edccb0db 100644
--- a/lib/eal/linux/eal_vfio.c
+++ b/lib/eal/linux/eal_vfio.c
@@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base,
 	return 1;
 }
 
-static int
-type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
-		size_t len, void *arg)
-{
-	int *vfio_container_fd = arg;
-
-	if (msl->external)
-		return 0;
-
-	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
-			len, 1);
-}
-
 static int
 type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 		void *arg)
@@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	if (ms->iova == RTE_BAD_IOVA)
 		return 0;
 
-	/* if IOVA mode is VA, we've already mapped the internal segments */
-	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
-		return 0;
-
 	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
 			ms->len, 1);
 }
@@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
-		/* with IOVA as VA mode, we can get away with mapping contiguous
-		 * chunks rather than going page-by-page.
-		 */
-		int ret = rte_memseg_contig_walk(type1_map_contig,
-				&vfio_container_fd);
-		if (ret)
-			return ret;
-		/* we have to continue the walk because we've skipped the
-		 * external segments during the config walk.
-		 */
-	}
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
--
2.25.1