On 21/04/17 01:15, Jonas Pfefferle1 wrote:
> Alexey Kardashevskiy <a...@ozlabs.ru> wrote on 20/04/2017 16:22:01:
>
>> From: Alexey Kardashevskiy <a...@ozlabs.ru>
>> To: Jonas Pfefferle1 <j...@zurich.ibm.com>
>> Cc: dev@dpdk.org, Gowrishankar Muthukrishnan
>> <gowrishanka...@in.ibm.com>, Adrian Schuepbach <d...@zurich.ibm.com>
>> Date: 20/04/2017 16:22
>> Subject: Re: [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus
>> addresses for DMA map
>>
>> On 20/04/17 23:25, Alexey Kardashevskiy wrote:
>> > On 20/04/17 19:04, Jonas Pfefferle1 wrote:
>> >> Alexey Kardashevskiy <a...@ozlabs.ru> wrote on 20/04/2017 09:24:02:
>> >>
>> >>> From: Alexey Kardashevskiy <a...@ozlabs.ru>
>> >>> To: dev@dpdk.org
>> >>> Cc: Alexey Kardashevskiy <a...@ozlabs.ru>, j...@zurich.ibm.com,
>> >>> Gowrishankar Muthukrishnan <gowrishanka...@in.ibm.com>
>> >>> Date: 20/04/2017 09:24
>> >>> Subject: [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus
>> >>> addresses for DMA map
>> >>>
>> >>> VFIO_IOMMU_SPAPR_TCE_CREATE ioctl() returns the actual bus address for
>> >>> just created DMA window. It happens to start from zero because the default
>> >>> window is removed (leaving no windows) and new window starts from zero.
>> >>> However this is not guaranteed and the new window may start from another
>> >>> address, this adds an error check.
>> >>>
>> >>> Another issue is that IOVA passed to VFIO_IOMMU_MAP_DMA should be a PCI
>> >>> bus address while in this case a physical address of a user page is used.
>> >>> This changes IOVA to start from zero in a hope that the rest of DPDK
>> >>> expects this.
>> >>
>> >> This is not the case. DPDK expects a 1:1 mapping PA==IOVA. It will use the
>> >> phys_addr of the memory segment it got from /proc/self/pagemap cf.
>> >> librte_eal/linuxapp/eal/eal_memory.c.
>> >> We could try setting it here to the
>> >> actual iova which basically makes the whole virtual to physical mapping
>> >> with pagemap unnecessary which I believe should be the case for VFIO
>> >> anyway. Pagemap should only be needed when using pci_uio.
>> >
>> >
>> > Ah, ok, makes sense now. But it sure needs a big fat comment there as it is
>> > not obvious why host RAM address is used there as DMA window start is not
>> > guaranteed.
>>
>> Well, either way there is some bug - ms[i].phys_addr and ms[i].addr_64 both
>> have exact same value, in my setup it is 3fffb33c0000 which is a userspace
>> address - at least ms[i].phys_addr must be physical address.
>
> This might be the case if you are not using hugetlbfs i.e. passing
> "--no-huge" cf. eal_memory.c:980
>
> /* hugetlbfs can be disabled */
> if (internal_config.no_hugetlbfs) {
> 	addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
> 			MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
> 	if (addr == MAP_FAILED) {
> 		RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
> 				strerror(errno));
> 		return -1;
> 	}
> 	mcfg->memseg[0].phys_addr = (phys_addr_t)(uintptr_t)addr;
> 	mcfg->memseg[0].addr = addr;
> 	mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
> 	mcfg->memseg[0].len = internal_config.memory;
> 	mcfg->memseg[0].socket_id = 0;
> 	return 0;
> }
>
> If it fails to get the virt2phys mapping it actually assigns iovas starting
> from 0 to the memory segments, cf. set_physaddrs eal_memory.c:263

Right, this is the case here.

>
>>
>> >
>> >
>> >>
>> >>>
>> >>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>> >>> ---
>> >>>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 12 ++++++++++--
>> >>>  1 file changed, 10 insertions(+), 2 deletions(-)
>> >>>
>> >>> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> index 46f951f4d..8b8e75c4f 100644
>> >>> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>> >>> @@ -658,7 +658,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>  {
>> >>>  	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
>> >>>  	int i, ret;
>> >>> -
>> >>> +	phys_addr_t io_offset;
>> >>>  	struct vfio_iommu_spapr_register_memory reg = {
>> >>>  		.argsz = sizeof(reg),
>> >>>  		.flags = 0
>> >>> @@ -702,6 +702,13 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>  		return -1;
>> >>>  	}
>> >>>
>> >>> +	io_offset = create.start_addr;
>> >>> +	if (io_offset) {
>> >>> +		RTE_LOG(ERR, EAL, " DMA offsets other than zero is not supported, "
>> >>> +			"new window is created at %lx\n", io_offset);
>> >>> +		return -1;
>> >>> +	}
>> >>> +
>> >>>  	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
>> >>>  	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>> >>>  		struct vfio_iommu_type1_dma_map dma_map;
>> >>> @@ -723,7 +730,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>  		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
>> >>>  		dma_map.vaddr = ms[i].addr_64;
>> >>>  		dma_map.size = ms[i].len;
>> >>> -		dma_map.iova = ms[i].phys_addr;
>> >>> +		dma_map.iova = io_offset;
>> >>>  		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
>> >>>  				VFIO_DMA_MAP_FLAG_WRITE;
>> >>>
>> >>> @@ -735,6 +742,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
>> >>>  			return -1;
>> >>>  		}
>> >>>
>> >>> +		io_offset += dma_map.size;
>> >>>  	}
>> >>>
>> >>>  	return 0;
>> >>> --
>> >>> 2.11.0
>> >>>
>> >>
>> >
>> >
>>
>>
>> --
>> Alexey
>>
>

--
Alexey
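
For reference, the address layout under discussion can be sketched outside of DPDK. This is only an illustration, not the code from eal_vfio.c or eal_memory.c: struct seg, assign_iovas() and the len == 0 end marker are hypothetical stand-ins for struct rte_memseg and the loop in vfio_spapr_dma_map(). It places segments back to back in bus-address space, which is what the io_offset accumulation in the patch does. Unlike the patch, which refuses a non-zero create.start_addr, the sketch assumes a non-zero window start would simply be used as the base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct rte_memseg used here. */
struct seg {
	uint64_t vaddr; /* process virtual address (ms[i].addr_64 in the patch) */
	uint64_t len;   /* segment length in bytes; 0 marks the end (assumption) */
	uint64_t iova;  /* bus address the device would use; filled in below */
};

/*
 * Lay segments out back to back in bus-address space, starting at
 * window_start (create.start_addr in the patch). Returns the number of
 * segments placed, or -1 if they do not fit into window_size bytes.
 */
static int assign_iovas(struct seg *segs, size_t nsegs,
			uint64_t window_start, uint64_t window_size)
{
	uint64_t offset = 0; /* bytes of the window used so far */
	size_t i;

	assert(segs != NULL);

	for (i = 0; i < nsegs; i++) {
		if (segs[i].len == 0)
			break; /* end of the populated segments */
		if (segs[i].len > window_size - offset)
			return -1; /* segment would overrun the DMA window */
		segs[i].iova = window_start + offset;
		offset += segs[i].len;
	}
	return (int)i;
}
```

With this layout the IOVA of a segment no longer depends on /proc/self/pagemap at all, which is the point Jonas makes above for the VFIO case. The rest of DPDK would then have to hand these IOVAs to the device instead of ms[i].phys_addr, which this sketch does not address.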