On Thursday 20 April 2017 07:52 PM, Alexey Kardashevskiy wrote:
On 20/04/17 23:25, Alexey Kardashevskiy wrote:
On 20/04/17 19:04, Jonas Pfefferle1 wrote:
Alexey Kardashevskiy <a...@ozlabs.ru> wrote on 20/04/2017 09:24:02:
From: Alexey Kardashevskiy <a...@ozlabs.ru>
To: dev@dpdk.org
Cc: Alexey Kardashevskiy <a...@ozlabs.ru>, j...@zurich.ibm.com,
Gowrishankar Muthukrishnan <gowrishanka...@in.ibm.com>
Date: 20/04/2017 09:24
Subject: [PATCH dpdk 5/5] RFC: vfio/ppc64/spapr: Use correct bus
addresses for DMA map
VFIO_IOMMU_SPAPR_TCE_CREATE ioctl() returns the actual bus address for
just created DMA window. It happens to start from zero because the default
window is removed (leaving no windows) and new window starts from zero.
However, this is not guaranteed and the new window may start from another
address; this adds an error check for that case.
Another issue is that IOVA passed to VFIO_IOMMU_MAP_DMA should be a PCI
bus address while in this case a physical address of a user page is used.
This changes IOVA to start from zero in a hope that the rest of DPDK
expects this.
This is not the case. DPDK expects a 1:1 mapping PA==IOVA. It will use the
phys_addr of the memory segment it got from /proc/self/pagemap cf.
librte_eal/linuxapp/eal/eal_memory.c. We could try setting it here to the
actual iova which basically makes the whole virtual to physical mapping
with pagemap unnecessary which I believe should be the case for VFIO
anyway. Pagemap should only be needed when using pci_uio.
Ah, ok, makes sense now. But it sure needs a big fat comment there as it is
not obvious why host RAM address is used there as DMA window start is not
guaranteed.
Well, either way there is some bug - ms[i].phys_addr and ms[i].addr_64 both
have the exact same value, in my setup it is 3fffb33c0000 which is a userspace
address - at least ms[i].phys_addr must be a physical address.
This patch breaks i40e_dev_init() in my server.
EAL: PCI device 0004:01:00.0 on NUMA socket 1
EAL: probe driver: 8086:1583 net_i40e
EAL: using IOMMU type 7 (sPAPR)
eth_i40e_dev_init(): Failed to init adminq: -32
EAL: Releasing pci mapped resource for 0004:01:00.0
EAL: Calling pci_unmap_resource for 0004:01:00.0 at 0x3fff82aa0000
EAL: Requested device 0004:01:00.0 cannot be used
EAL: PCI device 0004:01:00.1 on NUMA socket 1
EAL: probe driver: 8086:1583 net_i40e
EAL: using IOMMU type 7 (sPAPR)
eth_i40e_dev_init(): Failed to init adminq: -32
EAL: Releasing pci mapped resource for 0004:01:00.1
EAL: Calling pci_unmap_resource for 0004:01:00.1 at 0x3fff82aa0000
EAL: Requested device 0004:01:00.1 cannot be used
EAL: No probed ethernet devices
I have two memsegs, each 1G in size. Their mapped PAs and VAs are also
different.
(gdb) p /x ms[0]
$3 = {phys_addr = 0x1e0b000000, {addr = 0x3effaf000000, addr_64 =
0x3effaf000000},
len = 0x40000000, hugepage_sz = 0x1000000, socket_id = 0x1, nchannel
= 0x0, nrank = 0x0}
(gdb) p /x ms[1]
$4 = {phys_addr = 0xf6d000000, {addr = 0x3efbaf000000, addr_64 =
0x3efbaf000000},
len = 0x40000000, hugepage_sz = 0x1000000, socket_id = 0x0, nchannel
= 0x0, nrank = 0x0}
Could you please recheck this? Maybe reset dma_map.iova to this offset
only if the new DMA window does not start from bus address 0?
Thanks,
Gowrishankar
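To make the mismatch discussed above concrete, here is a minimal sketch in C that compares the two IOVA schemes. It assumes a simplified, hypothetical struct memseg_sketch in place of struct rte_memseg, and the helper names iova_packed and iova_identity are invented for illustration; neither is DPDK API. The first helper mirrors what the patch does (segments packed back to back from the window start), the second mirrors the 1:1 PA==IOVA mapping that the rest of DPDK is said to expect.

```c
#include <stdint.h>

/* Hypothetical stand-in for the rte_memseg fields that matter here. */
struct memseg_sketch {
	uint64_t phys_addr; /* physical address, as read from pagemap */
	uint64_t addr_64;   /* userspace virtual address */
	uint64_t len;       /* segment length in bytes */
};

/*
 * Scheme used by the patch: segment i is mapped right after the
 * previous segments, starting at the DMA window start address.
 */
static uint64_t
iova_packed(const struct memseg_sketch *ms, int i, uint64_t window_start)
{
	uint64_t iova = window_start;
	int k;

	for (k = 0; k < i; k++)
		iova += ms[k].len;
	return iova;
}

/*
 * 1:1 scheme: the IOVA is the physical address itself. This is only
 * an assumption about what drivers program into the device, based on
 * the statement in this thread that DPDK expects PA==IOVA.
 */
static uint64_t
iova_identity(const struct memseg_sketch *ms, int i)
{
	return ms[i].phys_addr;
}
```

With the two segments from the gdb output above and a window starting at 0, iova_packed() gives 0x0 and 0x40000000, while iova_identity() gives 0x1e0b000000 and 0xf6d000000. If the driver hands the device the phys_addr values while the IOMMU only maps the packed ones, DMA would miss the window, which is a plausible (but unconfirmed in this thread) explanation for the adminq failure.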
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 46f951f4d..8b8e75c4f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -658,7 +658,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
{
const struct rte_memseg *ms = rte_eal_get_physmem_layout();
int i, ret;
-
+ phys_addr_t io_offset;
struct vfio_iommu_spapr_register_memory reg = {
.argsz = sizeof(reg),
.flags = 0
@@ -702,6 +702,13 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
+ io_offset = create.start_addr;
+ if (io_offset) {
+ RTE_LOG(ERR, EAL, " DMA offsets other than zero is not supported, "
+ "new window is created at %lx\n", io_offset);
+ return -1;
+ }
+
/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
for (i = 0; i < RTE_MAX_MEMSEG; i++) {
struct vfio_iommu_type1_dma_map dma_map;
@@ -723,7 +730,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
dma_map.vaddr = ms[i].addr_64;
dma_map.size = ms[i].len;
- dma_map.iova = ms[i].phys_addr;
+ dma_map.iova = io_offset;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
VFIO_DMA_MAP_FLAG_WRITE;
@@ -735,6 +742,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
+ io_offset += dma_map.size;
}
return 0;
--
2.11.0