On 02-Feb-20 8:37 PM, Dmitry Kozliuk wrote:
Hi everyone!
Hi,
Primary topics to discuss:

1. Memory management (@Anatoly)

1.1. MM changed radically since v18.08 and dpdk-next-windows does not implement it properly anyway, it allocates segment lists in a PCI bus driver. My implementation closely follows the Linux one using VirtualAlloc2() with XXX_PLACEHOLDER flags to reserve and commit memory, but does not map hugepages to files. Is there a consensus on MM approach in Windows? Anyway, I think EAL private MM API would have to be changed, because memory reservation, allocation, and mapping are completely different operations. Hiding this with an mmap() shim doesn't look right, because mmap()'s behavior differs even among Unix platforms.

1.2. In Windows, there is no /dev/mem to implement rte_virt2iova(), so a simple kernel driver is required for mapping. Moreover, Windows kernel abstracts IOMMU, so those physical addresses may be unsuitable for DMA at all (see below).
I haven't really been following the Windows port much, so I have no idea how it currently works.
The main reason DPDK memory management works the way it does is the need to support multiprocess. In order to map memory in all processes, we need that address space reserved in advance (otherwise there's no guarantee that a newly mapped memory segment can be mapped at the same address in all processes, and that would cause a runtime failure). If it weren't for that, we could allocate memory arbitrarily and as needed. Windows should either follow this model, or drop secondary process support and go its own way - the internals are OS-specific anyway.
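To make the reserve-then-map idea concrete, here is a minimal sketch for Linux/POSIX. It is not the actual EAL code: the names page_size(), reserve_va() and commit_segment() are made up for illustration, and it uses anonymous private memory instead of shared hugepage files, so it only shows the address-space side of the scheme within a single process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: system page size. */
size_t page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/*
 * Reserve a range of virtual addresses without backing it with memory.
 * PROT_NONE keeps the range inaccessible until a segment is committed.
 * Returns NULL on failure.
 */
void *reserve_va(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Commit one segment at a fixed offset inside an existing reservation.
 * MAP_FIXED replaces the PROT_NONE pages, so the segment lands at exactly
 * base + off. Returns NULL on failure.
 */
void *commit_segment(void *base, size_t off, size_t len)
{
    assert(off % page_size() == 0); /* offset must be page-aligned */

    void *want = (char *)base + off;
    void *p = mmap(want, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The point is that the address of a committed segment is decided by the reservation, not by whatever the kernel happens to hand out at allocation time. A secondary process that reserved the same range could then map the same segment at the same address.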
If there are changes needed to private memalloc API to support the above, that's completely fine - that's why all of this stuff is internal-only :) As long as public API stays roughly the same, we should be good. Bear in mind that DPDK also supports external memory, so you might need to make some allowances for that too.
As for IOMMU - we don't support IOVA as VA addressing on FreeBSD, so if the Windows port can only work with IOVA as PA, that's fine too. The question of IOVA mode really boils down to this: do we control the DMA addresses (IOVA as VA mode), or does the system (IOVA as PA mode)? I'm not familiar with how IOMMU works on Windows, but as long as it fits into that model and we keep the API, it should also be OK :)
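A toy illustration of the two modes, in case it helps. Everything here is hypothetical (enum iova_mode, virt_to_iova(), fake_phys_lookup() and BAD_IOVA are not DPDK names); the lookup callback stands in for whatever OS-specific mechanism provides the address, such as the kernel driver mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum iova_mode { IOVA_AS_PA, IOVA_AS_VA };

#define BAD_IOVA UINT64_MAX

/* OS-specific query for the system-assigned address of a virtual address. */
typedef uint64_t (*phys_lookup_fn)(const void *va);

/*
 * Stand-in for a real lookup: flips one high bit so the result is visibly
 * different from the VA. A real port would ask the OS or a kernel driver.
 */
uint64_t fake_phys_lookup(const void *va)
{
    return (uint64_t)(uintptr_t)va ^ 0x100000000ULL;
}

/*
 * IOVA as VA: we choose the DMA address, and it is simply the VA.
 * IOVA as PA: the system chooses it, so we have to ask via the lookup.
 */
uint64_t virt_to_iova(enum iova_mode mode, const void *va,
                      phys_lookup_fn lookup)
{
    if (mode == IOVA_AS_VA)
        return (uint64_t)(uintptr_t)va;
    if (lookup == NULL)
        return BAD_IOVA; /* no way to query the system */
    return lookup(va);
}
```

Either way the caller sees one translation function, which is why the mode can differ per platform without the public API changing.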
--
Thanks,
Anatoly