https://bugs.dpdk.org/show_bug.cgi?id=786
Bug ID: 786
Summary: dynamic memory model may cause potential DMA silent error
Product: DPDK
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: core
Assignee: dev@dpdk.org
Reporter: changpeng....@intel.com
Target Milestone: ---

We found that, in some very rare situations, the vfio dynamic memory model has an issue that may cause the DMA engine to put data into the wrong IO buffer. Here are the tests we ran to identify the issue:

1. Start the application and call rte_zmalloc to allocate IO buffers. Hotplug one NVMe drive; DPDK then registers the existing memory region with the kernel vfio driver via the dma_map ioctl. We added a trace before this ioctl:

   DPDK dma_map vaddr: 0x200000200000, iova: 0x200000200000, size: 0x14200000, ret: 0

2. Then we call rte_free to free some memory buffers. DPDK calls dma_unmap on the vfio driver and releases the related huge files:

   DPDK dma_unmap iova: 0x20000a400000, size: 0x0, ret: 0

   The return value is 0, which means success, but the unmap size is 0: the kernel vfio driver did not actually unmap anything, because the IOVA range is not the same as the one previously mapped. Newer DPDK versions print an error for this case.

3. Then we call rte_zmalloc again. DPDK creates new huge files, remaps them to the previous virtual address, and calls dma_map to register them with the kernel vfio driver:

   DPDK dma_map vaddr: 0x20000a400000, iova: 0x20000a400000, size: 0x400000, ret=-1

   errno is set to EEXIST, but DPDK ignores this errno, so rte_zmalloc returns success.

If the newly allocated memory is then used as an NVMe IO buffer, the DMA engine may move data to the previously pinned pages, because the kernel vfio driver did not update its memory map. Nothing in the IO stack prints a warning.

We can use the static memory model as a workaround.
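
The sequence above can be replayed against a small toy model of the mapping table. This is only a sketch of the behaviour described in this report, not kernel vfio or DPDK code: the names toy_dma_map, toy_dma_unmap, checked_unmap and toy_reset are hypothetical, and the requested unmap size of 0x400000 is an assumption (the trace shows only the size reported back). The model assumes that an unmap request which does not cover a whole existing mapping removes nothing yet still returns 0, and that a map request overlapping an existing mapping fails with EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_MAPS 16

/* One entry in the toy IOVA mapping table (hypothetical, for illustration). */
struct toy_map {
    uint64_t iova;
    uint64_t size;
    int used;
};

static struct toy_map toy_maps[TOY_MAX_MAPS];

/* Forget all mappings so a scenario can be replayed from scratch. */
static void toy_reset(void)
{
    memset(toy_maps, 0, sizeof(toy_maps));
}

/* Map [iova, iova + size). Fails with EEXIST if the range overlaps an
 * existing mapping, which is the failure seen in step 3 of the report. */
static int toy_dma_map(uint64_t iova, uint64_t size)
{
    int free_slot = -1;

    assert(size != 0);
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (!toy_maps[i].used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (iova < toy_maps[i].iova + toy_maps[i].size &&
            toy_maps[i].iova < iova + size) {
            errno = EEXIST;
            return -1;
        }
    }
    if (free_slot < 0) {
        errno = ENOSPC;
        return -1;
    }
    toy_maps[free_slot].iova = iova;
    toy_maps[free_slot].size = size;
    toy_maps[free_slot].used = 1;
    return 0;
}

/* Unmap every mapping fully contained in [iova, iova + size).
 * A request that only covers part of a mapping removes nothing,
 * yet the call still returns 0; *unmapped reports the bytes removed. */
static int toy_dma_unmap(uint64_t iova, uint64_t size, uint64_t *unmapped)
{
    *unmapped = 0;
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (toy_maps[i].used &&
            toy_maps[i].iova >= iova &&
            toy_maps[i].iova + toy_maps[i].size <= iova + size) {
            *unmapped += toy_maps[i].size;
            toy_maps[i].used = 0;
        }
    }
    return 0;
}

/* Caller-side check: treat "returned 0 but unmapped fewer bytes than
 * requested" as a failure instead of silently accepting it. */
static int checked_unmap(uint64_t iova, uint64_t size)
{
    uint64_t done;

    if (toy_dma_unmap(iova, size, &done) != 0)
        return -1;
    return done == size ? 0 : -1;
}
```

Calling toy_dma_map(0x200000200000, 0x14200000), then toy_dma_unmap(0x20000a400000, 0x400000, &done), then toy_dma_map(0x20000a400000, 0x400000) follows the three steps in order. In this model the stale mapping survives the unmap, so the final map fails with EEXIST; checking the returned size, as checked_unmap does, is one way a caller could notice the problem at step 2 rather than after data has gone to the old pages.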