On 07-Sep-20 2:31 PM, Vikas Gupta wrote:
Hi Burakov,
-----Original Message-----
From: Burakov, Anatoly [mailto:anatoly.bura...@intel.com]
Sent: Friday, September 04, 2020 7:20 PM
To: Vikas Gupta <vikas.gu...@broadcom.com>; dev@dpdk.org
Cc: Ajit Kumar Khaparde <ajit.khapa...@broadcom.com>; Vikram Prakash
<vikram.prak...@broadcom.com>
Subject: Re: [dpdk-dev] Issue with VFIO/IOMMU
On 03-Sep-20 12:09 PM, Vikas Gupta wrote:
Hi,
I observe an issue with the IOVA address returned by the API
rte_memzone_reserve_aligned (flags = RTE_MEMZONE_IOVA_CONTIG), used for
queue memory allocation. With high-level debugging, I notice that the IOVA
address returned in mz->iova is not mapped by VFIO_IOMMU_MAP_DMA, so in
turn an SMMU exception is seen.
I'm not sure i follow.
How did you determine that to be the case, given that, by your own
admission below, `vfio_type1_dma_mem_map` function is executed several
times?
[Vikas]:
I'll refer to map and unmap as defined below while explaining through an example:
map = function vfio_type1_dma_mem_map called with argument do_map = 1
unmap = function vfio_type1_dma_mem_map called with argument do_map = 0
What I notice is that for some particular address received in mz->iova,
after rte_memzone_reserve_aligned has returned successfully, the map
function (vfio_type1_dma_mem_map with do_map = 1) was not called before
rte_memzone_reserve_aligned returned.
If the function wasn't called, that most likely means that the memory
region in question is still in use. This happens when, for example, your
memzone is less than one page size long, and there is something else
that's already allocated on that page (such as a subsequent/preceding
call to rte_malloc).
Calling memzone reserve doesn't *necessarily* have to result in a call
to IOVA map - this only happens when the memory allocator determines
that it needs more pages to fulfill the request - it's those pages that
are mapped for IOVA, not the memzone. Similarly, freeing memzones
doesn't *necessarily* result in a call to VFIO unmap - the unmap will
only happen if the allocator determines that these pages can be freed.
So, not calling VFIO (un)map after memzone reserve/free is, in and of
itself, not something that is out of the ordinary and is in fact
expected in certain cases. The mapping granularity is page-based, not
memzone-based, so the map/unmap only happens when new *pages* are
reserved or freed. Not every memory (de)allocation triggers
(de)allocation of new pages.
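To make the page-vs-memzone distinction concrete, here is a minimal, self-contained sketch. It is not DPDK code: the per-page user count, `region_alloc`, `region_free` and `dma_map` are hypothetical stand-ins, and the real allocator's bookkeeping is more involved than a plain counter. It only illustrates that a map happens when a page gains its first user and an unmap when it loses its last one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SZ   (2u * 1024 * 1024)  /* 2MB page, the length seen in the report */
#define MAX_PAGES 8

/* Hypothetical bookkeeping: number of live allocations on each page. */
static unsigned int page_users[MAX_PAGES];
static unsigned int map_calls, unmap_calls;

/* Stand-in for the VFIO DMA map/unmap call; it only counts and logs. */
static void dma_map(size_t page, int do_map)
{
    if (do_map)
        map_calls++;
    else
        unmap_calls++;
    printf("%s page %zu\n", do_map ? "map" : "unmap", page);
}

/* Register an allocation covering [off, off + len) and map any page
 * that had no users before this call. */
static void region_alloc(size_t off, size_t len)
{
    assert(len > 0 && (off + len - 1) / PAGE_SZ < MAX_PAGES);
    for (size_t p = off / PAGE_SZ; p <= (off + len - 1) / PAGE_SZ; p++)
        if (page_users[p]++ == 0)
            dma_map(p, 1);
}

/* Drop an allocation and unmap any page whose last user just went away. */
static void region_free(size_t off, size_t len)
{
    assert(len > 0 && (off + len - 1) / PAGE_SZ < MAX_PAGES);
    for (size_t p = off / PAGE_SZ; p <= (off + len - 1) / PAGE_SZ; p++) {
        assert(page_users[p] > 0);
        if (--page_users[p] == 0)
            dma_map(p, 0);
    }
}
```

With this model, two small allocations on the same page produce a single map, and freeing and re-reserving one of them while the other is still alive produces no map/unmap at all.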
Below is one such sequence, to illustrate.
Let’s say there is an address ‘iova_fail’ for which an exception
is raised by the SMMU while dpdk-test runs with the Crypto PMD.
When dpdk-test is run with the Crypto test suite, I see that for the
address iova_fail, vfio_type1_dma_mem_map is called several times
(do_map = 0/1, with length = 2MB). I believe this happens due to
calls for memory allocation/free for buffers/queues. The test runs fine
as long as map is called before rte_memzone_reserve_aligned returns,
and similarly unmap when the same memory is freed. But after several
map/unmap cycles for iova_fail, map is NOT called before
rte_memzone_reserve_aligned returns, even though iova_fail was previously
unmapped. Since it’s not mapped, the SMMU raises an exception.
If there is a case where VFIO unmap erroneously happens (or doesn't
happen when it should), i would very much like to know, but given the
length of the allocation/mapping is 2MB, this sounds exactly like the
use case i have described above - something else is holding onto that
memory, and repeated memzone reserve/free does not cause map/unmap any more.
I would advise adding a custom mem event callback that simply prints out
any new memory being added/removed, and seeing whether you do observe
pages being allocated but not mapped.
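As a rough illustration of such a callback, the sketch below uses local stand-in types so that it compiles without DPDK; `enum mem_event`, `struct mem_stats` and `log_mem_event` are hypothetical names. In DPDK itself the equivalent would, as far as I know, be registered through rte_mem_event_callback_register() and receive RTE_MEM_EVENT_ALLOC / RTE_MEM_EVENT_FREE along with the address and length; please check rte_memory.h for the exact prototype in your release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the allocator's event type (hypothetical names). */
enum mem_event { MEM_EVENT_ALLOC, MEM_EVENT_FREE };

/* Running totals, handed to the callback through its opaque argument. */
struct mem_stats {
    size_t alloc_events;
    size_t free_events;
    size_t bytes_live;
};

/* Print every page-level add/remove and keep simple counters, so the
 * output can be compared against the map/unmap debug prints. */
static void log_mem_event(enum mem_event ev, const void *addr, size_t len,
                          void *arg)
{
    struct mem_stats *st = arg;

    assert(st != NULL);
    if (ev == MEM_EVENT_ALLOC) {
        st->alloc_events++;
        st->bytes_live += len;
    } else {
        st->free_events++;
        st->bytes_live -= len;
    }
    printf("mem event: %s addr=%p len=%zu live=%zu\n",
           ev == MEM_EVENT_ALLOC ? "ALLOC" : "FREE",
           addr, len, st->bytes_live);
}
```

If an ALLOC line shows up for the region containing the failing address without a matching map print, that would point at a real bug; if no ALLOC line shows up at all, the page was simply never released.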
I would also advise checking the IOVA address with which you get an
exception, and whether it really is a valid IOVA address *at the time of
the exception* (by checking whether the address belongs to one of the
allocated memory segments - see either memseg_walk or
dump_physmem_layout functions). Since you are running the test multiple
times, a plausible alternative explanation could be stale data from a
previous run causing a DMA into an address that was, at one point,
valid, but no longer is.
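The validity check itself is just a range lookup. A minimal sketch, assuming you have collected the current segments into an array (for example from a walk over the memsegs; `struct seg` and `iova_is_valid` are hypothetical names, not DPDK API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One allocated memory segment: IOVA start and length in bytes. */
struct seg {
    uint64_t iova;
    uint64_t len;
};

/* Return true if iova falls inside any of the n segments. The
 * subtraction form avoids overflow when iova + len would wrap. */
static bool iova_is_valid(const struct seg *segs, size_t n, uint64_t iova)
{
    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (iova >= segs[i].iova && iova - segs[i].iova < segs[i].len)
            return true;
    return false;
}
```

The segment list has to be captured at the time of the exception; a list taken earlier in the run can contain segments that have since been freed and unmapped.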
Please note the issue is not frequently visible and might reproduce only
after pmd_crypto_autotest is run multiple times over dpdk-test.
If you are not able to follow, I'll try to send the debug printfs for the test.
Thanks,
Vikas
*Details for the setup*
Platform: Armv8 (Broadcom Stingray)
DPDK release: DPDK 20.08 <http://fast.dpdk.org/rel/dpdk-20.08.tar.xz>
PMD patch:
https://patches.dpdk.org/project/dpdk/list/?series=&submitter=1907&state=&q=&archive=&delegate=
dpdk-test is launched using below command
*dpdk-test --vdev <cryptopmd_name> -w 0000:00:00.0 --iova-mode pa *
The test suite is launched over dpdk-test application command prompt
using command ‘cryptodev_<cryptopmd_name>_autotest’
The issue is seen when several iterations of the above test suite are
executed, which in turn make multiple calls to
rte_memzone_reserve_aligned, rte_mempool_create, rte_memzone_free and
rte_mempool_free.
Function *vfio_type1_dma_mem_map* with map/unmap event is executed
several times during test_suite run.
Any inputs would be helpful.
Thanks,
Vikas
--
Thanks,
Anatoly