On 28-Jun-18 10:56 AM, Alejandro Lucero wrote:
On Thu, Jun 28, 2018 at 9:54 AM, Burakov, Anatoly
<anatoly.bura...@intel.com> wrote:
On 27-Jun-18 5:52 PM, Alejandro Lucero wrote:
On Wed, Jun 27, 2018 at 2:24 PM, Burakov, Anatoly
<anatoly.bura...@intel.com> wrote:
On 27-Jun-18 11:13 AM, Alejandro Lucero wrote:
On Wed, Jun 27, 2018 at 9:17 AM, Burakov, Anatoly
<anatoly.bura...@intel.com> wrote:
On 26-Jun-18 6:37 PM, Alejandro Lucero wrote:
This RFC tries to handle devices with addressing limitations. NFP
devices 4000/6000 can only handle 40-bit addresses, which causes
problems with physical addresses when machines have more than 1TB of
memory. But because of how iovas are configured, which can be
equivalent to physical addresses or based on virtual addresses, this
can be a more likely problem.
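As a rough illustration of the 40-bit limit described above (an editorial sketch, not code from the patchset): a 40-bit device can address 2^40 bytes, which is 1TB, so any IOVA with a bit set at position 40 or higher is unreachable. The helper names `dma_mask_from_bits` and `iova_range_fits` are hypothetical and are not DPDK APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask with the low `bits` bits set.
 * 40 bits gives 0xFFFFFFFFFF, i.e. addresses below 1TB. */
static inline uint64_t dma_mask_from_bits(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Hypothetical helper: true when every byte of [iova, iova + len)
 * is addressable under `mask`. A zero-length range only checks `iova`. */
static inline bool iova_range_fits(uint64_t iova, uint64_t len, uint64_t mask)
{
	uint64_t last = (len == 0) ? iova : iova + (len - 1);

	if (last < iova)	/* range wrapped around 2^64 */
		return false;
	return (last & ~mask) == 0;
}
```

With a 40-bit mask, a 4KB buffer ending exactly at the 1TB boundary fits, while one that crosses the boundary does not.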
I tried to solve this some time ago:
https://www.mail-archive.com/dev@dpdk.org/msg45214.html
It was delayed because there were some changes in progress with EAL
device handling and, to be honest, I completely forgot about this
until now, when I have had to work on supporting NFP devices with DPDK
and non-root users.
I was working on a patch to be applied to the main DPDK branch
upstream, but because of changes to memory initialization during the
last few months, it cannot be backported to stable versions, at least
not the part where the hugepage iovas are checked.
I realize stable versions only allow bug fixes, and this patchset
could arguably not be considered one. But without it, DPDK could,
although it is unlikely, be used in a machine with more than 1TB of
memory, with the NFP then using the wrong DMA host addresses.
Although virtual addresses used as iovas are more dangerous, for DPDK
versions before 18.05 this is not worse than with physical addresses,
because iovas, when physical addresses are not available, are based on
a starting address set to 0x0.
You might want to look at the following patch:
http://patches.dpdk.org/patch/37149/
Since this patch, IOVA as VA mode uses VA addresses, and that has been
backported to earlier releases. I don't think there's any case where
we use zero-based addresses any more.
But memsegs get the iova based on the hugepages' physaddr, and for VA
mode that is based on 0x0 as the starting point.
And as far as I know, memseg iovas are what end up being used for
IOMMU mappings and what devices will use.
When physaddrs are available, IOVA as PA mode assigns IOVA addresses
to PA, while IOVA as VA mode assigns IOVA addresses to VA (both 18.05+
and pre-18.05 as per above patch, which was applied to pre-18.05
stable releases).
When physaddrs aren't available, IOVA as VA mode assigns IOVA
addresses to VA, both 18.05+ and pre-18.05, as per above patch.
This is right.
If physaddrs aren't available and IOVA as PA mode is used, then as far
as I can remember, even though technically memsegs get their addresses
set to 0x0 onwards, the actual addresses we get in memzones etc. are
RTE_BAD_IOVA.
This is not right. Not sure if this was the intention, but if PA mode
is used and physaddrs are not available, this code inside
vfio_type1_dma_map:
    if (rte_eal_iova_mode() == RTE_IOVA_VA)
        dma_map.iova = dma_map.vaddr;
    else
        dma_map.iova = ms[i].iova;
does the IOMMU mapping using the iovas and not the vaddr, with the
iovas starting at 0x0.
Yep, you're right, apologies. I confused this with the no-huge option.
So, what do you think about the patchset? Could it be applied to
stable versions?
I'll send a patch for current 18.05 code which will have the dma mask
and the hugepage check, along with changes for doing the mmaps below the
dma mask limit.
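The hugepage check mentioned above might look roughly like the following. This is an editorial sketch under stated assumptions, not the code from the patch: `struct seg` and `first_seg_beyond_mask` are hypothetical names standing in for whatever the real memory-segment walk uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory segment: start IOVA and length. */
struct seg {
	uint64_t iova;
	uint64_t len;
};

/* Return the index of the first segment whose last byte is not
 * addressable with `maskbits` address bits, or -1 if all segments fit.
 * Zero-length segments are skipped. */
static int first_seg_beyond_mask(const struct seg *segs, size_t n,
				 unsigned int maskbits)
{
	uint64_t mask = maskbits >= 64 ?
		UINT64_MAX : (UINT64_C(1) << maskbits) - 1;

	for (size_t i = 0; i < n; i++) {
		uint64_t last;

		if (segs[i].len == 0)
			continue;
		last = segs[i].iova + (segs[i].len - 1);
		/* Reject wrapped ranges and any address bit above the mask. */
		if (last < segs[i].iova || (last & ~mask) != 0)
			return (int)i;
	}
	return -1;
}
```

A caller could run this once after memory initialization with `maskbits` set to 40 for the NFP case and refuse to start if the result is not -1. How to react is a policy choice that this sketch leaves open.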
I've looked through the code, it looks OK to me (bar some things like
missing .map file additions and a gratuitous rte_panic :) ).
There was a patch/discussion not too long ago about DMA masks for some
IOMMUs - perhaps we can also extend this approach to that?
https://patches.dpdk.org/patch/33192/
--
Thanks,
Anatoly