> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Monday, September 17, 2018 12:34 PM
> To: Stojaczyk, Dariusz <dariusz.stojac...@intel.com>; dev@dpdk.org;
> Santosh Shukla <santosh.shu...@caviumnetworks.com>; Hemant Agrawal
> <hemant.agra...@nxp.com>; Jerin Jacob
> <jerin.ja...@caviumnetworks.com>
> Cc: Maxime Coquelin <maxime.coque...@redhat.com>; Chas Williams
> <ch...@att.com>
> Subject: Re: [PATCH v2] eal/bus: use RTE_IOVA_PA only if phys addresses
> are available
>
> On 07-Sep-18 4:58 PM, Darek Stojaczyk wrote:
> > When neither RTE_IOVA_VA nor RTE_IOVA_PA was explicitly requested,
> > DPDK would currently fallback to the default RTE_IOVA_PA mode and
> > possibly encounter a failure later on if running as a non-priviledged
> > user. Attempting to use RTE_IOVA_VA if no phys addresses are available
> > may help in this case.
> >
> > Signed-off-by: Darek Stojaczyk <dariusz.stojac...@intel.com>
> > ---
> > Changes since v1:
> >   * added a missing rte_memory.h include
> >
> >  lib/librte_eal/common/eal_common_bus.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_bus.c
> > b/lib/librte_eal/common/eal_common_bus.c
> > index 0943851cc..68c581b8a 100644
> > --- a/lib/librte_eal/common/eal_common_bus.c
> > +++ b/lib/librte_eal/common/eal_common_bus.c
> > @@ -37,6 +37,7 @@
> >  #include <rte_bus.h>
> >  #include <rte_debug.h>
> >  #include <rte_string_fns.h>
> > +#include <rte_memory.h>
> >
> >  #include "eal_private.h"
> >
> > @@ -236,9 +237,19 @@ rte_bus_get_iommu_class(void)
> >              mode |= bus->get_iommu_class();
> >      }
> >
> > -    if (mode != RTE_IOVA_VA) {
> > -        /* Use default IOVA mode */
> > -        mode = RTE_IOVA_PA;
> > +    if (mode == RTE_IOVA_VA)
> > +        return RTE_IOVA_VA;
> > +
> > +    if (mode & RTE_IOVA_PA) {
> > +        /* Not all buses support RTE_IOVA_VA, fallback to RTE_IOVA_PA */
> > +        return RTE_IOVA_PA;
> > +    }
> > +
> > +    if (rte_eal_using_phys_addrs()) {
> > +        /* Default to RTE_IOVA_PA only if it's supported */
> > +        return RTE_IOVA_PA;
> >      }
> > -    return mode;
> > +
> > +    /* Since RTE_IOVA_PA is unsupported, fallback to RTE_IOVA_VA */
> > +    return RTE_IOVA_VA;
> >  }
> >
>
> This is a good change, however I think that this is too pessimistic. If i
> don't have any devices that explictly require IOVA_PA, i should be running
> in IOVA_VA mode.

Another problem may occur when trying to hotplug devices that support only
39-bit DMA. You may not be able to map any memory with VFIO when in
RTE_IOVA_VA mode, as virtual addresses likely occupy more than 39 bits. The
rte_pci bus enforces RTE_IOVA_PA whenever it finds such devices on init. I
have no doubt the logic can be improved here, but for now RTE_IOVA_PA is the
only safe default.

D.

>
> This of course doesn't take hotplug into account, so a command-line switch
> to force one or the other should also be available.
>
> For example, at startup, i might have devices bound to VFIO, so IOVA_VA
> mode is picked. However, even though at a time of startup none of the
> devices require physical addresses, i also know that i might later hotplug a
> device that requires IOVA_PA (leaving the question of hotplug brokenness
> aside for now...) - currently, this scenario will not work, as i will be
> forced to use IOVA_VA mode unless i happen to have a IOVA_PA device
> available at startup.
>
> Similarly, if i'm running DPDK as root but am only using virtual devices like
> pcap, i should be able to force DPDK into using VA addresses [*], yet
> currently i will be forced to use IOVA_PA if i don't *also* have a few devices
> bound exclusively to VFIO.
>
> [*] Do we have vdev devices that require IOVA_PA? I can't think of any...
>
> --
> Thanks,
> Anatoly
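
For reference, the selection order that the quoted patch introduces can be sketched as a standalone function. This is only an illustration under assumptions, not the EAL code: `enum iova_mode`, `select_iova_mode()` and the `phys_addrs_available` flag are hypothetical stand-ins for `enum rte_iova_mode`, `rte_bus_get_iommu_class()` and the result of `rte_eal_using_phys_addrs()`. The bit values are assumed, and the per-bus loop is replaced by an already OR-ed `bus_modes` argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the RTE_IOVA_* values (bit values assumed). */
enum iova_mode {
    IOVA_DC = 0,      /* no bus expressed a preference */
    IOVA_PA = 1 << 0, /* at least one bus needs physical addresses */
    IOVA_VA = 1 << 1, /* at least one bus can use virtual addresses */
};

/*
 * bus_modes: OR of the modes reported by all registered buses.
 * phys_addrs_available: whether physical addresses can be obtained
 * (assumed to mirror what rte_eal_using_phys_addrs() reports).
 */
static enum iova_mode
select_iova_mode(unsigned int bus_modes, bool phys_addrs_available)
{
    /* Only the two known mode bits may be set. */
    assert((bus_modes & ~(unsigned int)(IOVA_PA | IOVA_VA)) == 0);

    /* Every bus that answered is happy with VA. */
    if (bus_modes == IOVA_VA)
        return IOVA_VA;

    /* Not all buses support VA, so fall back to PA. */
    if (bus_modes & IOVA_PA)
        return IOVA_PA;

    /* No preference: default to PA only if it is usable. */
    if (phys_addrs_available)
        return IOVA_PA;

    /* PA is unusable, so VA is the only option left. */
    return IOVA_VA;
}
```

In this sketch, a process with no bus preference and no access to physical addresses (for example a non-privileged user) ends up in VA mode, while the same process with physical addresses available still gets PA, which is the conservative default argued for in the reply above.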