Hi Nelio,

> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranje...@6wind.com]
> Sent: Monday, January 22, 2018 10:53 PM
> To: Xueming(Steven) Li <xuemi...@mellanox.com>
> Cc: Shahaf Shuler <shah...@mellanox.com>; dev@dpdk.org
> Subject: Re: [PATCH] net/mlx5: remmap UAR address for multiple process
>
> Hi Xueming,
>
> On Fri, Jan 19, 2018 at 11:08:54PM +0800, Xueming Li wrote:
> > UAR(doorbell) is hw resources that have to be same address between
> > primary and secondary process, failed to mmap UAR will make TX packets
> > invisible to HW.
> > Today, UAR address returned from verbs api is mixed in heap and loaded
> > library address space, prone to be occupied in secondary process.
> > This patch reserves a dedicate UAR address space, both primary and
> > secondary process re-mmap UAR pages into this space.
> > Below is a brief picture of dpdk app address space allocation:
> >   Before                This patch
> >   ------                ----------
> >   [stack]               [stack]
> >   [.so, uar, heap]      [.so, heap]
> >   [(empty)]             [(empty)]
> >   [hugepage]            [hugepage]
> >   [? others]            [? others]
> >   [(empty)]             [(empty)]
> >                         [uar]
> >                         [(empty)]
> > To minimize conflicts, UAR address space comes after hugepage space
> > with an offset to skip potential usage from other drivers.
>
> Seems it is not the case when the memory is contiguous, according to what
> I see in my testpmd /proc/<pid>/maps:
>
>   PMD: mlx5.c:523: mlx5_uar_init_primary(): Reserved UAR address space:
> 0x0x7f4da5800000
>
> And the fist huge page is at address 0x7f4fa5800000, new UAR space is
> before and not after.
>
> With this patch I still have the situation described as "before".
>
Your observation is correct: the system allocates addresses from high to
low, like a stack. The UAR address range is 0x7f4da5800000 -
0x7f4ea5800000 (4 GB in size); with another 4 GB offset above it, the
hugepage range starts at 0x7f4fa5800000.

> > Once UAR space reserved successfully, UAR pages are re-mmapped into
> > new area to keep UAR address aligned between primary and secondary
> process.
> >
> > Signed-off-by: Xueming Li <xuemi...@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5.c         | 107
> ++++++++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5.h         |   1 +
> >  drivers/net/mlx5/mlx5_defs.h    |  10 ++++
> >  drivers/net/mlx5/mlx5_rxtx.h    |   3 +-
> >  drivers/net/mlx5/mlx5_trigger.c |   7 ++-
> >  drivers/net/mlx5/mlx5_txq.c     |  51 +++++++++++++------
> >  6 files changed, 163 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > fc2d59fee..1539ef608 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -39,6 +39,7 @@
> >  #include <stdlib.h>
> >  #include <errno.h>
> >  #include <net/if.h>
> > +#include <sys/mman.h>
> >
> >  /* Verbs header. */
> >  /* ISO C doesn't support unnamed structs/unions, disabling -pedantic.
> > */ @@ -56,6 +57,7 @@ #include <rte_pci.h> #include <rte_bus_pci.h>
> >  #include <rte_common.h>
> > +#include <rte_eal_memconfig.h>
> >  #include <rte_kvargs.h>
> >
> >  #include "mlx5.h"
> > @@ -466,6 +468,101 @@ mlx5_args(struct mlx5_dev_config *config, struct
> > rte_devargs *devargs)
> >
> >  static struct rte_pci_driver mlx5_driver;
> >
> > +/*
> > + * Reserved UAR address space for TXQ UAR(hw doorbell) mapping,
> > +process
> > + * local resource used by both primary and secondary to avoid
> > +duplicate
> > + * reservation.
> > + * The space has to be available on both primary and secondary
> > +process,
> > + * TXQ UAR maps to this area using fixed mmap w/o double check.
> > + */
> > +static void *uar_base;
> > +
> > +/**
> > + * Reserve UAR address space for primary process
> > + *
> > + * @param[in] priv
> > + *   Pointer to private structure.
> > + *
> > + * @return
> > + *   0 on success, negative errno value on failure.
> > + */
> > +static int
> > +mlx5_uar_init_primary(struct priv *priv) {
> > +	void *addr = (void *)0;
> > +	int i;
> > +	const struct rte_mem_config *mcfg;
> > +
> > +	if (uar_base) { /* UAR address space mapped */
> > +		priv->uar_base = uar_base;
> > +		return 0;
> > +	}
> > +	/* find out lower bound of hugepage segments */
> > +	mcfg = rte_eal_get_configuration()->mem_config;
> > +	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
> > +		if (addr)
> > +			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
> > +		else
> > +			addr = mcfg->memseg[i].addr;
>
> This if/else is useless as addr is already initialised with the smallest
> possible value.

That was my original code :-) and addr always ended up as zero. addr here
is the lower bound of the hugepage segments, so we don't want it to stay
at zero.

>
> > +	}
> > +	/* offset down UAR area */
> > +	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
>
> Seems the error is here, the loops get the address of the memseg with the
> smallest address and then it subtract the UAR size, addr cannot be after
> the huge pages unless if this subtraction overflows.

Thanks. By "after" I meant the address allocation order: the UAR block
sits below "hugepage" in the overall picture.

>
> > +	/* anonymous mmap, no real memory consumption */
> > +	addr = mmap(addr, MLX5_UAR_SIZE,
> > +		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > +	if (addr == MAP_FAILED) {
> > +		ERROR("Failed to reserve UAR address space, please adjust "
> > +		      "MLX5_UAR_SIZE or try --base-virtaddr");
>
> How does a user knows the UAR memory space the NIC needs to adjust the
> MLX5_UAR_SIZE?
>
> > +		return -ENOMEM;
> > +	}
> > +	/* Accept either same addr or a new addr returned from mmap if
> target
> > +	 * range occupied.
> > +	 */
> > +	INFO("Reserved UAR address space: 0x%p", addr);
>
> The '%p' already prefix the address with the 0x.
>
> > +	priv->uar_base = addr; /* for primary and secondary UAR re-mmap */
> > +	uar_base = addr; /* process local, don't reserve again */
> > +	return 0;
> > +}
> > +
> <snip/>
>
> Regards,
>
> --
> Nélio Laranjeiro
> 6WIND
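
For reference, the address arithmetic discussed in this thread can be
sketched in plain C. This is only an illustration, not the driver code:
lowest_segment() and uar_hint() are hypothetical helpers, and the 4 GB
values of UAR_SIZE and UAR_OFFSET are assumed from the numbers quoted
in this thread rather than taken from mlx5_defs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, inferred from the addresses quoted in this thread:
 * a 4 GB UAR window and a 4 GB gap between it and the hugepages. */
#define UAR_SIZE   (1ULL << 32)
#define UAR_OFFSET (1ULL << 32)

/* Hypothetical helper: lowest segment address in a zero-terminated
 * array, or 0 if the array is empty.
 * It follows the if/else shape of the patch: starting from 0 and taking
 * a plain minimum would always return 0, so 0 is treated as "not set". */
static uint64_t
lowest_segment(const uint64_t *segs, size_t n)
{
	uint64_t low = 0;
	size_t i;

	for (i = 0; i < n && segs[i] != 0; i++) {
		if (low == 0 || segs[i] < low)
			low = segs[i];
	}
	return low;
}

/* Hypothetical helper: hint address for the UAR reservation, placed
 * UAR_OFFSET + UAR_SIZE bytes below the lowest hugepage address. */
static uint64_t
uar_hint(uint64_t hugepage_low)
{
	/* Guard against the subtraction wrapping around. */
	assert(hugepage_low >= UAR_OFFSET + UAR_SIZE);
	return hugepage_low - (UAR_OFFSET + UAR_SIZE);
}
```

With the lowest hugepage at 0x7f4fa5800000, uar_hint() gives
0x7f4da5800000, and adding UAR_SIZE gives 0x7f4ea5800000, which matches
the range seen in the log. The result is only a hint: mmap() without
MAP_FIXED may still return a different address.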