Hi Anatoly

>
> Shared config is shared across primary and secondary processes.
> However,when using rte_malloc, the malloc elements keep references to
> the heap inside themselves. This heap reference might not be referencing
> a local heap because the heap reference points to the heap of whatever
> process has allocated that malloc element. Therefore, there can be
> situations when malloc elements in a given heap actually reference
> different addresses for the same heap - depending on which process has
> allocated the element. This can lead to segmentation faults when dealing
> with malloc elements allocated on the same heap by different processes.
>
> To fix this problem, heaps will now have the same addresses across
> processes. In order to achieve that, a new field in a shared mem_config
> (a structure that holds the heaps, and which is shared across processes)
> was added to keep the address of where this config is mapped in the
> primary process.
>
> Secondary process will now map the config in two stages - first, it'll
> map it into an arbitrary address and read the address the primary
> process has allocated for the shared config. Then, the config is
> unmapped and re-mapped using the address previously read.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
>  lib/librte_eal/common/include/rte_eal_memconfig.h |  5 ++++
>  lib/librte_eal/linuxapp/eal/eal.c                 | 31 +++++++++++++++++++----
>  2 files changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
> index 30ce6fc..d6359e5 100644
> --- a/lib/librte_eal/common/include/rte_eal_memconfig.h
> +++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
> @@ -89,6 +89,11 @@ struct rte_mem_config {
>
>  	/* Heaps of Malloc per socket */
>  	struct malloc_heap malloc_heaps[RTE_MAX_NUMA_NODES];
> +
> +	/* address of mem_config in primary process. used to map shared config into
> +	 * exact same address the primary process maps it.
> +	 */
> +	uint64_t mem_cfg_addr;
>  } __attribute__((__packed__));
>
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 6994303..fedd82f 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -239,6 +239,11 @@ rte_eal_config_create(void)
>  	}
>  	memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
>  	rte_config.mem_config = (struct rte_mem_config *) rte_mem_cfg_addr;
> +
> +	/* store address of the config in the config itself so that secondary
> +	 * processes could later map the config into this exact location */
> +	rte_config.mem_config->mem_cfg_addr = (uintptr_t) rte_mem_cfg_addr;
> +
>  }
>
>  /* attach to an existing shared memory config */
> @@ -246,6 +251,8 @@ static void
>  rte_eal_config_attach(void)
>  {
>  	void *rte_mem_cfg_addr;
> +	struct rte_mem_config *mem_config;
> +
>  	const char *pathname = eal_runtime_config_path();
>
>  	if (internal_config.no_shconf)
> @@ -257,13 +264,27 @@ rte_eal_config_attach(void)
>  		rte_panic("Cannot open '%s' for rte_mem_config\n", pathname);
>  	}
>
> -	rte_mem_cfg_addr = mmap(NULL, sizeof(*rte_config.mem_config),
> -			PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 0);
> -	close(mem_cfg_fd);
> -	if (rte_mem_cfg_addr == MAP_FAILED)
> +	/* map it as read-only first */
> +	mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
> +			PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
> +	if (mem_config == MAP_FAILED)
>  		rte_panic("Cannot mmap memory for rte_config\n");
>
> -	rte_config.mem_config = (struct rte_mem_config *) rte_mem_cfg_addr;
> +	/* store address used by primary process */
> +	rte_mem_cfg_addr = (void *) (uintptr_t) mem_config->mem_cfg_addr;
> +
> +	/* unmap the config */
> +	munmap(mem_config, sizeof(*mem_config));
> +
> +	/* map the config again, with the proper virtual address */
> +	mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
> +			sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
> +			mem_cfg_fd, 0);
> +	if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr)
> +		rte_panic("Cannot mmap memory for rte_config\n");
> +	close(mem_cfg_fd);
> +
> +	rte_config.mem_config = mem_config;
>  }
>
>  /* Detect if we are a primary or a secondary process */
> --

I think we introduce a race window here.
If the secondary process does the first mmap() before the primary process has
set rte_config.mem_config->mem_cfg_addr, it will try to do the second mmap()
with the wrong address.
I think we need to do the second mmap() straight after
rte_eal_mcfg_wait_complete(), or even just inside it.

Konstantin
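
To make the ordering concrete, here is a rough standalone sketch of what I
mean. It is not a patch against eal.c. The demo_* functions, DEMO_MAGIC and
struct demo_mem_config are made up for illustration, and the spin on a magic
field only stands in for what rte_eal_mcfg_wait_complete() would do (that part
is an assumption of the sketch). The point is that the secondary reads
mem_cfg_addr only after the primary has signalled that the config is complete,
and does the second mmap() after that.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_MAGIC 0x4d434647u	/* hypothetical "config is complete" marker */

/* Hypothetical stand-in for struct rte_mem_config. */
struct demo_mem_config {
	uint32_t magic;		/* written last by the primary */
	uint64_t mem_cfg_addr;	/* where the primary mapped the config */
};

/* Primary side: map the file, publish the address, then mark it complete. */
static struct demo_mem_config *
demo_config_create(int fd)
{
	struct demo_mem_config *cfg;

	assert(fd >= 0);
	if (ftruncate(fd, sizeof(*cfg)) < 0)
		return NULL;
	cfg = mmap(NULL, sizeof(*cfg), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cfg == MAP_FAILED)
		return NULL;
	cfg->mem_cfg_addr = (uintptr_t)cfg;
	/* release store: the address is visible before the magic is */
	__atomic_store_n(&cfg->magic, DEMO_MAGIC, __ATOMIC_RELEASE);
	return cfg;
}

/* Secondary side: wait for completion, then re-map at the primary's address.
 * Assumes the primary has already sized the backing file. */
static struct demo_mem_config *
demo_config_attach(int fd)
{
	struct demo_mem_config *cfg;
	void *want;

	/* stage 1: read-only mapping at an arbitrary address */
	cfg = mmap(NULL, sizeof(*cfg), PROT_READ, MAP_SHARED, fd, 0);
	if (cfg == MAP_FAILED)
		return NULL;
	/* do not trust mem_cfg_addr until the primary says it is done */
	while (__atomic_load_n(&cfg->magic, __ATOMIC_ACQUIRE) != DEMO_MAGIC)
		sched_yield();
	want = (void *)(uintptr_t)cfg->mem_cfg_addr;
	munmap(cfg, sizeof(*cfg));

	/* stage 2: map again, using the primary's address as a hint */
	cfg = mmap(want, sizeof(*cfg), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cfg == MAP_FAILED)
		return NULL;
	if (cfg != want) {	/* the kernel placed the mapping elsewhere */
		munmap(cfg, sizeof(*cfg));
		return NULL;
	}
	return cfg;
}

/* Single-process round trip; returns 0 if attach lands on the same address. */
int
demo_roundtrip(void)
{
	char path[] = "/tmp/demo_mcfg_XXXXXX";
	struct demo_mem_config *primary, *secondary;
	uint64_t addr;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	primary = demo_config_create(fd);
	if (primary == NULL) {
		close(fd);
		return -1;
	}
	addr = primary->mem_cfg_addr;
	/* drop the primary mapping so the address is free, as it would be
	 * in a separate secondary process */
	munmap(primary, sizeof(*primary));
	secondary = demo_config_attach(fd);
	ok = secondary != NULL && (uintptr_t)secondary == addr &&
		secondary->magic == DEMO_MAGIC;
	if (secondary != NULL)
		munmap(secondary, sizeof(*secondary));
	close(fd);
	return ok ? 0 : -1;
}
```

Without MAP_FIXED the address passed to mmap() is only a hint, so the
"cfg != want" check is still needed, same as in the patch. demo_roundtrip()
runs both sides in one process just to exercise the code path; a real test
would need two processes.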