On 17/05/2016 17:39, David Marchand wrote:
> Hello Jianfeng,
>
> On Thu, May 12, 2016 at 2:44 AM, Jianfeng Tan <jianfeng.tan at intel.com> wrote:
>> This patch adds an option, --huge-trybest, to use a recover mechanism to
>> the case that there are not so many hugepages (declared in sysfs), which
>> can be used. It relies on a mem access to fault-in hugepages, and if fails
>> with SIGBUS, recover to previously saved stack environment with
>> siglongjmp().
>>
>> Besides, this solution fixes an issue when hugetlbfs is specified with an
>> option of size. Currently DPDK does not respect the quota of a hugetlbfs
>> mount. It fails to init the EAL because it tries to map the number of free
>> hugepages in the system rather than using the number specified in the quota
>> for that mount.
>>
>> It's still an open issue with CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS. Under
>> this case (such as IVSHMEM target), having hugetlbfs mounts with quota will
>> fail to remap hugepages as it relies on having mapped all free hugepages
>> in the system.
>
> For such a case, maybe having some warning log message when it
> fails would help the user.
> + a known issue in the release notes ?
>
>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index 5b9132c..8c77010 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -417,12 +434,33 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>>  			hugepg_tbl[i].final_va = virtaddr;
>>  		}
>>
>> +		if (orig && internal_config.huge_trybest) {
>> +			/* In linux, hugetlb limitations, like cgroup, are
>> +			 * enforced at fault time instead of mmap(), even
>> +			 * with the option of MAP_POPULATE. Kernel will send
>> +			 * a SIGBUS signal. To avoid to be killed, save stack
>> +			 * environment here, if SIGBUS happens, we can jump
>> +			 * back here.
>> +			 */
>> +			if (wrap_sigsetjmp()) {
>> +				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
>> +					"hugepages of size %u MB\n",
>> +					(unsigned)(hugepage_sz / 0x100000));
>> +				munmap(virtaddr, hugepage_sz);
>> +				close(fd);
>> +				unlink(hugepg_tbl[i].filepath);
>> +				return i;
>> +			}
>> +			*(int *)virtaddr = 0;
>> +		}
>> +
>> +
>>  		/* set shared flock on the file. */
>>  		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
>> -			RTE_LOG(ERR, EAL, "%s(): Locking file failed:%s \n",
>> +			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
>>  				__func__, strerror(errno));
>>  			close(fd);
>> -			return -1;
>> +			return i;
>>  		}
>>
>>  		close(fd);
>
> Maybe I missed something, but we are writing into some hugepage before
> the flock has been called.
> Are we sure there is nobody else using this hugepage ?
>
> Especially, can't this cause trouble to a primary process running if
> we start the exact same primary process ?
>

We lock the hugepage directory during eal_hugepage_info_init(), and we do
not unlock it until eal_memory_init has finished. I think that takes care
of that case.

Sergio
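
To make the ordering argument concrete, here is a minimal sketch of the
directory-lock pattern. It is not the EAL code: dir_lock() and dir_unlock()
are hypothetical helpers made up for illustration, and the exclusive
flock() on the directory is my assumption about how such a lock would be
taken. The point is only that a second holder cannot get the lock until
the first one releases it, so the hugepage files are not touched
concurrently during init.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK function): take an exclusive flock()
 * on a directory. If nonblock is non-zero, fail instead of waiting when
 * somebody else already holds the lock.
 * Returns the descriptor that holds the lock, or -1 with errno set.
 */
static int
dir_lock(const char *dir, int nonblock)
{
	int fd = open(dir, O_RDONLY);

	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_EX | (nonblock ? LOCK_NB : 0)) == -1) {
		int saved = errno;

		close(fd);
		errno = saved;	/* keep the flock() error for the caller */
		return -1;
	}
	return fd;
}

/* Hypothetical helper: drop the lock taken by dir_lock(). */
static void
dir_unlock(int fd)
{
	flock(fd, LOCK_UN);
	close(fd);
}
```

A process would call dir_lock() before it starts mapping hugepages and
dir_unlock() only once memory init is done. A second instance started in
between either blocks in flock() or, with nonblock set, gets -1 with errno
set to EWOULDBLOCK.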