On Thu, 15 Apr 2021 15:24:01 +0100 "Burakov, Anatoly" <anatoly.bura...@intel.com> wrote:
> On 25-Mar-21 8:21 AM, xiangxia.m....@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m....@gmail.com>
> >
> > The hugepage of different size, 2MB, 1GB may be mounted on
> > the same directory (e.g /dev/hugepages). Then dpdk
> > primary process will be blocked. To address this issue,
> > add the LOCK_NB flags to flock().
> >
> > $ cat /proc/mounts
> > ...
> > none /dev/hugepages hugetlbfs rw,seclabel,relatime,pagesize=1024M 0 0
> > none /dev/hugepages hugetlbfs rw,seclabel,relatime,pagesize=2M 0 0
> >
> > Add more details for err logs.
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com>
> > ---
> >  lib/librte_eal/linux/eal_hugepage_info.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eal/linux/eal_hugepage_info.c
> > b/lib/librte_eal/linux/eal_hugepage_info.c
> > index d97792cadeb6..1ff76e539053 100644
> > --- a/lib/librte_eal/linux/eal_hugepage_info.c
> > +++ b/lib/librte_eal/linux/eal_hugepage_info.c
> > @@ -451,9 +451,12 @@ hugepage_info_init(void)
> >              hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
> >
> >              /* if blocking lock failed */
> > -            if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
> > +            if (flock(hpi->lock_descriptor, LOCK_EX | LOCK_NB) == -1) {
> >                  RTE_LOG(CRIT, EAL,
> > -                    "Failed to lock hugepage directory!\n");
> > +                    "Failed to lock hugepage directory! "
> > +                    "The hugepage dir (%s) was locked by "
> > +                    "other processes or self twice.\n",
> > +                    hpi->hugedir);
> >                  break;
> >              }
> >              /* clear out the hugepages dir from unused pages */
> >
>
> Use cases such as "having two hugetlbfs page sizes on the same hugetlbfs
> mountpoint" are user error, but i agree that deadlocking is probably not
> the way we want to go about it.
>
> An alternative way would be to check if we already have a mountpoint
> with the same path, and this would produce a better error message (as a
> user, "hugepage dir is locked by self twice" doesn't tell me anything
> useful), at a cost of slightly more complicated code.
>
> I'm not sure which way i want to go here. Normally, hugetlbfs shouldn't
> be staying locked for long, so i'm wary of adding a LOCK_NB here, so i
> feel slightly uneasy about this patch. Do you have any opinions?
>
> Also, do other OS's EALs need similar fix?
>

Dropping this patch. It is one of those: "It hurts when I do this stupid thing" patches.
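
For anyone reading this thread later, here is a minimal sketch of the locking behaviour under discussion. It is not DPDK code. The function name second_lock_would_block() is hypothetical, and the sketch assumes a Linux filesystem where flock() works on directory descriptors. It opens the same directory twice, takes LOCK_EX on the first descriptor, then tries LOCK_EX | LOCK_NB on the second. Because the two open() calls create separate open file descriptions, the second request conflicts with the first: with LOCK_NB it fails with EWOULDBLOCK, and without LOCK_NB it would block forever, since the process is waiting on a lock it holds itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a second exclusive lock on `dir` would block while the
 * first one is held by this same process, 0 if the second lock was
 * granted, and -1 on any setup error.
 */
int second_lock_would_block(const char *dir)
{
    assert(dir != NULL);

    /* Two independent open file descriptions for the same directory,
     * as happens when one path is listed twice as a hugetlbfs mount. */
    int fd1 = open(dir, O_RDONLY);
    if (fd1 < 0)
        return -1;

    int fd2 = open(dir, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int result = -1;

    /* The first exclusive lock is expected to succeed immediately. */
    if (flock(fd1, LOCK_EX) == 0) {
        /* Non-blocking attempt on the second descriptor. */
        if (flock(fd2, LOCK_EX | LOCK_NB) == -1)
            result = (errno == EWOULDBLOCK) ? 1 : -1;
        else
            result = 0;
    }

    /* Closing the descriptors releases any locks they hold. */
    close(fd2);
    close(fd1);
    return result;
}
```

Calling second_lock_would_block("/tmp") on a typical Linux system should report that the second lock would block, which is the condition the proposed LOCK_NB change turns into an error instead of a hang.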