On 16.02.2017 16:55, Ilya Maximets wrote:
> Hi,
>
> On 16.02.2017 16:26, Tan, Jianfeng wrote:
>> Hi,
>>
>>> -----Original Message-----
>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
>>> Sent: Thursday, February 16, 2017 9:01 PM
>>> To: dev@dpdk.org; David Marchand; Gonzalez Monroy, Sergio
>>> Cc: Heetae Ahn; Yuanhan Liu; Tan, Jianfeng; Neil Horman; Pei, Yulong; Ilya Maximets; sta...@dpdk.org
>>> Subject: [PATCH] mem: balanced allocation of hugepages
>>>
>>> Currently EAL allocates hugepages one by one, not paying
>>> attention to which NUMA node the allocation comes from.
>>>
>>> Such behaviour leads to allocation failures when the number of
>>> hugepages available to the application is limited by cgroups
>>> or hugetlbfs and memory is requested not only from the first
>>> socket.
>>>
>>> Example:
>>> # 90 x 1GB hugepages available in the system
>>>
>>> cgcreate -g hugetlb:/test
>>> # Limit to 32GB of hugepages
>>> cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
>>> # Request 4GB from each of 2 sockets
>>> cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>>>
>>> EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
>>> EAL: 32 not 90 hugepages of size 1024 MB allocated
>>> EAL: Not enough memory available on socket 1! Requested: 4096MB, available: 0MB
>>> PANIC in rte_eal_init():
>>> Cannot init memory
>>>
>>> This happens because all allocated pages are
>>> on socket 0.
>>
>> For such a use case, why not just use "numactl --interleave=0,1 <DPDK app> xxx"?
>
> Unfortunately, the interleave policy doesn't work for me. I suspect the kernel
> configuration blocks it, or I don't understand something in the kernel internals.
> I'm using the 3.10 rt kernel from RHEL7.
>
> I tried to set up MPOL_INTERLEAVE in code and it doesn't work for me. Your example
> with numactl doesn't work either:
>
> # Limited to 8GB of hugepages
> cgexec -g hugetlb:test testpmd --socket-mem=4096,4096
Sorry, the actual command was:
cgexec -g hugetlb:test numactl --interleave=0,1 ./testpmd --socket-mem=4096,4096 ..

> EAL: Setting up physically contiguous memory...
> EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
> EAL: 8 not 90 hugepages of size 1024 MB allocated
> EAL: Hugepage /dev/hugepages/rtemap_0 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_1 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_2 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_3 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_4 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_5 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_6 is on socket 0
> EAL: Hugepage /dev/hugepages/rtemap_7 is on socket 0
> EAL: Not enough memory available on socket 1! Requested: 4096MB, available: 0MB
> PANIC in rte_eal_init():
> Cannot init memory
>
> Also, using numactl will affect all the allocations in the application. This
> may cause additional unexpected issues.
>
>>
>> Do you see a use case like --socket-mem=2048,1024 where only three 1GB
>> hugepages are allowed?
>
> That case will work with my patch.
> But the opposite one, '--socket-mem=1024,2048', will fail.
> To be clear, we would need to allocate all the required memory first
> from each NUMA node and then allocate all other available pages
> in round-robin fashion. But such a solution looks a little ugly.
>
> What do you think?
>
> Best regards, Ilya Maximets.
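
For reference, "setting up MPOL_INTERLEAVE in code" above means roughly the
following (a minimal sketch only, assuming libnuma's <numaif.h> wrapper and
nodes 0 and 1; not the exact code from the failing run):

#include <numaif.h>   /* set_mempolicy(), MPOL_INTERLEAVE; link with -lnuma */
#include <stdio.h>

/* Switch the calling thread's default memory policy to interleave
 * across NUMA nodes 0 and 1 before the hugepages get mmap'ed and
 * faulted in.  In theory subsequent hugepage faults should then
 * alternate between the two nodes; in the setup described above
 * (3.10 rt kernel) all pages still ended up on socket 0. */
static int enable_interleave(void)
{
    unsigned long nodemask = 0x3;   /* bit 0 = node 0, bit 1 = node 1 */

    if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
                      sizeof(nodemask) * 8) != 0) {
        perror("set_mempolicy(MPOL_INTERLEAVE)");
        return -1;
    }
    return 0;
}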
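
And to make the last point concrete, the "required memory first, then
round-robin" idea would look roughly like this as a standalone sketch. This is
a hypothetical illustration, not the actual patch or EAL code: it maps
anonymous MAP_HUGETLB pages instead of hugetlbfs files, hard-codes two nodes,
and the map_page_on_node() helper is made up.

#define _GNU_SOURCE
#include <numaif.h>     /* set_mempolicy(), MPOL_BIND; link with -lnuma */
#include <sys/mman.h>   /* mmap(), MAP_HUGETLB */
#include <string.h>
#include <stdio.h>

#define NUM_NODES 2
#define PAGE_SZ   (1UL << 30)           /* 1GB hugepages */
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << 26)         /* 30 == log2(1GB), MAP_HUGE_SHIFT == 26 */
#endif

/* Map one hugepage while the task policy is bound to 'node' and touch it
 * so it is faulted in on that node.  Illustration only: EAL maps files on
 * hugetlbfs rather than anonymous MAP_HUGETLB memory. */
static void *map_page_on_node(int node)
{
    unsigned long mask = 1UL << node;
    void *va;

    if (set_mempolicy(MPOL_BIND, &mask, NUM_NODES + 1) != 0)
        return NULL;

    va = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE,
              MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB, -1, 0);
    if (va == MAP_FAILED)
        va = NULL;
    else
        memset(va, 0, PAGE_SZ);

    set_mempolicy(MPOL_DEFAULT, NULL, 0);
    return va;
}

int main(void)
{
    unsigned requested[NUM_NODES] = { 4, 4 };  /* --socket-mem=4096,4096 */
    unsigned allocated[NUM_NODES] = { 0, 0 };
    int node;

    /* Phase 1: satisfy the explicit per-socket request first, so a
     * cgroup/hugetlbfs limit cannot be eaten up by a single socket.
     * (Real code would treat a failure here as fatal.) */
    for (node = 0; node < NUM_NODES; node++)
        while (allocated[node] < requested[node] &&
               map_page_on_node(node) != NULL)
            allocated[node]++;

    /* Phase 2: take whatever is left round-robin, best effort.  This
     * stops at the first node that runs dry; a full implementation
     * would skip exhausted nodes and keep going. */
    for (node = 0; map_page_on_node(node) != NULL;
         node = (node + 1) % NUM_NODES)
        allocated[node]++;

    for (node = 0; node < NUM_NODES; node++)
        printf("node %d: %u x 1GB pages\n", node, allocated[node]);
    return 0;
}

Phase 1 guarantees each socket's --socket-mem is covered before the limit can
be exhausted by the other socket; phase 2 keeps the current "grab everything
available" behaviour, just spread evenly across the nodes.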