[dpdk-dev] DPDK library grabs all the memory during startup

2016-03-10 Thread John Wei
I am setting up Open vSwitch with DPDK in containers, and I am running many
of these OVS/DPDK containers on the same host.
The OVS in each container uses different PCI devices bound to DPDK.
I am using --file-prefix to allow sharing of the same /dev/hugepages
hugetlbfs, and --socket-mem to limit the memory used by each OVS.
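For example, two containers might start their OVS instances along these
lines (the core masks, prefixes, and socket-mem values are illustrative):

  # container 1
  ovs-vswitchd --dpdk -c 0x1 -n 4 --file-prefix ct1- --socket-mem 128,128 ...
  # container 2
  ovs-vswitchd --dpdk -c 0x2 -n 4 --file-prefix ct2- --socket-mem 128,128 ...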
But the DPDK library first grabs all the available hugepage memory and
picks the best pages before releasing the memory it does not need. It seems
that this process is serialized:
each DPDK app has to wait for the previous app to complete that process
before it can start grabbing, picking, and releasing memory.
This takes a long time when you try to start many DPDK apps in parallel.
I tried using a different hugetlbfs mount for each app and limiting each
mount with nr_inodes, but that did not work.
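For reference, the per-app mounts were along these lines (the mount points
and inode limits are illustrative):

  mount -t hugetlbfs -o nr_inodes=64 none /dev/hugepages-ct1
  mount -t hugetlbfs -o nr_inodes=64 none /dev/hugepages-ct2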

Any suggestions on addressing this issue? Is there a way to tell the DPDK
library not to grab so much memory?

John


[dpdk-dev] Fwd: EAL: map_all_hugepages(): mmap failed: Cannot allocate memory

2016-03-17 Thread John Wei
I am setting up OVS inside a Linux container. This OVS is built using the
DPDK library.
During the startup of ovs-vswitchd, it core dumped due to a failed mmap,
in eal_memory.c:

  virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, 0);

This call is made inside a for loop that iterates over all the pages and
maps each of them.
My server has two CPU sockets, and I allocated 8192 2 MB pages.
The mmap for the first 4096 pages was successful; it failed on the next
page.

Can someone help me understand why the mmap for the first 4096 pages
succeeded and then failed?
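For what it's worth, the arithmetic (assuming the reserved pages are split
evenly across the two sockets):

  8192 pages x 2 MB = 16 GB reserved in total
  4096 pages x 2 MB =  8 GB, i.e. exactly one socket's share

so the failure seems to line up with a per-NUMA-node boundary.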


John



ovs-vswitchd --dpdk -c 0x1 -n 4 -l 1 --file-prefix ct- --socket-mem
128,128 -- unix:$DB_SOCK --pidfile --detach --log-file=ct.log


EAL: Detected lcore 23 as core 5 on socket 1
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 24 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up physically contiguous memory...
EAL: map_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to mmap 2 MB hugepages
PANIC in rte_eal_init():
Cannot init memory
7: [ovs-vswitchd() [0x411f15]]
6: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff5f6133b15]]
5: [ovs-vswitchd() [0x4106f9]]
4: [ovs-vswitchd() [0x66917d]]
3: [ovs-vswitchd() [0x42b6f5]]
2: [ovs-vswitchd() [0x40dd8c]]
1: [ovs-vswitchd() [0x56b3ba]]
Aborted (core dumped)


[dpdk-dev] Fwd: EAL: map_all_hugepages(): mmap failed: Cannot allocate memory

2016-03-18 Thread John Wei
Thanks for the reply. Upon further debugging, I was able to root-cause the
issue. In the cgroup, in addition to limiting the CPUs, I had also limited
the NUMA nodes from which my OVS can allocate memory (cpuset.mems). I
understand that DPDK first grabs all the memory, then picks the best pages,
then releases the rest. But this takes a long time in my case, where I
start many OVS instances on the same host:
each DPDK app has to wait for the previous app to release the memory
before the next app can proceed.
In addition, since I have specified (through cgroup cpuset.mems) not to
take memory from the other node, could the DPDK library perhaps skip
grabbing memory from these excluded nodes?
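For illustration, the cgroup restriction was along these lines (the cgroup
name and the CPU/node numbers are made up):

  # pin this container's OVS to CPUs 0-5 and to node 0 memory only
  echo 0-5 > /sys/fs/cgroup/cpuset/ovs-ct1/cpuset.cpus
  echo 0   > /sys/fs/cgroup/cpuset/ovs-ct1/cpuset.mems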

Just some thoughts.

John


On Thu, Mar 17, 2016 at 7:51 PM, Tan, Jianfeng wrote:

>
>
> On 3/18/2016 6:41 AM, John Wei wrote:
>
> I am setting up OVS inside a Linux container. This OVS is built using the
> DPDK library.
> During the startup of ovs-vswitchd, it core dumped due to a failed mmap,
> in eal_memory.c:
>
>   virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
>                   MAP_SHARED, fd, 0);
>
> This call is made inside a for loop that iterates over all the pages and
> maps each of them.
> My server has two CPU sockets, and I allocated 8192 2 MB pages.
> The mmap for the first 4096 pages was successful; it failed on the next
> page.
>
> Can someone help me understand why the mmap for the first 4096 pages
> succeeded and then failed?
>
>
> In my limited experience, there are a few scenarios that may lead to such
> a failure: a. a size option was specified when mounting hugetlbfs; b. a
> cgroup limitation,
> /sys/fs/cgroup/hugetlb/<cgroup name>/hugetlb.2MB.limit_in_bytes;
> c. the open-file limit set by ulimit...
>
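> For example, each of these can be checked with something like the
> following (the cgroup name is a placeholder):
>
>   mount | grep hugetlbfs    # look for a size= mount option
>   cat /sys/fs/cgroup/hugetlb/<cgroup name>/hugetlb.2MB.limit_in_bytes
>   ulimit -n                 # open-file limit
>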
> Workaround: as only "--socket-mem 128,128" is needed, you can reduce the
> total number of 2 MB hugepages from 8192 to 512 (or some other value).
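> For instance, via the kernel's hugepage sysfs interface:
>
>   echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>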
> In addition, this is one of the cases that motivated a patchset I sent:
> http://dpdk.org/dev/patchwork/patch/11194/
>
> Thanks,
> Jianfeng