Hello, in this example I've got a 4 vCPU Azure VM with 16 GB of RAM; 2 GB of that is given over to 1024 2 MB huge pages:
$ cat /proc/meminfo | grep -i huge
AnonHugePages:     71680 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    1024
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2097152 kB
$

There are two VPP-owned interfaces, both using the netvsc PMD:

$ sudo vppctl sh hard
              Name                Idx   Link  Hardware
GigabitEthernet1                   1     up   GigabitEthernet1
  Link speed: 50 Gbps
  RX Queues:
    queue thread         mode
    0     vpp_wk_0 (1)   polling
    1     vpp_wk_1 (2)   polling
  Ethernet address 60:45:bd:85:22:97
  Microsoft Hyper-V Netvsc
    carrier up full duplex max-frame-size 0
    flags: tx-offload rx-ip4-cksum
    Devargs:
    rx: queues 2 (max 64), desc 1024 (min 0 max 65535 align 1)
    tx: queues 2 (max 64), desc 1024 (min 1 max 4096 align 1)
    max rx packet len: 65536
    promiscuous: unicast off all-multicast off
    vlan offload: strip off filter off qinq off
    rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum rss-hash
    rx offload active: ipv4-cksum
    tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso multi-segs
    tx offload active: ipv4-cksum udp-cksum tcp-cksum multi-segs
    rss avail:         ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6
    rss active:        ipv4-tcp ipv4 ipv6-tcp ipv6
    tx burst function: (not available)
    rx burst function: (not available)
GigabitEthernet2                   2     up   GigabitEthernet2
  Link speed: 50 Gbps
  RX Queues:
    queue thread         mode
    0     vpp_wk_2 (3)   polling
    1     vpp_wk_0 (1)   polling
  Ethernet address 60:45:bd:85:23:94
  Microsoft Hyper-V Netvsc
    carrier up full duplex max-frame-size 0
    flags: tx-offload rx-ip4-cksum
    Devargs:
    rx: queues 2 (max 64), desc 1024 (min 0 max 65535 align 1)
    tx: queues 2 (max 64), desc 1024 (min 1 max 4096 align 1)
    max rx packet len: 65536
    promiscuous: unicast off all-multicast off
    vlan offload: strip off filter off qinq off
    rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum rss-hash
    rx offload active: ipv4-cksum
    tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso multi-segs
    tx offload active: ipv4-cksum udp-cksum tcp-cksum multi-segs
    rss avail:         ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6
    rss active:        ipv4-tcp ipv4 ipv6-tcp ipv6
    tx burst function: (not available)
    rx burst function: (not available)
local0                             0    down  local0
  Link speed: unknown
  local
$

The config file looks like this:

unix {
  nodaemon
  log /var/log/vpp/vpp.log
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid vpp
}
api-trace { on }
api-segment { gid vpp }
socksvr { socket-name /run/vpp/api.sock }
plugins {
  # Common plugins.
  plugin default { disable }
  plugin dpdk_plugin.so { enable }
  plugin linux_cp_plugin.so { enable }
  plugin crypto_native_plugin.so { enable }
  < -- snip lots of plugins -- >
}
dpdk {
  # VMBUS UUID.
  dev 6045bd85-2297-6045-bd85-22976045bd85 {
    num-rx-queues 4
    num-tx-queues 4
    name GigabitEthernet1
  }
  # VMBUS UUID.
  dev 6045bd85-2394-6045-bd85-23946045bd85 {
    num-rx-queues 4
    num-tx-queues 4
    name GigabitEthernet2
  }
}
cpu {
  skip-cores 0
  main-core 0
  corelist-workers 1-3
}
buffers {
  # Max buffers based on data size & huge page configuration.
  buffers-per-numa 853440
  default data-size 2048
  page-size default-hugepage
}
statseg {
  size 128M
}

My issue is that I start to see errors from the mlx5 driver when using a large number of buffers:

2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find virtually contiguous chunk for address (0x1000000000). rte_memseg_contig_walk() failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find virtually contiguous chunk for address (0x103fe00000). rte_memseg_contig_walk() failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find virtually contiguous chunk for address (0x1040000000). rte_memseg_contig_walk() failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find virtually contiguous chunk for address (0x1040200000). rte_memseg_contig_walk() failed.

The spew continues. With a smaller number of buffers I don't see this problem, and there are no issues with the packet-forwarding side of things. I'm not sure what the buffer limit is before things go bad.
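For reference, the buffers-per-numa value I configured was derived with arithmetic along these lines. This is only a sketch: the per-buffer overhead constant below (metadata plus pre-data) is an assumption for illustration, not the exact figure, which the buffer occupancy thread I mention later works through properly.

```python
# Hedged sketch: estimate how many VPP buffers fit in the hugepage pool.
# ASSUMPTION: per-buffer overhead of 384 bytes (metadata + pre-data) is
# illustrative only; the real value depends on the VPP build.

HUGEPAGE_SIZE = 2 * 1024 * 1024   # 2 MB huge pages, per /proc/meminfo above
NUM_HUGEPAGES = 1024              # HugePages_Total above
DATA_SIZE = 2048                  # buffers { default data-size 2048 }
ASSUMED_OVERHEAD = 384            # hypothetical metadata + pre-data bytes

alloc_size = DATA_SIZE + ASSUMED_OVERHEAD
# A buffer does not straddle a hugepage boundary, so round down per page.
buffers_per_page = HUGEPAGE_SIZE // alloc_size
max_buffers = buffers_per_page * NUM_HUGEPAGES
print(max_buffers)  # → 882688 with these assumed numbers
```

With the real overhead the result lands in the same ballpark as the 853440 I configured; the point is just that the count is derived from hugepage memory divided by the per-buffer allocation size.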
I read the excellent description of how buffer occupancy is calculated here:

https://lists.fd.io/g/vpp-dev/topic/buffer_occupancy_calculation/76605334?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,20,76605334

As a result I thought it would be a good idea to allocate buffers based on buffer size and the availability of memory in huge pages. However, when the number of buffers is large "enough", the common_mlx5 errors start to spew. I don't see this issue on other platforms, where I am able to max out buffers based on the huge page allocation.

I was pointed towards https://doc.dpdk.org/guides/platform/mlx5.html#mlx5-common-driver-options and mr_ext_memseg_en, which would suppress this notice. However, I can only pass DPDK EAL options to the netvsc PMD and not to mlx5, so this does not seem to be an option.

Ideally what I would like to do is max out the number of buffers based on available hugepage memory. Alternatively, if there is a cap on mappable buffers allowed for this specific device (mlx5), then on those setups I could cap to that number instead of using max buffers based on huge page availability.

Thanks,
Peter.
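P.S. To be concrete about the fallback I'm describing, the selection logic would be no more than this (the device cap is hypothetical; I don't know whether mlx5 actually publishes one):

```python
# Hedged sketch of the capping strategy described above: use the
# hugepage-derived maximum unless a device-specific limit (e.g. a
# hypothetical mlx5 mappable-buffer cap) is known to be lower.
from typing import Optional


def choose_buffers_per_numa(hugepage_max: int, device_cap: Optional[int]) -> int:
    """Return the buffer count to configure, honoring an optional device cap."""
    if device_cap is None:
        return hugepage_max
    return min(hugepage_max, device_cap)


print(choose_buffers_per_numa(853440, None))    # no known cap → 853440
print(choose_buffers_per_numa(853440, 500000))  # capped by device → 500000
```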
View/Reply Online (#21593): https://lists.fd.io/g/vpp-dev/message/21593