Hi Florin,

After fixing the UDP checksum offload issue and using a 64K tx buffer, I am able to send 35 Gbps (half duplex). The DPDK plugin code (./plugins/dpdk/device/init.c) was not setting the DEV_TX_OFFLOAD_TCP_CKSUM and DEV_TX_OFFLOAD_UDP_CKSUM offload bits for the MLNX5 PMD. In the UDP tx application I am using vppcom_session_write to write to the session, and the write length is the same as the buffer size (64K).
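To illustrate the kind of offload fix described above (this is only a sketch, not the actual change in init.c): the tx offloads a driver enables are usually the intersection of what the stack wants and what the device advertises. The TX_OFF_* constants and both helper functions below are hypothetical stand-ins, not DPDK or VPP symbols, and the bit values are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for DPDK's DEV_TX_OFFLOAD_* bits; the values
 * are illustrative only and are not taken from rte_ethdev.h. */
#define TX_OFF_UDP_CKSUM  (1ULL << 0)
#define TX_OFF_TCP_CKSUM  (1ULL << 1)
#define TX_OFF_MULTI_SEGS (1ULL << 2)
#define TX_OFF_KNOWN      (TX_OFF_UDP_CKSUM | TX_OFF_TCP_CKSUM | TX_OFF_MULTI_SEGS)

/* Return the offloads to enable: the requested ones that the device
 * also advertises as available. */
static uint64_t
select_tx_offloads (uint64_t dev_capa, uint64_t wanted)
{
  /* Only bits defined in this sketch may be requested. */
  assert ((wanted & ~TX_OFF_KNOWN) == 0);
  return dev_capa & wanted;
}

/* Return 1 if both L4 checksum offloads are active, 0 otherwise. */
static int
l4_cksum_offload_active (uint64_t active)
{
  const uint64_t l4 = TX_OFF_UDP_CKSUM | TX_OFF_TCP_CKSUM;
  return (active & l4) == l4;
}
```

With only the multi-segs bit active, as in the "tx offload active: multi-segs" hardware output quoted further down, l4_cksum_offload_active() returns 0 in this sketch.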
Btw, I ran all the tests with the patch you provided (https://gerrit.fd.io/r/c/vpp/+/24462). With a single UDP tx connection the throughput is 35 Gbps. But on starting other UDP rx connections (20 Gbps), the tx throughput goes down to 12 Gbps. Even with 2 UDP tx connections I am not able to scale up the throughput; the overall throughput stays the same. I tried this test first with 4 worker threads and then with 1 worker thread.

I have the following 2 points:

1) With my UDP tx test application, I am getting this throughput after using the 64K tx buffer. But in the actual product I have to send variable-size UDP packets (max length 9000 bytes). That means the maximum tx buffer size would be 9K, and with that buffer size I am getting 15 Gbps, which is fine if I can somehow scale it up by running multiple applications. But that does not seem to work with UDP (I am not using udpc).

2) My target is to achieve at least 30 Gbps rx and 30 Gbps tx UDP throughput on one NUMA node. I tried running multiple VPP instances on VFs (SR-IOV), and I can scale up the throughput (rx and tx) with the number of VPP instances. Here are the throughput results with VFs:

1 VPP instance (15 Gbps rx and 15 Gbps tx)
2 VPP instances (30 Gbps rx and 30 Gbps tx)
3 VPP instances (45 Gbps rx and 35 Gbps tx)

I have 2 NUMA nodes on the server, so I am expecting to get 60 Gbps rx and 60 Gbps tx total throughput.

Btw, I also tested TCP without VFs. It seems to scale up properly, as the connections land on different threads.
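To put point 1 in numbers: for a fixed target rate, the number of session writes (datagrams) per second grows as the write size shrinks. Below is a minimal sketch of that arithmetic, counting payload bits only and ignoring Ethernet/IP/UDP header overhead. writes_per_second() is a hypothetical helper for illustration, not part of VCL.

```c
#include <assert.h>
#include <stdint.h>

/* Writes (datagrams) per second needed to sustain rate_bps when every
 * write carries payload_bytes of data. Header overhead is ignored and
 * the result is rounded down. */
static uint64_t
writes_per_second (uint64_t rate_bps, uint32_t payload_bytes)
{
  assert (payload_bytes > 0);
  return rate_bps / (8ULL * payload_bytes);
}
```

For a 30 Gbps target this gives roughly 57 thousand writes per second with 64K writes, about 417 thousand with 9000-byte writes, and about 2.6 million with 1460-byte writes.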
vpp# sh thread
ID     Name        Type     LWP    Sched Policy (Priority)  lcore  Core  Socket State
0      vpp_main             22181  other (0)                1      0     0
1      vpp_wk_0    workers  22183  other (0)                2      2     0
2      vpp_wk_1    workers  22184  other (0)                3      3     0
3      vpp_wk_2    workers  22185  other (0)                4      4     0
4      vpp_wk_3    workers  22186  other (0)                5      8     0

*4 worker threads*
*Iperf3 TCP tests - 8000 bytes packets*

1 connection: Rx only, 18 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
Thread 0: active sessions 1

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 1: active sessions 1
Thread 2: no sessions
Thread 3: no sessions

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 4: active sessions 1

2 connections: Rx only, 32 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
[0:1][T] fd0d:edc4:ffff:2001::203:6679->:::0      LISTEN         0         0
Thread 0: active sessions 2

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 1: active sessions 1
Thread 2: no sessions
Thread 3: no sessions

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[4:1][T] fd0d:edc4:ffff:2001::203:6679->fd0d:edc4:ESTABLISHED    0         0
[4:2][T] fd0d:edc4:ffff:2001::203:6679->fd0d:edc4:ESTABLISHED    0         0
Thread 4: active sessions 3

3 connections: Rx only, 43 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
[0:1][T] fd0d:edc4:ffff:2001::203:6679->:::0      LISTEN         0         0
[0:2][T] fd0d:edc4:ffff:2001::203:6689->:::0      LISTEN         0         0
Thread 0: active sessions 3

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 1: active sessions 1
Thread 2: no sessions

Connection                                        State          Rx-f      Tx-f
[3:0][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 3: active sessions 1

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[4:1][T] fd0d:edc4:ffff:2001::203:6679->fd0d:edc4:ESTABLISHED    0         0
[4:2][T] fd0d:edc4:ffff:2001::203:6679->fd0d:edc4:ESTABLISHED    0         0
[4:3][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 4: active sessions 4

2 connections (1 Rx, 1 Tx): Rx 18 Gbps, Tx 12 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
Thread 0: active sessions 1

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 1: active sessions 1
Thread 2: no sessions

Connection                                        State          Rx-f      Tx-f
[3:0][T] fd0d:edc4:ffff:2001::203:10376->fd0d:edc4ESTABLISHED    0         0
[3:1][T] fd0d:edc4:ffff:2001::203:12871->fd0d:edc4ESTABLISHED    0         3999999
Thread 3: active sessions 2

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 4: active sessions 1

4 connections (2 Rx, 2 Tx): Rx 27 Gbps, Tx 24 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
[0:1][T] fd0d:edc4:ffff:2001::203:6689->:::0      LISTEN         0         0
Thread 0: active sessions 2

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
Thread 1: active sessions 1

Connection                                        State          Rx-f      Tx-f
[2:0][T] fd0d:edc4:ffff:2001::203:51962->fd0d:edc4ESTABLISHED    0         0
[2:1][T] fd0d:edc4:ffff:2001::203:56849->fd0d:edc4ESTABLISHED    0         3999999
[2:2][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 2: active sessions 3

Connection                                        State          Rx-f      Tx-f
[3:1][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 3: active sessions 1

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[4:1][T] fd0d:edc4:ffff:2001::203:57550->fd0d:edc4ESTABLISHED    0         3999999
[4:2][T] fd0d:edc4:ffff:2001::203:56939->fd0d:edc4ESTABLISHED    0         0
Thread 4: active sessions 3

5 connections (2 Rx, 3 Tx): Rx 27 Gbps, Tx 28 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
[0:1][T] fd0d:edc4:ffff:2001::203:6689->:::0      LISTEN         0         0
[0:2][T] fd0d:edc4:ffff:2001::203:7729->:::0      LISTEN         0         0
Thread 0: active sessions 3

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[1:2][T] fd0d:edc4:ffff:2001::203:39216->fd0d:edc4ESTABLISHED    0         3999999
Thread 1: active sessions 2

Connection                                        State          Rx-f      Tx-f
[2:0][T] fd0d:edc4:ffff:2001::203:51962->fd0d:edc4ESTABLISHED    0         0
[2:1][T] fd0d:edc4:ffff:2001::203:56849->fd0d:edc4ESTABLISHED    0         3999999
[2:2][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 2: active sessions 3

Connection                                        State          Rx-f      Tx-f
[3:0][T] fd0d:edc4:ffff:2001::203:29141->fd0d:edc4ESTABLISHED    0         0
[3:1][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 3: active sessions 2

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[4:1][T] fd0d:edc4:ffff:2001::203:57550->fd0d:edc4ESTABLISHED    0         3999999
[4:2][T] fd0d:edc4:ffff:2001::203:56939->fd0d:edc4ESTABLISHED    0         0
Thread 4: active sessions 3

6 connections (3 Rx, 3 Tx): Rx 41 Gbps, Tx 13 Gbps

vpp# sh session verbose 1
Connection                                        State          Rx-f      Tx-f
[0:0][T] fd0d:edc4:ffff:2001::203:6669->:::0      LISTEN         0         0
[0:1][T] fd0d:edc4:ffff:2001::203:6689->:::0      LISTEN         0         0
[0:2][T] fd0d:edc4:ffff:2001::203:7729->:::0      LISTEN         0         0
Thread 0: active sessions 3

Connection                                        State          Rx-f      Tx-f
[1:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[1:1][T] fd0d:edc4:ffff:2001::203:7729->fd0d:edc4:ESTABLISHED    0         0
[1:2][T] fd0d:edc4:ffff:2001::203:39216->fd0d:edc4ESTABLISHED    0         3999999
Thread 1: active sessions 3

Connection                                        State          Rx-f      Tx-f
[2:0][T] fd0d:edc4:ffff:2001::203:51962->fd0d:edc4ESTABLISHED    0         0
[2:1][T] fd0d:edc4:ffff:2001::203:56849->fd0d:edc4ESTABLISHED    0         3999999
[2:2][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
[2:3][T] fd0d:edc4:ffff:2001::203:7729->fd0d:edc4:ESTABLISHED    0         0
Thread 2: active sessions 4

Connection                                        State          Rx-f      Tx-f
[3:0][T] fd0d:edc4:ffff:2001::203:29141->fd0d:edc4ESTABLISHED    0         0
[3:1][T] fd0d:edc4:ffff:2001::203:6689->fd0d:edc4:ESTABLISHED    0         0
Thread 3: active sessions 2

Connection                                        State          Rx-f      Tx-f
[4:0][T] fd0d:edc4:ffff:2001::203:6669->fd0d:edc4:ESTABLISHED    0         0
[4:1][T] fd0d:edc4:ffff:2001::203:57550->fd0d:edc4ESTABLISHED    0         3999999
[4:2][T] fd0d:edc4:ffff:2001::203:56939->fd0d:edc4ESTABLISHED    0         0
Thread 4: active sessions 3

thanks,
-Raj

On Tue, Jan 21, 2020 at 9:43 PM Florin Coras <fcoras.li...@gmail.com> wrote:

> Hi Raj,
>
> Inline.
>
> On Jan 21, 2020, at 3:41 PM, Raj Kumar <raj.gauta...@gmail.com> wrote:
>
> Hi Florin,
> There is no drop on the interfaces. It is 100G card.
> In UDP tx application, I am using 1460 bytes of buffer to send on
> select(). I am getting 5 Gbps throughput ,but if I start one more
> application then total throughput goes down to 4 Gbps as both the sessions
> are on the same thread.
> I increased the tx buffer to 8192 bytes and then I can get 11 Gbps
> throughput but again if I start one more application the throughput goes
> down to 10 Gbps.
>
>
> FC: I assume you’re using vppcom_session_write to write to the session.
> How large is “len” typically? See lower on why that matters.
>
>
>
> I found one issue in the code ( You must be aware of that) , the UDP send
> MSS is hard-coded to 1460 ( /vpp/src/vnet/udp/udp.c file). So, the large
> packets are getting fragmented.
>
> udp_send_mss (transport_connection_t * t)
> {
>   /* TODO figure out MTU of output interface */
>   return 1460;
> }
>
>
> FC: That’s a typical mss and actually what tcp uses as well. Given the
> nics, they should be fine sending a decent number of mpps without the need
> to do jumbo ip datagrams.
>
> if I change the MSS to 8192 then I am getting 17 Mbps throughput. But , if
> i start one more application then throughput is going down to 13 Mbps.
>
>
> It looks like the 17 Mbps is per core limit and since all the sessions are
> pined to the same thread we can not get more throughput. Here, per core
> throughput look good to me. Please let me know there is any way to use
> multiple threads for UDP tx applications.
>
>
> In your previous email you mentioned that we can use connected udp socket
> in the UDP receiver. Can we do something similar for UDP tx ?
>
>
> FC: I think it may work fine if vpp has main + 1 worker. I have a draft
> patch here [1] that seems to work with multiple workers but it’s not
> heavily tested.
>
> Out of curiosity, I ran a vcl_test_client/server test with 1 worker and
> with XL710s, I’m seeing this:
>
> CLIENT RESULTS: Streamed 65536017791 bytes
> in 14.392678 seconds (36.427420 Gbps half-duplex)!
>
> Should be noted that because of how datagrams are handled in the session
> layer, throughput is sensitive to write sizes. I ran the client like:
> ~/vcl_client -p udpc 6.0.1.2 1234 -U -N 1000000 -T 65536
>
> Or in english, unidirectional test, tx buffer of 64kB and 1M writes of
> that buffer. My vcl config was such that tx fifos were 4MB and rx fifos
> 2MB. The sender had few tx packet drops (1657) and the receiver few rx
> packet drops (801). If you plan to use it, make sure arp entries are first
> resolved (e.g., use ping) otherwise the first packet is lost.
>
> Throughput drops to ~15Gbps with 8kB writes. You should probably also test
> with bigger writes with udp.
>
> [1] https://gerrit.fd.io/r/c/vpp/+/24462
>
>
> From the hardware stats , it seems that UDP tx checksum offload is not
> enabled/active which could impact the performance. I think, udp tx
> checksum should be enabled by default if it is not disabled using
> parameter "no-tx-checksum-offload".
>
>
> FC: Performance might be affected by the limited number of offloads
> available. Here’s what I see on my XL710s:
>
> rx offload active: ipv4-cksum jumbo-frame scatter
> tx offload active: udp-cksum tcp-cksum multi-segs
>
>
> Ethernet address b8:83:03:79:af:8c
> Mellanox ConnectX-4 Family
> carrier up full duplex mtu 9206
> flags: admin-up pmd maybe-multiseg subif rx-ip4-cksum
> rx: queues 5 (max 65535), desc 1024 (min 0 max 65535 align 1)
>
>
> FC: Are you running with 5 vpp workers?
>
> Regards,
> Florin
>
> tx: queues 6 (max 65535), desc 1024 (min 0 max 65535 align 1)
> pci: device 15b3:1017 subsystem 1590:0246 address 0000:12:00.00 numa 0
> max rx packet len: 65536
> promiscuous: unicast off all-multicast on
> vlan offload: strip off filter off qinq off
> rx offload avail: vlan-strip ipv4-cksum udp-cksum tcp-cksum
>                   vlan-filter
>                   jumbo-frame scatter timestamp keep-crc
> rx offload active: ipv4-cksum jumbo-frame scatter
> tx offload avail: vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
>                   outer-ipv4-cksum vxlan-tnl-tso gre-tnl-tso
>                   multi-segs
>                   udp-tnl-tso ip-tnl-tso
> tx offload active: multi-segs
> rss avail: ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
>            ipv6-tcp-ex
>            ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>            ipv6-ex ipv6
> rss active: ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv4
>             ipv6-tcp-ex
>             ipv6-udp-ex ipv6-frag ipv6-tcp ipv6-udp ipv6-other
>             ipv6-ex ipv6
> tx burst function: (nil)
> rx burst function: mlx5_rx_burst
>
> thanks,
> -Raj
>
> On Mon, Jan 20, 2020 at 7:55 PM Florin Coras <fcoras.li...@gmail.com>
> wrote:
>
>> Hi Raj,
>>
>> Good to see progress. Check with “show int” the tx counters on the sender
>> and rx counters on the receiver as the interfaces might be dropping
>> traffic. One sender should be able to do more than 5Gbps.
>>
>> How big are the writes to the tx fifo? Make sure the tx buffer is some
>> tens of kB.
>>
>> As for the issue with the number of workers, you’ll have to switch to
>> udpc (connected udp), to ensure you have a separate connection for each
>> ‘flow’, and to use accept in combination with epoll to accept the sessions
>> udpc creates.
>>
>> Note that udpc currently does not work correctly with vcl and multiple
>> vpp workers if vcl is the sender (not the receiver) and traffic is
>> bidirectional. The sessions are all created on the first thread and once
>> return traffic is received, they’re migrated to the thread selected by RSS
>> hashing. VCL is not notified when that happens and it runs out of sync. You
>> might not be affected by this, as you’re not receiving any return traffic,
>> but because of that all sessions may end up stuck on the first thread.
>>
>> For udp transport, the listener is connection-less and bound to the main
>> thread. As a result, all incoming packets, even if they pertain to multiple
>> flows, are written to the listener’s buffer/fifo.
>>
>> Regards,
>> Florin
>>
>> On Jan 20, 2020, at 3:50 PM, Raj Kumar <raj.gauta...@gmail.com> wrote:
>>
>> Hi Florin,
>> I changed my application as you suggested. Now, I am able to achieve 5
>> Gbps with a single UDP stream. Overall, I can get ~20Gbps with multiple
>> host application . Also, the TCP throughput is improved to ~28Gbps after
>> tuning as mentioned in [1].
>> On the similar topic; the UDP tx throughput is throttled to 5Gbps. Even
>> if I run the multiple host applications the overall throughput is 5Gbps. I
>> also tried by configuring multiple worker threads . But the problem is that
>> all the application sessions are assigned to the same worker thread. Is
>> there any way to assign each session to a different worker thread?
>> >> vpp# sh session verbose 2 >> Thread 0: no sessions >> [#1][U] fd0d:edc4:ffff:2001::203:58926->fd0d:edc4: >> Rx fifo: cursize 0 nitems 3999999 has_event 0 >> head 0 tail 0 segment manager 1 >> vpp session 0 thread 1 app session 0 thread 0 >> ooo pool 0 active elts newest 0 >> Tx fifo: cursize 3999999 nitems 3999999 has_event 1 >> head 1460553 tail 1460552 segment manager 1 >> vpp session 0 thread 1 app session 0 thread 0 >> ooo pool 0 active elts newest 4294967295 >> session: state: opened opaque: 0x0 flags: >> [#1][U] fd0d:edc4:ffff:2001::203:63413->fd0d:edc4: >> Rx fifo: cursize 0 nitems 3999999 has_event 0 >> head 0 tail 0 segment manager 2 >> vpp session 1 thread 1 app session 0 thread 0 >> ooo pool 0 active elts newest 0 >> Tx fifo: cursize 3999999 nitems 3999999 has_event 1 >> head 3965434 tail 3965433 segment manager 2 >> vpp session 1 thread 1 app session 0 thread 0 >> ooo pool 0 active elts newest 4294967295 >> session: state: opened opaque: 0x0 flags: >> Thread 1: active sessions 2 >> Thread 2: no sessions >> Thread 3: no sessions >> Thread 4: no sessions >> Thread 5: no sessions >> Thread 6: no sessions >> Thread 7: no sessions >> vpp# sh app client >> Connection App >> [#1][U] fd0d:edc4:ffff:2001::203:58926->udp6_tx_8092[shm] >> [#1][U] fd0d:edc4:ffff:2001::203:63413->udp6_tx_8093[shm] >> vpp# >> >> >> >> thanks, >> -Raj >> >> On Sun, Jan 19, 2020 at 8:50 PM Florin Coras <fcoras.li...@gmail.com> >> wrote: >> >>> Hi Raj, >>> >>> The function used for receiving datagrams is limited to reading at most >>> the length of a datagram from the rx fifo. UDP datagrams are mtu sized, so >>> your reads are probably limited to ~1.5kB. On each epoll rx event try >>> reading from the session handle in a while loop until you get an >>> VPPCOM_EWOULDBLOCK. That might improve performance. >>> >>> Having said that, udp is lossy so unless you implement your own >>> congestion/flow control algorithms, the data you’ll receive might be full >>> of “holes”. 
What are the rx/tx error counters on your interfaces (check >>> with “sh int”). >>> >>> Also, with simple tuning like this [1], you should be able to achieve >>> much more than 15Gbps with tcp. >>> >>> Regards, >>> Florin >>> >>> [1] https://wiki.fd.io/view/VPP/HostStack/LDP/iperf >>> >>> On Jan 19, 2020, at 3:25 PM, Raj Kumar <raj.gauta...@gmail.com> wrote: >>> >>> Hi Florin, >>> By using VCL library in an UDP receiver application, I am able to >>> receive only 2 Mbps traffic. On increasing the traffic, I see Rx FIFO full >>> error and application stopped receiving the traffic from the session >>> layer. Whereas, with TCP I can easily achieve 15Gbps throughput without >>> tuning any DPDK parameter. UDP tx also looks fine. From an host >>> application I can send ~5Gbps without any issue. >>> >>> I am running VPP( stable/2001 code) on RHEL8 server using Mellanox 100G >>> (MLNX5) adapters. >>> Please advise if I can use VCL library to receive high throughput UDP >>> traffic ( in Gbps). I would be running multiple instances of host >>> application to receive data ( ~50-60 Gbps). >>> >>> I also tried by increasing the Rx FIFO size to 16MB but did not help >>> much. The host application is just throwing the received packets , it is >>> not doing any packet processing. >>> >>> [root@orc01 vcl_test]# VCL_DEBUG=2 ./udp6_server_vcl >>> VCL<20201>: configured VCL debug level (2) from VCL_DEBUG! >>> VCL<20201>: allocated VCL heap = 0x7f39a17ab010, size 268435456 >>> (0x10000000) >>> VCL<20201>: configured rx_fifo_size 4000000 (0x3d0900) >>> VCL<20201>: configured tx_fifo_size 4000000 (0x3d0900) >>> VCL<20201>: configured app_scope_local (1) >>> VCL<20201>: configured app_scope_global (1) >>> VCL<20201>: configured api-socket-name (/tmp/vpp-api.sock) >>> VCL<20201>: completed parsing vppcom config! >>> vppcom_connect_to_vpp:480: vcl<20201:0>: app (udp6_server) is connected >>> to VPP! 
>>> vppcom_app_create:1104: vcl<20201:0>: sending session enable >>> vppcom_app_create:1112: vcl<20201:0>: sending app attach >>> vppcom_app_create:1121: vcl<20201:0>: app_name 'udp6_server', >>> my_client_index 256 (0x100) >>> vppcom_epoll_create:2439: vcl<20201:0>: Created vep_idx 0 >>> vppcom_session_create:1179: vcl<20201:0>: created session 1 >>> vppcom_session_bind:1317: vcl<20201:0>: session 1 handle 1: binding to >>> local IPv6 address fd0d:edc4:ffff:2001::203 port 8092, proto UDP >>> vppcom_session_listen:1349: vcl<20201:0>: session 1: sending vpp listen >>> request... >>> vcl_session_bound_handler:604: vcl<20201:0>: session 1 [0x1]: listen >>> succeeded! >>> vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 1, >>> events 0x1, data 0x1! >>> vppcom_session_create:1179: vcl<20201:0>: created session 2 >>> vppcom_session_bind:1317: vcl<20201:0>: session 2 handle 2: binding to >>> local IPv6 address fd0d:edc4:ffff:2001::203 port 8093, proto UDP >>> vppcom_session_listen:1349: vcl<20201:0>: session 2: sending vpp listen >>> request... >>> vcl_session_app_add_segment_handler:765: vcl<20201:0>: mapped new >>> segment '20190-2' size 134217728 >>> vcl_session_bound_handler:604: vcl<20201:0>: session 2 [0x2]: listen >>> succeeded! >>> vppcom_epoll_ctl:2541: vcl<20201:0>: EPOLL_CTL_ADD: vep_sh 0, sh 2, >>> events 0x1, data 0x2! 
>>> >>> >>> vpp# sh session verbose 2 >>> [#0][U] fd0d:edc4:ffff:2001::203:8092->:::0 >>> >>> Rx fifo: cursize 3999125 nitems 3999999 has_event 1 >>> head 2554045 tail 2553170 segment manager 1 >>> vpp session 0 thread 0 app session 1 thread 0 >>> ooo pool 0 active elts newest 4294967295 >>> Tx fifo: cursize 0 nitems 3999999 has_event 0 >>> head 0 tail 0 segment manager 1 >>> vpp session 0 thread 0 app session 1 thread 0 >>> ooo pool 0 active elts newest 0 >>> [#0][U] fd0d:edc4:ffff:2001::203:8093->:::0 >>> >>> Rx fifo: cursize 0 nitems 3999999 has_event 0 >>> head 0 tail 0 segment manager 2 >>> vpp session 1 thread 0 app session 2 thread 0 >>> ooo pool 0 active elts newest 0 >>> Tx fifo: cursize 0 nitems 3999999 has_event 0 >>> head 0 tail 0 segment manager 2 >>> vpp session 1 thread 0 app session 2 thread 0 >>> ooo pool 0 active elts newest 0 >>> Thread 0: active sessions 2 >>> >>> [root@orc01 vcl_test]# cat /etc/vpp/vcl.conf >>> vcl { >>> rx-fifo-size 4000000 >>> tx-fifo-size 4000000 >>> app-scope-local >>> app-scope-global >>> api-socket-name /tmp/vpp-api.sock >>> } >>> [root@orc01 vcl_test]# >>> >>> ------------------- Start of thread 0 vpp_main ------------------- >>> Packet 1 >>> >>> 00:09:53:445025: dpdk-input >>> HundredGigabitEthernet12/0/0 rx queue 0 >>> buffer 0x88078: current data 0, length 1516, buffer-pool 0, ref-count >>> 1, totlen-nifb 0, trace handle 0x0 >>> ext-hdr-valid >>> l4-cksum-computed l4-cksum-correct >>> PKT MBUF: port 0, nb_segs 1, pkt_len 1516 >>> buf_len 2176, data_len 1516, ol_flags 0x180, data_off 128, phys_addr >>> 0x75601e80 >>> packet_type 0x2e1 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 >>> rss 0x0 fdir.hi 0x0 fdir.lo 0x0 >>> Packet Offload Flags >>> PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. is valid >>> PKT_RX_L4_CKSUM_GOOD (0x0100) L4 cksum of RX pkt. 
is valid >>> Packet Types >>> RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet >>> RTE_PTYPE_L3_IPV6_EXT_UNKNOWN (0x00e0) IPv6 packet with or without >>> extension headers >>> RTE_PTYPE_L4_UDP (0x0200) UDP packet >>> IP6: b8:83:03:79:9f:e4 -> b8:83:03:79:af:8c 802.1q vlan 2001 >>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>> UDP: 56944 -> 8092 >>> length 1458, checksum 0xb22d >>> 00:09:53:445028: ethernet-input >>> frame: flags 0x3, hw-if-index 2, sw-if-index 2 >>> IP6: b8:83:03:79:9f:e4 -> b8:83:03:79:af:8c 802.1q vlan 2001 >>> 00:09:53:445029: ip6-input >>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>> UDP: 56944 -> 8092 >>> length 1458, checksum 0xb22d >>> 00:09:53:445031: ip6-lookup >>> fib 0 dpo-idx 6 flow hash: 0x00000000 >>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>> UDP: 56944 -> 8092 >>> length 1458, checksum 0xb22d >>> 00:09:53:445032: ip6-local >>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>> UDP: 56944 -> 8092 >>> length 1458, checksum 0xb22d >>> 00:09:53:445032: ip6-udp-lookup >>> UDP: src-port 56944 dst-port 8092 >>> 00:09:53:445033: udp6-input >>> UDP_INPUT: connection 0, disposition 5, thread 0 >>> >>> >>> thanks, >>> -Raj >>> >>> >>> On Wed, Jan 15, 2020 at 4:09 PM Raj Kumar via Lists.Fd.Io >>> <http://lists.fd.io/> <raj.gautam25=gmail....@lists.fd.io> wrote: >>> >>>> Hi Florin, >>>> Yes, [2] patch resolved the IPv6/UDP receiver issue. >>>> Thanks! for your help. >>>> >>>> thanks, >>>> -Raj >>>> >>>> On Tue, Jan 14, 2020 at 9:35 PM Florin Coras <fcoras.li...@gmail.com> >>>> wrote: >>>> >>>>> Hi Raj, >>>>> >>>>> First of all, with this [1], the vcl test app/client can establish a >>>>> udpc connection. 
>>>>> Note that udp will most probably lose packets, so large
>>>>> exchanges with those apps may not work.
>>>>>
>>>>> As for the second issue, does [2] solve it?
>>>>>
>>>>> Regards,
>>>>> Florin
>>>>>
>>>>> [1] https://gerrit.fd.io/r/c/vpp/+/24332
>>>>> [2] https://gerrit.fd.io/r/c/vpp/+/24334
>>>>>
>>>>> On Jan 14, 2020, at 12:59 PM, Raj Kumar <raj.gauta...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi Florin,
>>>>> Thanks! for the reply.
>>>>>
>>>>> I realized the issue with the non-connected case. For receiving
>>>>> datagrams, I was using recvfrom() with DONOT_WAIT flag because of
>>>>> that vppcom_session_recvfrom() api was failing. It expects either 0 or
>>>>> MSG_PEEK flag.
>>>>>   if (flags == 0)
>>>>>     rv = vppcom_session_read (session_handle, buffer, buflen);
>>>>>   else if (flags & MSG_PEEK)	/* MSG_PEEK is 0x2 */
>>>>>     rv = vppcom_session_peek (session_handle, buffer, buflen);
>>>>>   else
>>>>>     {
>>>>>       VDBG (0, "Unsupport flags for recvfrom %d", flags);
>>>>>       return VPPCOM_EAFNOSUPPORT;
>>>>>     }
>>>>>
>>>>> I changed the flag to 0 in recvfrom() , after that UDP rx is working
>>>>> fine but only for IPv4.
>>>>>
>>>>> I am facing a different issue with IPv6/UDP receiver. I am
>>>>> getting "no listener for dst port" error.
>>>>>
>>>>> Please let me know if I am doing something wrong.
>>>>> Here are the traces : -
>>>>>
>>>>> [root@orc01 testcode]# VCL_DEBUG=2 LDP_DEBUG=2
>>>>> LD_PRELOAD=/opt/vpp/build-root/install-vpp-native/vpp/lib/libvcl_ldpreload.so
>>>>> VCL_CONFIG=/etc/vpp/vcl.cfg ./udp6_rx
>>>>> VCL<1164>: configured VCL debug level (2) from VCL_DEBUG!
>>>>> VCL<1164>: allocated VCL heap = 0x7ff877439010, size 268435456
>>>>> (0x10000000)
>>>>> VCL<1164>: configured rx_fifo_size 4000000 (0x3d0900)
>>>>> VCL<1164>: configured tx_fifo_size 4000000 (0x3d0900)
>>>>> VCL<1164>: configured app_scope_local (1)
>>>>> VCL<1164>: configured app_scope_global (1)
>>>>> VCL<1164>: configured api-socket-name (/tmp/vpp-api.sock)
>>>>> VCL<1164>: completed parsing vppcom config!
>>>>> vppcom_connect_to_vpp:549: vcl<1164:0>: app (ldp-1164-app) is >>>>> connected to VPP! >>>>> vppcom_app_create:1067: vcl<1164:0>: sending session enable >>>>> vppcom_app_create:1075: vcl<1164:0>: sending app attach >>>>> vppcom_app_create:1084: vcl<1164:0>: app_name 'ldp-1164-app', >>>>> my_client_index 0 (0x0) >>>>> ldp_init:209: ldp<1164>: configured LDP debug level (2) from env var >>>>> LDP_DEBUG! >>>>> ldp_init:282: ldp<1164>: LDP initialization: done! >>>>> ldp_constructor:2490: LDP<1164>: LDP constructor: done! >>>>> socket:974: ldp<1164>: calling vls_create: proto 1 (UDP), >>>>> is_nonblocking 0 >>>>> vppcom_session_create:1142: vcl<1164:0>: created session 0 >>>>> bind:1086: ldp<1164>: fd 32: calling vls_bind: vlsh 0, addr >>>>> 0x7fff9a93efe0, len 28 >>>>> vppcom_session_bind:1280: vcl<1164:0>: session 0 handle 0: binding to >>>>> local IPv6 address :: port 8092, proto UDP >>>>> vppcom_session_listen:1312: vcl<1164:0>: session 0: sending vpp listen >>>>> request... >>>>> vcl_session_bound_handler:610: vcl<1164:0>: session 0 [0x1]: listen >>>>> succeeded! 
>>>>> bind:1102: ldp<1164>: fd 32: returning 0 >>>>> >>>>> vpp# sh app server >>>>> Connection App >>>>> Wrk >>>>> [0:0][CT:U] :::8092->:::0 ldp-1164-app[shm] 0 >>>>> [#0][U] :::8092->:::0 ldp-1164-app[shm] 0 >>>>> >>>>> vpp# sh err >>>>> Count Node Reason >>>>> 7 dpdk-input no error >>>>> 2606 ip6-udp-lookup no listener for dst >>>>> port >>>>> 8 arp-reply ARP replies sent >>>>> 1 arp-disabled ARP Disabled on this >>>>> interface >>>>> 13 ip6-glean neighbor >>>>> solicitations sent >>>>> 2606 ip6-input valid ip6 packets >>>>> 4 ip6-local-hop-by-hop Unknown protocol ip6 >>>>> local h-b-h packets dropped >>>>> 2606 ip6-icmp-error destination >>>>> unreachable response sent >>>>> 40 ip6-icmp-input valid packets >>>>> 1 ip6-icmp-input neighbor >>>>> solicitations from source not on link >>>>> 12 ip6-icmp-input neighbor >>>>> solicitations for unknown targets >>>>> 1 ip6-icmp-input neighbor >>>>> advertisements sent >>>>> 1 ip6-icmp-input neighbor >>>>> advertisements received >>>>> 40 ip6-icmp-input router >>>>> advertisements sent >>>>> 40 ip6-icmp-input router >>>>> advertisements received >>>>> 1 ip4-icmp-input echo replies sent >>>>> 89 lldp-input lldp packets >>>>> received on disabled interfaces >>>>> 1328 llc-input unknown llc >>>>> ssap/dsap >>>>> vpp# >>>>> >>>>> vpp# show trace >>>>> ------------------- Start of thread 0 vpp_main ------------------- >>>>> Packet 1 >>>>> >>>>> 00:23:39:401354: dpdk-input >>>>> HundredGigabitEthernet12/0/0 rx queue 0 >>>>> buffer 0x8894e: current data 0, length 1516, buffer-pool 0, >>>>> ref-count 1, totlen-nifb 0, trace handle 0x0 >>>>> ext-hdr-valid >>>>> l4-cksum-computed l4-cksum-correct >>>>> PKT MBUF: port 0, nb_segs 1, pkt_len 1516 >>>>> buf_len 2176, data_len 1516, ol_flags 0x180, data_off 128, >>>>> phys_addr 0x75025400 >>>>> packet_type 0x2e1 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 >>>>> rss 0x0 fdir.hi 0x0 fdir.lo 0x0 >>>>> Packet Offload Flags >>>>> PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. 
is valid >>>>> PKT_RX_L4_CKSUM_GOOD (0x0100) L4 cksum of RX pkt. is valid >>>>> Packet Types >>>>> RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet >>>>> RTE_PTYPE_L3_IPV6_EXT_UNKNOWN (0x00e0) IPv6 packet with or >>>>> without extension headers >>>>> RTE_PTYPE_L4_UDP (0x0200) UDP packet >>>>> IP6: b8:83:03:79:9f:e4 -> b8:83:03:79:af:8c 802.1q vlan 2001 >>>>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>>>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>>>> UDP: 60593 -> 8092 >>>>> length 1458, checksum 0x0964 >>>>> 00:23:39:401355: ethernet-input >>>>> frame: flags 0x3, hw-if-index 2, sw-if-index 2 >>>>> IP6: b8:83:03:79:9f:e4 -> b8:83:03:79:af:8c 802.1q vlan 2001 >>>>> 00:23:39:401356: ip6-input >>>>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>>>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>>>> UDP: 60593 -> 8092 >>>>> length 1458, checksum 0x0964 >>>>> 00:23:39:401357: ip6-lookup >>>>> fib 0 dpo-idx 5 flow hash: 0x00000000 >>>>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>>>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>>>> UDP: 60593 -> 8092 >>>>> length 1458, checksum 0x0964 >>>>> 00:23:39:401361: ip6-local >>>>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>>>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>>>> UDP: 60593 -> 8092 >>>>> length 1458, checksum 0x0964 >>>>> 00:23:39:401362: ip6-udp-lookup >>>>> UDP: src-port 60593 dst-port 8092 (no listener) >>>>> 00:23:39:401362: ip6-icmp-error >>>>> UDP: fd0d:edc4:ffff:2001::201 -> fd0d:edc4:ffff:2001::203 >>>>> tos 0x00, flow label 0x0, hop limit 64, payload length 1458 >>>>> UDP: 60593 -> 8092 >>>>> length 1458, checksum 0x0964 >>>>> 00:23:39:401363: error-drop >>>>> rx:HundredGigabitEthernet12/0/0.2001 >>>>> 00:23:39:401364: drop >>>>> ip6-input: valid ip6 packets >>>>> >>>>> vpp# >>>>> >>>>> >>>>> Thanks, >>>>> -Raj >>>>> >>>>> >>>>> On Tue, Jan 14, 2020 at 1:44 PM Florin Coras 
<fcoras.li...@gmail.com> >>>>> wrote: >>>>> >>>>>> Hi Raj, >>>>>> >>>>>> Session layer does support connection-less transports but udp does >>>>>> not raise accept notifications to vcl. UDPC might, but we haven’t tested >>>>>> udpc with vcl in a long time so it might not work properly. >>>>>> >>>>>> What was the problem you were hitting in the non-connected case? >>>>>> >>>>>> Regards, >>>>>> Florin >>>>>> >>>>>> > On Jan 14, 2020, at 7:13 AM, raj.gauta...@gmail.com wrote: >>>>>> > >>>>>> > Hi , >>>>>> > I am trying some host application tests ( using LD_PRELOAD) . TCP >>>>>> rx and tx both work fine. UDP tx also works fine. >>>>>> > The issue is only with UDP rx . In some discussion it was >>>>>> mentioned that session layer does not support connection-less transports >>>>>> so >>>>>> protocols like udp still need to accept connections and only afterwards >>>>>> read from the fifos. >>>>>> > So, I changed the UDP receiver application to use listen() and >>>>>> accept() before read() . But , I am still having issue to make it run. >>>>>> >>>>>> > After I started, udp traffic from other server it seems to accept >>>>>> the connection but never returns from the vppcom_session_accept() >>>>>> function. >>>>>> > VPP release is 19.08. >>>>>> > >>>>>> > vpp# sh app server >>>>>> > Connection App >>>>>> Wrk >>>>>> > [0:0][CT:U] 0.0.0.0:8090->0.0.0.0:0 ldp-36646-app[shm] >>>>>> 0 >>>>>> > [#0][U] 0.0.0.0:8090->0.0.0.0:0 ldp-36646-app[shm] >>>>>> 0 >>>>>> > vpp# >>>>>> > >>>>>> > >>>>>> > [root@orc01 testcode]# VCL_DEBUG=2 LDP_DEBUG=2 >>>>>> LD_PRELOAD=/opt/vpp/build-root/install-vpp-native/vpp/lib/libvcl_ldpreload.so >>>>>> VCL_CONFIG=/etc/vpp/vcl.cfg ./udp_rx >>>>>> > VCL<36646>: configured VCL debug level (2) from VCL_DEBUG! 
>>>>>> > VCL<36646>: allocated VCL heap = 0x7f77e5309010, size 268435456 >>>>>> (0x10000000) >>>>>> > VCL<36646>: configured rx_fifo_size 4000000 (0x3d0900) >>>>>> > VCL<36646>: configured tx_fifo_size 4000000 (0x3d0900) >>>>>> > VCL<36646>: configured app_scope_local (1) >>>>>> > VCL<36646>: configured app_scope_global (1) >>>>>> > VCL<36646>: configured api-socket-name (/tmp/vpp-api.sock) >>>>>> > VCL<36646>: completed parsing vppcom config! >>>>>> > vppcom_connect_to_vpp:549: vcl<36646:0>: app (ldp-36646-app) is >>>>>> connected to VPP! >>>>>> > vppcom_app_create:1067: vcl<36646:0>: sending session enable >>>>>> > vppcom_app_create:1075: vcl<36646:0>: sending app attach >>>>>> > vppcom_app_create:1084: vcl<36646:0>: app_name 'ldp-36646-app', >>>>>> my_client_index 0 (0x0) >>>>>> > ldp_init:209: ldp<36646>: configured LDP debug level (2) from env >>>>>> var LDP_DEBUG! >>>>>> > ldp_init:282: ldp<36646>: LDP initialization: done! >>>>>> > ldp_constructor:2490: LDP<36646>: LDP constructor: done! >>>>>> > socket:974: ldp<36646>: calling vls_create: proto 1 (UDP), >>>>>> is_nonblocking 0 >>>>>> > vppcom_session_create:1142: vcl<36646:0>: created session 0 >>>>>> > Socket successfully created.. >>>>>> > bind:1086: ldp<36646>: fd 32: calling vls_bind: vlsh 0, addr >>>>>> 0x7fff3f3c1040, len 16 >>>>>> > vppcom_session_bind:1280: vcl<36646:0>: session 0 handle 0: binding >>>>>> to local IPv4 address 0.0.0.0 port 8090, proto UDP >>>>>> > vppcom_session_listen:1312: vcl<36646:0>: session 0: sending vpp >>>>>> listen request... >>>>>> > vcl_session_bound_handler:610: vcl<36646:0>: session 0 [0x1]: >>>>>> listen succeeded! >>>>>> > bind:1102: ldp<36646>: fd 32: returning 0 >>>>>> > Socket successfully binded.. >>>>>> > listen:2005: ldp<36646>: fd 32: calling vls_listen: vlsh 0, n 5 >>>>>> > vppcom_session_listen:1308: vcl<36646:0>: session 0 [0x1]: already >>>>>> in listen state! >>>>>> > listen:2020: ldp<36646>: fd 32: returning 0 >>>>>> > Server listening.. 
>>>>>> > ldp_accept4:2043: ldp<36646>: listen fd 32: calling >>>>>> vppcom_session_accept: listen sid 0, ep 0x0, flags 0x3f3c0fc0 >>>>>> > vppcom_session_accept:1478: vcl<36646:0>: discarded event: 0 >>>>>> > >>>>>> > >>>>>> >>>>>> >>>>> >>> >>> >>> >> >
View/Reply Online (#15250): https://lists.fd.io/g/vpp-dev/message/15250