Hi, Florin, Attached is the some active session live after echo client timeout.
[#0][T] 10.10.1.1:1111->10.10.1.2:61337 ESTABLISHED flags: Recovery timers: [] snd_una 4285 snd_nxt 4285 snd_una_max 4285 rcv_nxt 8193 rcv_las 8193 snd_wnd 65536 rcv_wnd 65536 snd_wl1 8193 snd_wl2 4285 flight size 0 out space 2856 cc space 2856 rcv_wnd_av 65536 cong recovery cwnd 2856 ssthresh 2856 rtx_bytes 0 bytes_acked 0 prev_ssthresh 524288 snd_congestion 8193 dupack 0 limited_transmit 1746905375 tsecr 1714832397 tsecr_last_ack 1714832397 rto 200 rto_boff 0 srtt 6 rttvar 2 rtt_ts 0 rtt_seq 2548066206 tsval_recent 81891189 tsval_recent_age 13028 scoreboard: sacked_bytes 0 last_sacked_bytes 0 lost_bytes 0 last_bytes_delivered 0 high_sacked 2548066206 snd_una_adv 0 cur_rxt_hole 4294967295 high_rxt 0 rescue_rxt 0 Rx fifo: cursize 0 nitems 65536 has_event 0 head 8192 tail 8192 segment manager 2 vpp session 4 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 Tx fifo: cursize 3908 nitems 65536 has_event 0 head 4284 tail 8192 segment manager 2 vpp session 4 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 [#0][T] 10.10.1.1:1111->10.10.1.2:2210 ESTABLISHED flags: Recovery timers: [] snd_una 4285 snd_nxt 4285 snd_una_max 4285 rcv_nxt 8193 rcv_las 8193 snd_wnd 65536 rcv_wnd 65536 snd_wl1 8193 snd_wl2 4285 flight size 0 out space 2856 cc space 2856 rcv_wnd_av 65536 cong recovery cwnd 2856 ssthresh 2856 rtx_bytes 0 bytes_acked 0 prev_ssthresh 524288 snd_congestion 8193 dupack 0 limited_transmit 1746905375 tsecr 1714832397 tsecr_last_ack 1714832397 rto 200 rto_boff 0 srtt 6 rttvar 2 rtt_ts 0 rtt_seq 2548066206 tsval_recent 81891189 tsval_recent_age 13028 scoreboard: sacked_bytes 0 last_sacked_bytes 0 lost_bytes 0 last_bytes_delivered 0 high_sacked 2548066206 snd_una_adv 0 cur_rxt_hole 4294967295 high_rxt 0 rescue_rxt 0 Rx fifo: cursize 0 nitems 65536 has_event 0 head 8192 tail 8192 segment manager 2 vpp session 5 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 Tx fifo: cursize 3908 nitems 65536 has_event 0 head 4284 tail 8192 segment manager 2 vpp session 5 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 [#0][T] 10.10.1.1:1111->10.10.1.2:30199 ESTABLISHED flags: Recovery timers: [] snd_una 5713 snd_nxt 5713 snd_una_max 5713 rcv_nxt 8193 rcv_las 8193 snd_wnd 65536 rcv_wnd 65536 snd_wl1 8193 snd_wl2 5713 flight size 0 out space 2856 cc space 2856 rcv_wnd_av 65536 cong recovery cwnd 2856 ssthresh 2856 rtx_bytes 0 bytes_acked 0 prev_ssthresh 524288 snd_congestion 8193 dupack 0 limited_transmit 1748569900 tsecr 1714832397 tsecr_last_ack 1714832397 rto 200 rto_boff 0 srtt 6 rttvar 1 rtt_ts 0 rtt_seq 2546405589 tsval_recent 81891189 tsval_recent_age 13028 scoreboard: sacked_bytes 0 last_sacked_bytes 0 lost_bytes 0 last_bytes_delivered 0 high_sacked 2546403109 snd_una_adv 0 cur_rxt_hole 4294967295 high_rxt 0 rescue_rxt 0 Rx fifo: cursize 0 nitems 65536 has_event 0 head 8192 tail 8192 segment manager 2 vpp session 30 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 Tx fifo: cursize 2480 nitems 65536 has_event 0 head 5712 tail 8192 segment manager 2 vpp session 30 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 [#0][T] 10.10.1.1:1111->10.10.1.2:48888 ESTABLISHED flags: Recovery timers: [] snd_una 5713 snd_nxt 5713 snd_una_max 5713 rcv_nxt 8193 rcv_las 8193 snd_wnd 65536 rcv_wnd 65536 snd_wl1 8193 snd_wl2 5713 flight size 0 out space 2856 cc space 2856 rcv_wnd_av 65536 cong recovery cwnd 2856 ssthresh 2856 rtx_bytes 0 bytes_acked 0 prev_ssthresh 524288 snd_congestion 8193 dupack 0 limited_transmit 1748569900 tsecr 1714832397 tsecr_last_ack 1714832397 rto 200 rto_boff 0 srtt 6 rttvar 1 rtt_ts 0 rtt_seq 2546405589 tsval_recent 81891189 tsval_recent_age 13028 scoreboard: sacked_bytes 0 last_sacked_bytes 0 lost_bytes 0 last_bytes_delivered 0 high_sacked 2546403109 snd_una_adv 0 cur_rxt_hole 4294967295 high_rxt 0 rescue_rxt 0 Rx fifo: cursize 0 nitems 65536 has_event 0 head 8192 tail 8192 segment manager 2 vpp session 31 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 Tx fifo: cursize 2480 nitems 65536 has_event 0 head 5712 tail 8192 segment manager 2 vpp session 31 thread 0 app session 0 thread 0 ooo pool 0 active elts newest 4294967295 From: Florin Coras [mailto:fcoras.li...@gmail.com] Sent: Tuesday, August 21, 2018 11:18 PM To: Yu, Ping <ping...@intel.com> Cc: vpp-dev@lists.fd.io; Florin Coras (fcoras) <fco...@cisco.com> Subject: Re: [vpp-dev] TCP retransmission Hi Ping, Just from this, I can’t tell precisely what the issue is. Once the test hangs, do “show session verbose 2” and you should get the detailed status of the sessions that froze. I’ll take a look at those once I’m back in the office. Retransmissions should work fine but one never knows. On the other hand, the echo apps may be missing some io events. Did vpp report any other error message? Florin On Aug 21, 2018, at 7:14 AM, Yu, Ping <ping...@intel.com<mailto:ping...@intel.com>> wrote: Hello, all I am now debugging an issue observed in echo server/client. If nclients == small number, then everything goes smoothly, but if nclients is big, such as 500, 1000, echo client will hang without printing final report, and debug shows that echo client is waiting for response from server as it is expecting server to send bytes_to_receive to client. The problems occurs in both TCP and TLS protocol. if (sp->bytes_to_receive > 0) { delete_session = 0; } My assumption is that due to the big traffic, some packet is missing and re-transmission does not work well. Below the info for normal case, where nclients == 100. DBGvpp# show error Count Node Reason 600 session-queue Packets transmitted 100 tcp4-rcv-process Pure ACKs received 100 tcp4-rcv-process FINs received 600 tcp4-established Packets pushed into rx fifo 600 tcp4-established Pure ACKs received 32 ip4-glean address overflow drops 1 ip4-glean ARP requests sent 1 arp-input ARP request IP4 source address learned Server: DBGvpp# show error Count Node Reason 600 session-queue Packets transmitted 100 tcp4-listen SYNs received 200 tcp4-rcv-process Pure ACKs received 600 tcp4-established Packets pushed into rx fifo 600 tcp4-established Pure ACKs received 100 tcp4-established FINs received 1 arp-input ARP replies sent Below is the info when nclients == 500. Client: DBGvpp# show error Count Node Reason 3104 session-queue Packets transmitted 498 tcp4-rcv-process Pure ACKs received 498 tcp4-rcv-process FINs received 2998 tcp4-established Packets pushed into rx fifo 2892 tcp4-established Pure ACKs received 106 tcp4-established Duplicate ACK Server: DBGvpp# show error Count Node Reason 2998 session-queue Packets transmitted 500 tcp4-listen SYNs received 998 tcp4-rcv-process Pure ACKs received 2996 tcp4-established Packets pushed into rx fifo/ 2 tcp4-established OOO packets pushed into rx fifo 2476 tcp4-established Pure ACKs received 498 tcp4-established FINs received We can see that in clients side, packets transmission is 3104, which is bigger than normal case 600*5, but in the server side, the recived packet is only 2996, which is below 3000, which causes only 498 FIN is sent from client, so 2 live session makes “test echo” hang until timeout. Thanks Ping -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#10233): https://lists.fd.io/g/vpp-dev/message/10233 Mute This Topic: https://lists.fd.io/mt/24876529/675152 Group Owner: vpp-dev+ow...@lists.fd.io<mailto:vpp-dev+ow...@lists.fd.io> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [fcoras.li...@gmail.com<mailto:fcoras.li...@gmail.com>] -=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#10264): https://lists.fd.io/g/vpp-dev/message/10264 Mute This Topic: https://lists.fd.io/mt/24876529/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-