> On 9. Dec 2024, at 10:50, Pavel Vazharov <pa...@x3me.net> wrote:
> >
> > Hi there,
> >
> > We are using the network stack of FreeBSD 13 on top of DPDK in our
> > application.
> > During the last tests in the lab I stumbled upon the following situation:
> > 1. It's a test where 5000 parallel connections are opened by Apache
> > Bench and each one downloads 1MB data. It causes the client NIC to
> > start dropping packets due to overflows which is intentional behavior.
> > 2. The server side is our application with the FreeBSD stack. The
> > client side Ubuntu 24.04 with Linux 6.8.0.
> > 3. So, a connection is opened and the download starts on it. At some
> > point the first drops occur and according to the TCP dump, from the
> > client side, they take a few seconds before the connection heals up.
> > However, these drops lead to increased values of t_srtt, t_rttvar and
> > thus to increased value of t_rxtcur.
> Do you observe increased values of t_rxtcur due to exponential backoff
> or due to extreme values of t_srtt and t_rttvar?

I think it's a cumulative effect of both the retransmits and the ACK packet that is finally received. For example, the client-side pcap shows that between seconds 17 and 20 (packets 105-107) packets are lost and the client stack resends the ACK packet with the same timestamp echo reply value: 3145161949. On the server side there are 4 retransmits during this time (packets 546-549), and then the ACK packet from the client is received (packet 550).

In the FreeBSD stack the retransmits trigger the back-off logic, and this line in `tcp_timer_rexmt`:

    TCPT_RANGESET(tp->t_rxtcur, rexmt, tp->t_rttmin, TCPTV_REXMTMAX);

sets t_rxtcur to 32, 64, 128, 256. Then the ACK packet is received and `tcp_xmit_timer` sets the following values, due to the echo reply timestamp:

    rtt:326 t_rxtcur:998 TCP_REXMTVAL(tp):978 t_rttmin:3 t_srtt:10432 t_rttvar:2608

The next received ACK packets, with rtt:3, lead to t_rxtcur values of 1118, 1163 and 1157. Then the final retransmitted packet (packet 736 in the server pcap) leads to t_rxtcur: 2274. The FreeBSD stack is set up with a 100 Hz clock (per my understanding t_rxtcur is in ticks, i.e. 2274 ticks are roughly equal to 22 seconds, if I'm not mistaken). The keep-alives happened at the 35th and 50th seconds, and then the client gave up at the 90th second.

> >
> > 4. The window opens again up to 100-200 KB with lots of packets
> > in-flight and the drops start again. They cause the re-transmit timer
> > from the FreeBSD side to be started but with an interval of something
> > like 18-20 seconds (according to my printf debugging on this side).
> > 5. At the same time the TCP keep-alive timer is also started for the
> > same connection (it's enabled for all connections) with a timeout of
> > 15 seconds.
> > 6. Nothing happens on this connection for the next 15 seconds. I'm not
> > sure why the Linux stack didn't send any "wake-up" ACK packets or
> > something but the tcpdump from the client side shows full silence
> > between 14-th and 29-th second.
> > 7. Next the FreeBSD keep-alive logic kicks-in and sends an ACK packet
> > which is ACK-ed by the Linux stack immediately. However, this ACK
> > packet received by the FreeBSD stack leads to restart of the
> > retransmit timer and with the interval which is bigger than the
> > keep-alive interval.
> > 8. Point 6 and 7 repeat one more time before the apache bench client
> > gives up on this connection and declares that it's timed-out. My
> > understanding is that the connection can "loop" in 6-7 for a very long
> > time and a packet with data will never be retransmitted.
> Can you provide a .pcap file?

I did a new test today to have pcap files from both sides, and my explanations above relate to this new test. I'm attaching the .pcap files from the server as well as from the client side. Note that the capture size for each packet was limited to 80 bytes. If for some reason the attached files are dropped from this email, the same pcap files can be downloaded from the following link:
https://drive.google.com/drive/folders/1418Qdc3E3ptjo2VcPXA6jo-4b5n-fYOb?usp=sharing

>
> Best regards
> Michael

Thank you for the help.

> > 9. As far as I debugged the situation from the FreeBSD side the
> > restart of the retransmit timer happens in the code after the
> > `process_ACK` label, in the else branch here:
> > ```
> > if (th->th_ack == tp->snd_max) {
> >     tcp_timer_activate(tp, TT_REXMT, 0);
> >     needoutput = 1;
> > } else if (!tcp_timer_active(tp, TT_PERSIST))
> >     tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
> > ```
> >
> > So, based on the above situation I've the following questions:
> > 1. Would it be correct if the re-transmit timer is not restarted by
> > keep-alive ACK packets?
> > 2. Assuming that the above change won't break anything else, is there
> > a way for detecting that an ACK packet acknowledges previously sent
> > keep-alive packet?
> >
> > Regards,
> > Pavel.
client43-test.pcap
Description: application/vnd.tcpdump.pcap
server43-test.pcap
Description: application/vnd.tcpdump.pcap