Public bug reported:

* Explain the feature

This patch addresses three possible problems:

1. ct gc may race to undo the timeout adjustment of the packet path, leaving the conntrack entry in place with the internal offload timeout (one day).

2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE timeout is reached before the flow offload del.

3. tcp ct is always set to ESTABLISHED with a very long timeout in flow offload teardown/delete even though the state might already be CLOSED. Also, as a remark, we cannot assume that the FIN or RST packet hits flow table teardown, as the packet might get bumped to the slow path in nftables.

This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN state transition.

Moreover, return the connection's ownership to conntrack upon teardown by clearing the offload flag and fixing the established timeout value. The flow table GC thread will asynchronously free the flow table and hardware offload entries.

Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows, which is also misleading since the flow is back on the classic conntrack path.

If nf_ct_delete() removes the entry from the conntrack table, then it calls nf_ct_put() which decrements the refcnt. This is not a problem because the flowtable holds a reference to the conntrack object from the flow_offload_alloc() path, which is released via flow_offload_free().

This patch also updates nft_flow_offload to skip packets in SYN_RECV state. Since we might miss packets or bump them to the slow path, we do not know what will happen there while we are still in SYN_RECV. This patch therefore postpones offload until the next packet, which also aligns with the existing behaviour in tc-ct.

flow_offload_teardown() no longer resets the existing tcp state from flow_offload_fixup_tcp() to ESTABLISHED; packets bumped to the slow path might have already updated the state to CLOSE/FIN.

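The behaviour described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the struct, the function names (model_offload, model_teardown), the reduced state set and the per-state timeout table are all hypothetical stand-ins chosen for illustration, and capping the timeout at the per-state value is a simplifying assumption about what "fixing the timeout" means. Only the one-day offload timeout, the SYN_RECV skip and the "clear the flag, keep the state" rule come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced TCP state set; the kernel tracks more states. */
enum model_tcp_state {
    MODEL_SYN_RECV,
    MODEL_ESTABLISHED,
    MODEL_FIN_WAIT,
    MODEL_CLOSE,
    MODEL_STATE_MAX
};

/* Hypothetical stand-in for a conntrack entry. */
struct model_ct {
    enum model_tcp_state state;
    bool offloaded;   /* models IPS_OFFLOAD_BIT */
    long timeout;     /* remaining lifetime in seconds */
};

/* Internal offload timeout from the description: one day. */
#define MODEL_OFFLOAD_TIMEOUT 86400L

/* Illustrative per-state timeouts in seconds (assumed values). */
static const long model_state_timeout[MODEL_STATE_MAX] = {
    [MODEL_SYN_RECV]    = 60,
    [MODEL_ESTABLISHED] = 432000,
    [MODEL_FIN_WAIT]    = 120,
    [MODEL_CLOSE]       = 10,
};

/* Offload is postponed while the connection is still in SYN_RECV. */
bool model_offload(struct model_ct *ct)
{
    assert(ct != NULL);
    if (ct->state == MODEL_SYN_RECV)
        return false;
    ct->offloaded = true;
    ct->timeout = MODEL_OFFLOAD_TIMEOUT;
    return true;
}

/*
 * Teardown hands the entry back to conntrack: clear the offload flag,
 * leave the TCP state untouched (the slow path may already have moved
 * it to CLOSE/FIN), and cap the timeout at the value for that state.
 */
void model_teardown(struct model_ct *ct)
{
    assert(ct != NULL);
    ct->offloaded = false;
    if (ct->timeout > model_state_timeout[ct->state])
        ct->timeout = model_state_timeout[ct->state];
}
```

In this model, an entry that the slow path already moved to CLOSE leaves teardown with the short CLOSE timeout instead of being forced back to ESTABLISHED with a long one, which corresponds to problem 3 above.
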
* How to test

Add the following flows to the OVS bridge in the DPU OS:

  # ovs-ofctl add-flow ovsbr1 "table=0, ip,ct_state=-trk, actions=ct(table=1)"
  # ovs-ofctl add-flow ovsbr1 "table=1, ip,ct_state=+new, actions=ct(commit),normal"
  # ovs-ofctl add-flow ovsbr1 "table=1, ip,ct_state=-new, actions=normal"

Start netserver on the SUT:

  # netserver -p 5007

Start multiple TCP_CRR tests on the peer:

  # count=1;while [ $count -lt 10 ]; do screen -d -m netperf -t TCP_CRR -H 11.0.0.2 -l 360 -- -r 1 -O " MIN_LATENCY, MAX_LATENCY, MEAN_LATENCY, P90_LATENCY, P99_LATENCY ,P999_LATENCY,P9999_LATENCY,STDDEV_LATENCY ,THROUGHPUT ,THROUGHPUT_UNITS "; count=`expr $count + 1`; done

A huge number of connections will be established and torn down. After the tests, some of them are not aged out:

  # From /proc/net/nf_conntrack in DPU OS
  ipv4 2 tcp 6 86354 LAST_ACK src=11.0.0.1 dst=11.0.0.2 sport=35862 dport=46797 src=11.0.0.2 dst=11.0.0.1 sport=46797 dport=35862 [ASSURED] mark=0 zone=0 use=2
  ipv4 2 tcp 6 86354 LAST_ACK src=11.0.0.1 dst=11.0.0.2 sport=35862 dport=46797 src=11.0.0.2 dst=11.0.0.1 sport=46797 dport=35862 [ASSURED] mark=0 zone=0 use=2
  ipv4 2 tcp 6 86354 LAST_ACK src=11.0.0.1 dst=11.0.0.2 sport=35862 dport=46797 src=11.0.0.2 dst=11.0.0.1 sport=46797 dport=35862 [ASSURED] mark=0 zone=0 use=2

The issue is usually reproduced after running the test several times.

* What it could break.

N/A

** Affects: linux-bluefield (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1975649

Title:
  flowtable: fix TCP flow teardown