Public bug reported: * Explain the feature Currently, drivers register to a ct zone that can be shared by multiple devices. This can be inefficient for the driver to offload, as it needs to handle all the cases where the tuple can come from, instead of where it's most likely will arive from.
For example, consider the following tc rules: tc filter add dev dev1 ... flower action ct commit zone 5 \ action mirred egress redirect dev dev2 tc filter add dev dev2 ... flower action ct zone 5 \ action goto chain chain 2 tc filter add dev dev2 ... flower ct_state +trk+est ... \ action mirred egress redirect dev dev1 Both dev2 and dev1 register to the zone 5 flow table (created by act_ct). A tuple originating on dev1, going to dev2, will be offloaded to both devices, and both will need to offload both directions, resulting in 4 total rules. The traffic will only hit originiating tuple on dev1, and reply tuple on dev2. By passing the originating device that created the connection with the tuple, dev1 can choose to offload only the originating tuple, and dev2 only the reply tuple. Resulting in a more efficient offload. The first patch adds an act_ct nf conntrack extension, to temporarily store the originiating device from the skb before offloading the connection once the connection is established. Once sent to offload, it fills the tuple originating device. The second patch get this information from tuples which pass in openvswitch. The third patch is Mellanox driver ct offload implementation using this information to provide a hint to firmware of where this offloaded tuple packets will arrive from (LOCAL or UPLINK port), and thus increase insertion rate. * How to test Create OVS bridge with 2 devices mlx5 rep devices. Enable HW offload and configure regular connection tracking OpenFlow rules: e.g: ovs-ofctl del-flows br-ovs ovs-ofctl add-flow br-ovs arp,actions=normal ovs-ofctl add-flow br-ovs "table=0, ip,ct_state=-trk actions=ct(table=1)" ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+new actions=ct(commit),normal" ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+est, actions=normal" With SW steering enabled and the above commits (and up to date ofed) tuple insertion rate should be about twice as fast. This can be seen via procfs hw offloaded count while running high traffic: while true; do res=`sudo cat /proc/net/nf_conntrack | grep -i offload` && echo "$res" && echo "$res" | wc -l; done * What it could break. NA ** Affects: linux-bluefield (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/1960575 Title: Pass originating device to drivers offloading ct connection so devices will filter the tuples and offload them more efficiently Status in linux-bluefield package in Ubuntu: New Bug description: * Explain the feature Currently, drivers register to a ct zone that can be shared by multiple devices. This can be inefficient for the driver to offload, as it needs to handle all the cases where the tuple can come from, instead of where it's most likely will arive from. For example, consider the following tc rules: tc filter add dev dev1 ... flower action ct commit zone 5 \ action mirred egress redirect dev dev2 tc filter add dev dev2 ... flower action ct zone 5 \ action goto chain chain 2 tc filter add dev dev2 ... flower ct_state +trk+est ... \ action mirred egress redirect dev dev1 Both dev2 and dev1 register to the zone 5 flow table (created by act_ct). A tuple originating on dev1, going to dev2, will be offloaded to both devices, and both will need to offload both directions, resulting in 4 total rules. The traffic will only hit originiating tuple on dev1, and reply tuple on dev2. By passing the originating device that created the connection with the tuple, dev1 can choose to offload only the originating tuple, and dev2 only the reply tuple. Resulting in a more efficient offload. The first patch adds an act_ct nf conntrack extension, to temporarily store the originiating device from the skb before offloading the connection once the connection is established. Once sent to offload, it fills the tuple originating device. The second patch get this information from tuples which pass in openvswitch. The third patch is Mellanox driver ct offload implementation using this information to provide a hint to firmware of where this offloaded tuple packets will arrive from (LOCAL or UPLINK port), and thus increase insertion rate. * How to test Create OVS bridge with 2 devices mlx5 rep devices. Enable HW offload and configure regular connection tracking OpenFlow rules: e.g: ovs-ofctl del-flows br-ovs ovs-ofctl add-flow br-ovs arp,actions=normal ovs-ofctl add-flow br-ovs "table=0, ip,ct_state=-trk actions=ct(table=1)" ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+new actions=ct(commit),normal" ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+est, actions=normal" With SW steering enabled and the above commits (and up to date ofed) tuple insertion rate should be about twice as fast. This can be seen via procfs hw offloaded count while running high traffic: while true; do res=`sudo cat /proc/net/nf_conntrack | grep -i offload` && echo "$res" && echo "$res" | wc -l; done * What it could break. NA To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1960575/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp