On Mon, Dec 10, 2018 at 8:21 AM Florian Westphal <f...@strlen.de> wrote:
>
> The (out-of-tree) Multipath-TCP implementation needs to map logical mptcp
> sequence numbers to the tcp sequence numbers used by individual subflows.
> This DSS mapping is read/written from tcp option space on receive and
> written to tcp option space on transmitted tcp packets that are part of
> an MPTCP connection.
>
> Increasing skb->cb[] size in mainline to store the DSS mapping
> is a non-starter for memory and performance reasons
> (f.e. increase in cb size also moves several frequently-accessed fields
> to other cache lines).
>
> Extending skb_shared_info or adding a private data field to skb fclones
> doesn't work for incoming skb, so a different DSS propagation method
> would be required for the receive side.
>
> The current MPTCP implementation adds an additional mptcp specific
> pointer to sk_buff.
>
> This series adds an extension infrastructure for sk_buff instead:
> 1. extension memory is released when the sk_buff is free'd.
> 2. data is shared after cloning an skb.
> 3. adding extension to an skb will COW the extension buffer if needed.
>
> This is also how xfrm and bridge_nf extra data (skb->sp, skb->nf_bridge)
> are handled.
>
> MPTCP could then add a new SKB_EXT_MPTCP_DSS (or similar) to store the
> mapping for tx and rx processing.
>
> Two new members are added to sk_buff:
> 1. 'active_extensions' byte (filling a hole), telling which extensions
>    are available for this skb.
> 2. extension pointer, located at the end of the sk_buff.
>    If the active_extensions byte is 0, the pointer is undefined.
>
> Third patch converts nf_bridge to use the extension infrastructure:
> The 'nf_bridge' pointer is removed, i.e. sk_buff size remains the same.
>
> After this, there are a few preparation patches to reduce "skb->sp"
> usage by using the secpath helper functions instead.
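To make the three properties and the two new members quoted above concrete, here is a rough userspace sketch. It is not the code from the series: every toy_* name, the fixed slot size and the plain (non-atomic) refcount are assumptions made for illustration only. It models a bitmask byte that says which extensions are present, an extension area shared on clone, and a copy made on add when the area is shared.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ids; only the idea of one id per extension is from the series. */
enum toy_ext_id { TOY_EXT_BRIDGE_NF, TOY_EXT_SEC_PATH, TOY_EXT_NUM };

#define TOY_EXT_SLOT 16 /* fixed slot size per extension, an assumption */

struct toy_ext {
    unsigned int refcnt; /* number of buffers sharing this area */
    unsigned char data[TOY_EXT_NUM][TOY_EXT_SLOT];
};

struct toy_buf {
    uint8_t active_extensions; /* bit i set: extension i is present */
    struct toy_ext *ext;       /* only meaningful if active_extensions != 0 */
};

/* Clone: share the extension area and take one more reference on it. */
static void toy_buf_clone(struct toy_buf *dst, const struct toy_buf *src)
{
    *dst = *src;
    if (src->active_extensions)
        dst->ext->refcnt++;
}

/* Free: drop one reference; the last user releases the area. */
static void toy_buf_free(struct toy_buf *b)
{
    if (b->active_extensions && --b->ext->refcnt == 0)
        free(b->ext);
    b->active_extensions = 0;
    b->ext = NULL;
}

/* Add: return a writable slot, copying the area first if it is shared. */
static void *toy_ext_add(struct toy_buf *b, enum toy_ext_id id)
{
    struct toy_ext *e;

    assert(id < TOY_EXT_NUM);
    if (!b->active_extensions) {
        e = calloc(1, sizeof(*e)); /* first extension on this buffer */
        if (!e)
            return NULL;
        e->refcnt = 1;
    } else if (b->ext->refcnt > 1) {
        e = malloc(sizeof(*e)); /* shared with a clone: copy on write */
        if (!e)
            return NULL;
        memcpy(e, b->ext, sizeof(*e));
        e->refcnt = 1;
        b->ext->refcnt--;
    } else {
        e = b->ext; /* sole owner: modify in place */
    }
    b->ext = e;
    b->active_extensions |= (uint8_t)(1u << id);
    return e->data[id];
}

/* Find: NULL unless the bit for this extension is set. */
static void *toy_ext_find(const struct toy_buf *b, enum toy_ext_id id)
{
    if (b->active_extensions & (1u << id))
        return b->ext->data[id];
    return NULL;
}
```

In this sketch, adding an extension to a clone leaves the original buffer's area untouched, which is the COW behaviour the cover letter describes. A real implementation would need atomic reference counting and variable-sized slots.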
>
> Last patch converts skb->sp: secpath information gets stored as
> new SKB_EXT_SEC_PATH, so the 'sp' pointer is removed from skbuff.
>
> Extra code added to skb clone and free paths (to deal with refcount/free
> of extension area) replaces the existing code that does the same for
> skb->nf_bridge and skb->secpath.
>
> I don't see any other in-tree users that could benefit from this
> infrastructure; it doesn't make sense to add an extension just for the sake
> of a single flag bit (like skb->nf_trace).
>
> Changes since RFC:
>
> Convert secpath.
>
> Unlike nf_bridge, the secpath struct needs to hold reference on the
> xfrm state structure(s), thus handling gets more complicated when
> an existing secpath extension has to be COW'd (we need to take additional
> reference count on the xfrm states contained in the new copy).
>
> Florian Westphal (13):
>   netfilter: avoid using skb->nf_bridge directly
>   sk_buff: add skb extension infrastructure
>   net: convert bridge_nf to use skb extension infrastructure
>   xfrm: change secpath_set to return secpath struct, not error value
>   net: move secpath_exist helper to sk_buff.h
>   net: use skb_sec_path helper in more places
>   drivers: net: intel: use secpath helpers in more places
>   drivers: net: ethernet: mellanox: use skb_sec_path helper
>   drivers: net: netdevsim: use skb_sec_path helper
>   xfrm: use secpath_exist where applicable
>   drivers: chelsio: use skb_sec_path helper
>   xfrm: prefer secpath_set over secpath_dup
>   net: switch secpath to use skb extension infrastructure
>
>  Documentation/networking/xfrm_device.txt                      |   7
>  drivers/crypto/chelsio/chcr_ipsec.c                           |   4
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c                |  15
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c                 |   5
>  drivers/net/ethernet/intel/ixgbevf/ipsec.c                    |  15
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c             |   2
>  drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c |  19 -
>  drivers/net/netdevsim/ipsec.c                                 |   7
>  include/linux/netfilter_bridge.h                              |  33 +
>  include/linux/skbuff.h                                        | 160 +++++++-
>  include/net/netfilter/br_netfilter.h                          |  14
>  include/net/xfrm.h                                            |  40 --
>  net/Kconfig                                                   |   4
>  net/bridge/br_netfilter_hooks.c                               |  39 --
>  net/bridge/br_netfilter_ipv6.c                                |   4
>  net/core/skbuff.c                                             | 182 +++++++++-
>  net/ipv4/esp4.c                                               |   9
>  net/ipv4/esp4_offload.c                                       |  15
>  net/ipv4/ip_output.c                                          |   1
>  net/ipv4/netfilter/nf_reject_ipv4.c                           |   6
>  net/ipv6/esp6.c                                               |   9
>  net/ipv6/esp6_offload.c                                       |  15
>  net/ipv6/ip6_output.c                                         |   1
>  net/ipv6/netfilter/nf_reject_ipv6.c                           |  10
>  net/ipv6/xfrm6_input.c                                        |   8
>  net/netfilter/nf_log_common.c                                 |  20 -
>  net/netfilter/nf_queue.c                                      |  50 +-
>  net/netfilter/nfnetlink_queue.c                               |  23 -
>  net/netfilter/nft_meta.c                                      |   2
>  net/netfilter/nft_xfrm.c                                      |   2
>  net/netfilter/xt_physdev.c                                    |   2
>  net/netfilter/xt_policy.c                                     |   2
>  net/xfrm/Kconfig                                              |   1
>  net/xfrm/xfrm_device.c                                        |   4
>  net/xfrm/xfrm_input.c                                         |  76 +---
>  net/xfrm/xfrm_interface.c                                     |   2
>  net/xfrm/xfrm_output.c                                        |   7
>  net/xfrm/xfrm_policy.c                                        |  19 -
>  security/selinux/xfrm.c                                       |   4
>  39 files changed, 553 insertions(+), 285 deletions(-)
>
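The "Changes since RFC" note quoted above says that COW'ing a secpath extension is harder than for nf_bridge, because the copy must take its own references on the xfrm states it points to. A small sketch of that idea follows. All toy_* names, the fixed depth and the plain integer refcounts are hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_DEPTH 4 /* assumed upper bound on stacked states */

/* Stands in for an xfrm state; only its reference count matters here. */
struct toy_state {
    unsigned int refcnt;
};

/* Stands in for the secpath data: a refcounted list of state pointers. */
struct toy_secpath {
    unsigned int refcnt;
    int len;
    struct toy_state *states[TOY_MAX_DEPTH];
};

/* Return a secpath that is safe to modify; copy it if it is shared. */
static struct toy_secpath *toy_secpath_cow(struct toy_secpath *old)
{
    struct toy_secpath *copy;
    int i;

    assert(old->refcnt > 0);
    if (old->refcnt == 1)
        return old; /* sole owner: no copy needed */

    copy = malloc(sizeof(*copy));
    if (!copy)
        return NULL;
    *copy = *old;
    copy->refcnt = 1;
    for (i = 0; i < copy->len; i++)
        copy->states[i]->refcnt++; /* the copy holds its own references */
    old->refcnt--; /* the caller no longer uses the shared one */
    return copy;
}

/* Drop one reference; the last user also releases the state references. */
static void toy_secpath_put(struct toy_secpath *sp)
{
    int i;

    if (--sp->refcnt)
        return;
    for (i = 0; i < sp->len; i++)
        sp->states[i]->refcnt--; /* real code would free a state at zero */
    free(sp);
}
```

A plain memcpy of the extension would leave two secpaths pointing at the same states with only one reference held between them, so the first put could release states still in use by the other. The extra loop in the copy path is what avoids that.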
For the changes in ixgbe and netdevsim:
Acked-by: Shannon Nelson <shannon.lee.nel...@gmail.com>

--
==============================================
Mr. Shannon Nelson         Parents can't afford to be squeamish.