Eric Dumazet <eric.duma...@gmail.com> wrote:
> On 12/10/2018 06:49 AM, Florian Westphal wrote:
> > The (out-of-tree) Multipath-TCP implementation needs a significant amount
> > of extra space in the skb control buffer.
> 
> Which skbs ? Input or output path ?

Both.

> > This work adds an extension infrastructure for sk_buff:
> > 1. extension memory is released when the sk_buff is free'd.
> > 2. data is shared after cloning an skb.
> > 
> 
> This seems additional atomic increments and decrements all over the places,
> and code bloat for a very precise reason :

No, it replaces the atomic dec/inc of nf_bridge and secpath.
If skb passes neither bridge netfilter nor ipsec, no atomic op is added.

If it passes one of the two, one atomic inc and one dec are done, just
like in the current kernel (skb->sp or skb->nf_bridge refcount).

If it passes both (unlikely but possible), it's now one instead of two,
or still two if, for instance, one of the operations happens in a
different netns, due to skb scrubbing.
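
To illustrate the refcount accounting above, here is a minimal userspace
sketch, not the kernel code. All names (struct buf, struct ext_block,
buf_ext_add, buf_clone, buf_release) are hypothetical stand-ins. It
assumes a single shared block with one reference count covering every
extension, so a buffer without extensions pays no atomic op and a
buffer with both extensions pays one inc/dec pair instead of two.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical extension ids; stand-ins for nf_bridge and secpath. */
enum ext_id { EXT_BRIDGE, EXT_SECPATH, EXT_COUNT };

/* One shared block holds all extensions behind a single refcount. */
struct ext_block {
	atomic_int refcnt;
	unsigned char present;   /* bitmask of enum ext_id */
	int data[EXT_COUNT];     /* stand-in payloads */
};

/* Stand-in for sk_buff: only the extension pointer is modelled. */
struct buf {
	struct ext_block *ext;   /* NULL when no extension is attached */
};

/* Attach extension 'id'; the block is allocated on first use.
 * Assumption: the block is not shared yet. This sketch does not
 * handle adding an extension to a block shared with a clone. */
static int *buf_ext_add(struct buf *b, enum ext_id id)
{
	if (!b->ext) {
		b->ext = calloc(1, sizeof(*b->ext));
		if (!b->ext)
			return NULL;
		atomic_init(&b->ext->refcnt, 1);
	}
	assert(atomic_load(&b->ext->refcnt) == 1);
	b->ext->present |= 1u << id;
	return &b->ext->data[id];
}

/* Clone: share the block. No atomic op at all if there is no block,
 * exactly one increment otherwise, however many extensions it holds. */
static void buf_clone(struct buf *dst, const struct buf *src)
{
	dst->ext = src->ext;
	if (dst->ext)
		atomic_fetch_add(&dst->ext->refcnt, 1);
}

/* Release: one decrement if a block exists; the last owner frees it. */
static void buf_release(struct buf *b)
{
	if (b->ext && atomic_fetch_sub(&b->ext->refcnt, 1) == 1)
		free(b->ext);
	b->ext = NULL;
}
```

Cloning a buffer that carries both extensions touches the counter once,
and the extension memory goes away when the last owner is released.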

>       skb->cb[] is too small.
> 
> We do not want to increase skb->cb[] for two reasons, the first one being the 
> killer.
> 
> 1) we clear it at skb allocation, and copy it at skb cloning.

Right.

> 2) extra memory cost.
> 
> Why can't we have another skb->cb2[] field that is not cleared/copied by skb 
> functions at all ?
> 
> Each layer using skb->cb2[] would be responsible to fully manage it.

We could do that, it's probably enough for mptcp's needs.
This would keep the nf_bridge and secpath pointers as-is and increase
skb truesize.
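
For comparison, a rough userspace sketch of the cb2[] alternative.
The struct and function names are hypothetical, and CB2_SIZE is an
arbitrary assumption since no size was proposed in this thread. Only
cb[] is cleared at init and copied at clone; cb2[] is left for the
owning layer to manage, and every buffer grows by CB2_SIZE bytes
whether or not it uses them.

```c
#include <assert.h>
#include <string.h>

#define CB_SIZE  48   /* size of skb->cb[] */
#define CB2_SIZE 32   /* assumption, for illustration only */

/* Stand-in for today's layout: control buffer only. */
struct buf_plain {
	unsigned char cb[CB_SIZE];
};

/* Stand-in for the proposed layout with a second, unmanaged area. */
struct buf_cb2 {
	unsigned char cb[CB_SIZE];
	unsigned char cb2[CB2_SIZE];
};

/* The memory cost is paid by every buffer, used or not. */
static_assert(sizeof(struct buf_cb2) ==
	      sizeof(struct buf_plain) + CB2_SIZE,
	      "cb2[] grows every buffer by CB2_SIZE bytes");

/* Allocation path: clear cb[] only, cb2[] is not touched. */
static void buf_cb2_init(struct buf_cb2 *b)
{
	memset(b->cb, 0, sizeof(b->cb));
}

/* Clone path: copy cb[] only, cb2[] is not touched. */
static void buf_cb2_clone(struct buf_cb2 *dst, const struct buf_cb2 *src)
{
	memcpy(dst->cb, src->cb, sizeof(dst->cb));
}
```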

If you prefer that, ok, but I don't see why we can't unify them behind
a single layer?
