On Tue, Mar 13, 2018 at 05:07:22PM +0200, Liran Alon wrote:
> Before this commit, dev_forward_skb() always cleared packet's
> per-network-namespace info. Even if the packet doesn't cross
> network namespaces.
>
> The comment above dev_forward_skb() describes that this is done
> because the receiving device may be in another network namespace.
> However, this case can easily be tested for and therefore we can
> scrub packet's per-network-namespace info only when receiving device
> is indeed in another network namespace.
>
> Therefore, this commit changes ____dev_forward_skb() to tell
> skb_scrub_packet() that skb has crossed network-namespace only in case
> transmitting device (skb->dev) network namespace is different then
> receiving device (dev) network namespace.
>
> An example of a netdev that use skb_forward_skb() is veth.
> Thus, before this commit a packet transmitted from one veth peer to
> another when both veth peers are on same network namespace will lose
> it's skb->mark. The bug could easily be demonstrated by the following:
>
> ip netns add test
> ip netns exec test bash
> ip link add veth-a type veth peer name veth-b
> ip link set veth-a up
> ip link set veth-b up
> ip addr add dev veth-a 12.0.0.1/24
> tc qdisc add dev veth-a root handle 1 prio
> tc qdisc add dev veth-b ingress
> tc filter add dev veth-a parent 1: u32 match u32 0 0 action skbedit mark 1337
> tc filter add dev veth-b parent ffff: basic match 'meta(nf_mark eq 1337)'
> action simple "skb->mark 1337!"
> dmesg -C
> ping 12.0.0.2
> dmesg
>
> Before this change, the above will print nothing to dmesg.
> After this change, "skb->mark 1337!" will be printed as necessary.

Hi Liran,

>
> Signed-off-by: Liran Alon <liran.a...@oracle.com>
> Reviewed-by: Yuval Shaia <yuval.sh...@oracle.com>
> Signed-off-by: Yuval Shaia <yuval.sh...@oracle.com>

I did not earn the credit for an SOB, only an r-b.

Yuval

> ---
>  include/linux/netdevice.h | 2 +-
>  net/core/dev.c            | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 5eef6c8e2741..5908f1e31ee2 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3371,7 +3371,7 @@ static __always_inline int ____dev_forward_skb(struct
> net_device *dev,
>  		return NET_RX_DROP;
>  	}
>
> -	skb_scrub_packet(skb, true);
> +	skb_scrub_packet(skb, !net_eq(dev_net(dev), dev_net(skb->dev)));
>  	skb->priority = 0;
>  	return 0;
>  }
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 2cedf520cb28..087787dd0a50 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1877,9 +1877,9 @@ int __dev_forward_skb(struct net_device *dev, struct
> sk_buff *skb)
>   * start_xmit function of one device into the receive queue
>   * of another device.
>   *
> - * The receiving device may be in another namespace, so
> - * we have to clear all information in the skb that could
> - * impact namespace isolation.
> + * The receiving device may be in another namespace.
> + * In that case, we have to clear all information in the
> + * skb that could impact namespace isolation.
>   */
>  int dev_forward_skb(struct net_device *dev, struct sk_buff *skb)
>  {
> --
> 1.9.1
>