On Wed, Dec 20, 2017 at 12:12 AM, David Miller <da...@davemloft.net> wrote:
> From: Xin Long <lucien....@gmail.com>
> Date: Mon, 18 Dec 2017 14:20:56 +0800
>
>> Unlike ip tunnels, now vxlan doesn't do any pmtu update for
>> upper dst pmtu, even if it doesn't match the lower dst pmtu
>> any more.
>>
>> The problem can be reproduced when reducing the vxlan lower
>> dev's pmtu when running netperf. In jianlin's testing, the
>> performance went to 1/7 of the previous.
>>
>> This patch is to update the upper dst pmtu to match the lower
>> dst pmtu on tx path so that packets can be sent out even when
>> lower dev's pmtu has been changed.
>>
>> It also works for metadata dst.
>>
>> Note that this patch doesn't process any pmtu icmp packet.
>> But even in the future, the support for pmtu icmp packets
>> process of udp tunnels will also needs this.
>>
>> The same thing will be done for geneve in another patch.
>>
>> Signed-off-by: Xin Long <lucien....@gmail.com>
>
> Yikes...
>
> You're going to have to find a way to fix this without
> invoking ->update_pmtu() on every single transmit.  That's
> really excessive, especially for an operation which is
> going to be a NOP %99.9999 of the time.
Understood. I couldn't find a better way, and all ip tunnels are
currently doing it this way.
Or is it possible to go with an unlikely() here?

	if (unlikely(skb_dst(skb) && mtu < dst_mtu(skb_dst(skb))))
		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);

>
> We need some way, instead, for the MTU change event to propagate
> properly.  I know this might be hard, but doing this in the transmit
> handler on every packet to deal with it is not the way to go.
How about doing it in vxlan_get_route():

@@ -1896,6 +1896,13 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device
 		*saddr = fl4.saddr;
 		if (use_cache)
 			dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+
+		if (skb_dst(skb)) {
+			int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
+
+			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
+						       skb, mtu);
+		}

This would run only when there is no dst_cache hit and a real route
lookup has to be done.

Note that even when update_pmtu is hit, it will mostly do nothing and
just return, since the new mtu is usually >= skb_dst(skb)'s pmtu.

>
> Thanks.
>
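
To make the difference between the two proposals above concrete, here is a
minimal userspace sketch. It is not kernel code and does not use any kernel
API: struct mock_dst, struct mock_route_cache, mock_update_pmtu(),
tx_guarded() and get_route() are hypothetical names invented for
illustration, and the headroom value is an assumption (IPv4 + UDP + VXLAN +
inner Ethernet headers). It only counts how often the update callback would
run in each variant.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed overhead: 20 (IPv4) + 8 (UDP) + 8 (VXLAN) + 14 (Ethernet). */
#define VXLAN_HEADROOM (20 + 8 + 8 + 14)

/* Hypothetical stand-in for the upper dst entry; not a kernel structure. */
struct mock_dst {
	unsigned int mtu;          /* current upper path MTU */
	unsigned int update_calls; /* how many times the callback ran */
};

/* Hypothetical stand-in for the route cache consulted before a lookup. */
struct mock_route_cache {
	int valid;
	unsigned int lower_mtu;
};

/* Models ->update_pmtu(): counts the call and only ever lowers the MTU. */
static void mock_update_pmtu(struct mock_dst *dst, unsigned int mtu)
{
	assert(dst != NULL);
	dst->update_calls++;
	if (mtu < dst->mtu)
		dst->mtu = mtu;
}

/* Variant 1: check on every transmit, call only if the MTU shrank. */
static void tx_guarded(struct mock_dst *upper, unsigned int lower_mtu)
{
	unsigned int mtu = lower_mtu - VXLAN_HEADROOM;

	if (upper != NULL && mtu < upper->mtu)
		mock_update_pmtu(upper, mtu);
}

/* Variant 2: call only when the cache misses and a real lookup happens. */
static unsigned int get_route(struct mock_route_cache *cache,
			      struct mock_dst *upper,
			      unsigned int lower_mtu)
{
	if (cache->valid)
		return cache->lower_mtu; /* cache hit: no update at all */

	/* "Real lookup": refresh the cache, then propagate the MTU once. */
	cache->lower_mtu = lower_mtu;
	cache->valid = 1;
	if (upper != NULL)
		mock_update_pmtu(upper, lower_mtu - VXLAN_HEADROOM);
	return lower_mtu;
}
```

In this model, variant 1 still evaluates the comparison for every packet but
invokes the callback only when the lower MTU has actually dropped, while
variant 2 skips both on a cache hit. Variant 2 depends on the cache being
invalidated when the lower MTU changes, which the sketch assumes and does not
model.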