On Thu, 2017-09-21 at 11:06 +0200, Paweł Staszewski wrote: > > W dniu 2017-09-21 o 03:17, Eric Dumazet pisze: > > On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote: > >>> Thanks very much Pawel for the feedback. > >>> > >>> I was looking into the code (specifically IPv4 part) and found that in > >>> free_fib_info_rcu(), we call free_nh_exceptions() without holding the > >>> fnhe_lock. I am wondering if that could cause some race condition on > >>> fnhe->fnhe_rth_input/output so a double call on dst_dev_put() on the > >>> same dst could be happening. > >>> > >>> But as we call free_fib_info_rcu() only after the grace period, and > >>> the lookup code which could potentially modify > >>> fnhe->fnhe_rth_input/output all holds rcu_read_lock(), it seems > >>> fine... > >>> > >> Hi Pawel, > >> > >> Could you try the following debug patch on top of net-next branch and > >> reproduce the issue check if there are warning msg showing? > >> > >> diff --git a/include/net/dst.h b/include/net/dst.h > >> index 93568bd0a352..82aff41c6f63 100644 > >> --- a/include/net/dst.h > >> +++ b/include/net/dst.h > >> @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry > >> *dst, unsigned long time) > >> static inline struct dst_entry *dst_clone(struct dst_entry *dst) > >> { > >> if (dst) > >> - atomic_inc(&dst->__refcnt); > >> + dst_hold(dst); > >> return dst; > >> } > >> > >> Thanks. > >> Wei > >> > > > > Yes, we believe skb_dst_force() and skb_dst_force_safe() should be > > unified (to the 'safe' version) > > > > We no longer have gc to protect from 0 -> 1 transition of dst refcount. > > > > > > > > > > After adding patch from Wei > https://bugzilla.kernel.org/show_bug.cgi?id=197005#c14 >
OK we have two problems here 1) We need to unify skb_dst_force() ( for net tree ) 2) Vlan devices should try to correctly handle IFF_XMIT_DST_RELEASE from lower device. This will considerably help your performance. For 1), this is what I had in mind, can you try it ? Thanks a lot ! diff --git a/include/net/dst.h b/include/net/dst.h index 93568bd0a3520bb7402f04d90cf04ac99c81cfbe..f23851eeaad917e8dafc06b58d23a2575405c894 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry *dst, unsigned long time) static inline struct dst_entry *dst_clone(struct dst_entry *dst) { if (dst) - atomic_inc(&dst->__refcnt); + dst_hold(dst); return dst; } @@ -311,21 +311,6 @@ static inline void skb_dst_copy(struct sk_buff *nskb, const struct sk_buff *oskb __skb_dst_copy(nskb, oskb->_skb_refdst); } -/** - * skb_dst_force - makes sure skb dst is refcounted - * @skb: buffer - * - * If dst is not yet refcounted, let's do it - */ -static inline void skb_dst_force(struct sk_buff *skb) -{ - if (skb_dst_is_noref(skb)) { - WARN_ON(!rcu_read_lock_held()); - skb->_skb_refdst &= ~SKB_DST_NOREF; - dst_clone(skb_dst(skb)); - } -} - /** * dst_hold_safe - Take a reference on a dst if possible * @dst: pointer to dst entry @@ -356,6 +341,23 @@ static inline void skb_dst_force_safe(struct sk_buff *skb) } } +/** + * skb_dst_force - makes sure skb dst is refcounted + * @skb: buffer + * + * If dst is not yet refcounted, let's do it + */ +static inline void skb_dst_force(struct sk_buff *skb) +{ + if (skb_dst_is_noref(skb)) { + struct dst_entry *dst = skb_dst(skb); + + WARN_ON(!rcu_read_lock_held()); + if (!dst_hold_safe(dst)) + dst = NULL; + skb->_skb_refdst = (unsigned long)dst; + } +} /** * __skb_tunnel_rx - prepare skb for rx reinsert