Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-16 Thread Cong Wang
On Tue, May 16, 2017 at 10:53 AM, Cong Wang wrote: > How about adding a check for ->fib_dead in these work's? > If the last treeref is gone, probably it is no longer needed > to continue to do the offloading work. Hmm, this doesn't look correct either, we may miss an offloading work if fib_releas

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-16 Thread Cong Wang
On Tue, May 16, 2017 at 12:46 AM, Julian Anastasov wrote: > > Hello, > > On Mon, 15 May 2017, Cong Wang wrote: > >> On Mon, May 15, 2017 at 1:37 PM, Julian Anastasov wrote: >> > Any user that does not set FIB_LOOKUP_NOREF >> > will need nh_dev refcounts. The assumption is that the

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-16 Thread Julian Anastasov
Hello, On Mon, 15 May 2017, Cong Wang wrote: > On Mon, May 15, 2017 at 1:37 PM, Julian Anastasov wrote: > > Any user that does not set FIB_LOOKUP_NOREF > > will need nh_dev refcounts. The assumption is that the > > NHs are accessed, who knows, may be even after RCU grace > > per

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-15 Thread Cong Wang
On Mon, May 15, 2017 at 1:37 PM, Julian Anastasov wrote: > Any user that does not set FIB_LOOKUP_NOREF > will need nh_dev refcounts. The assumption is that the > NHs are accessed, who knows, may be even after RCU grace > period. As result, we can not use dev_put on NETDEV_UNREGISTER. > So,

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-15 Thread Julian Anastasov
Hello, On Mon, 15 May 2017, Cong Wang wrote: > On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov wrote: > > Now the main question: is FIB_LOOKUP_NOREF used > > everywhere in IPv4? I guess so. If not, it means > > someone can walk its res->fi NHs which is bad. I think, > > this w

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-15 Thread Cong Wang
On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov wrote: > Now the main question: is FIB_LOOKUP_NOREF used > everywhere in IPv4? I guess so. If not, it means > someone can walk its res->fi NHs which is bad. I think, > this will delay the unregistration for long time and we > can not solve

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-12 Thread Julian Anastasov
Hello, On Fri, 12 May 2017, Cong Wang wrote: > On Thu, May 11, 2017 at 11:39 PM, Julian Anastasov wrote: > > > > fib_flush will unlink the FIB infos at NETDEV_UNREGISTER > > time, so we can not see them in any hash tables later on > > NETDEV_UNREGISTER_FINAL. fib_put_nh_devs() c

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-12 Thread Cong Wang
On Fri, May 12, 2017 at 1:58 PM, Cong Wang wrote: > On Fri, May 12, 2017 at 10:27 AM, Cong Wang wrote: >> Or maybe don't touch nh_oif but using a new flag in >> nh_flags?? We only need to know if we should call >> dev_put() in free_fib_info_rcu(). >> >> Again, I am still not sure if it is any bet

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-12 Thread Cong Wang
On Fri, May 12, 2017 at 10:27 AM, Cong Wang wrote: > Or maybe don't touch nh_oif but using a new flag in > nh_flags?? We only need to know if we should call > dev_put() in free_fib_info_rcu(). > > Again, I am still not sure if it is any better than just > putting these fib_nh into a linked list. >

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-12 Thread Cong Wang
On Thu, May 11, 2017 at 9:55 PM, Eric Dumazet wrote: > On Thu, 2017-05-11 at 18:22 -0700, Cong Wang wrote: >> On Thu, May 11, 2017 at 5:07 PM, Cong Wang wrote: >> > So, if I understand you correctly it is safe to NULL'ing >> > nh_dev in NETDEV_UNREGISTER_FINAL, right? >> > >> > If still not, how

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-12 Thread Cong Wang
On Thu, May 11, 2017 at 11:39 PM, Julian Anastasov wrote: > > Hello, > > On Thu, 11 May 2017, Cong Wang wrote: > >> On Thu, May 11, 2017 at 5:07 PM, Cong Wang wrote: >> > So, if I understand you correctly it is safe to NULL'ing >> > nh_dev in NETDEV_UNREGISTER_FINAL, right? >> > >> > If s

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-11 Thread Julian Anastasov
Hello, On Thu, 11 May 2017, Cong Wang wrote: > On Thu, May 11, 2017 at 5:07 PM, Cong Wang wrote: > > So, if I understand you correctly it is safe to NULL'ing > > nh_dev in NETDEV_UNREGISTER_FINAL, right? > > > > If still not, how about transfer nh_dev's to loopback_dev > > too in NETDEV

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-11 Thread Eric Dumazet
On Thu, 2017-05-11 at 18:22 -0700, Cong Wang wrote: > On Thu, May 11, 2017 at 5:07 PM, Cong Wang wrote: > > So, if I understand you correctly it is safe to NULL'ing > > nh_dev in NETDEV_UNREGISTER_FINAL, right? > > > > If still not, how about transfer nh_dev's to loopback_dev > > too in NETDEV_UNR

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-11 Thread Cong Wang
On Thu, May 11, 2017 at 5:07 PM, Cong Wang wrote: > So, if I understand you correctly it is safe to NULL'ing > nh_dev in NETDEV_UNREGISTER_FINAL, right? > > If still not, how about transfer nh_dev's to loopback_dev > too in NETDEV_UNREGISTER? Like we transfer dst->dev. > > I don't want to touch th

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-11 Thread Cong Wang
On Wed, May 10, 2017 at 12:51 PM, Julian Anastasov wrote: > Oh, well, the sockets can hold cached dst. > But if the promise is that rt->fi is used only as > reference to metrics we have to find a way to drop > the dev references at NETDEV_UNREGISTER time. If you > set nh_dev to NULL then a

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-10 Thread Julian Anastasov
Hello, On Wed, 10 May 2017, Cong Wang wrote: > On Wed, May 10, 2017 at 12:38 AM, Julian Anastasov wrote: > > > > During NETDEV_UNREGISTER packets for dev should not > > be flying but packets for other devs can walk the nexthops > > for multipath routes. It is the rcu_barrier bef

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-10 Thread Cong Wang
On Wed, May 10, 2017 at 12:38 AM, Julian Anastasov wrote: > > Hello, > > On Tue, 9 May 2017, Cong Wang wrote: > >> > Also setting nexthop_nh->nh_dev to NULL looks quite dangerous >> > >> > We have plenty of sites doing : >> > >> > if (fi->fib_dev) >> > x = fi->fib_dev->field >> > >> >

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-10 Thread Cong Wang
On Tue, May 9, 2017 at 4:51 PM, Eric Dumazet wrote: > On Tue, 2017-05-09 at 16:35 -0700, Cong Wang wrote: > >> This statement is only used to ensure we pass the "dead == fi->fib_nhs" >> check right below the inner loop, it is fine to keep it without break since >> fi is not changed in the inner lo

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-10 Thread Cong Wang
On Tue, May 9, 2017 at 4:50 PM, Eric Dumazet wrote: > On Tue, 2017-05-09 at 16:35 -0700, Cong Wang wrote: > >> All of them take RCU read lock, so, as I explained in the code comment, >> they all should be fine because of synchronize_net() on unregister path. >> Do you see anything otherwise? > > T

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-10 Thread Julian Anastasov
Hello, On Tue, 9 May 2017, Cong Wang wrote: > > Also setting nexthop_nh->nh_dev to NULL looks quite dangerous > > > > We have plenty of sites doing : > > > > if (fi->fib_dev) > > x = fi->fib_dev->field > > > > fib_route_seq_show() is one example. > > > > All of them take RCU read lo

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 16:35 -0700, Cong Wang wrote: > This statement is only used to ensure we pass the "dead == fi->fib_nhs" > check right below the inner loop, it is fine to keep it without break since > fi is not changed in the inner loop. > So the dead++ above wont end up with (dead > fi->fi

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 16:35 -0700, Cong Wang wrote: > All of them take RCU read lock, so, as I explained in the code comment, > they all should be fine because of synchronize_net() on unregister path. > Do you see anything otherwise? They might take rcu lock, but compiler is still allowed to read

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Cong Wang
On Tue, May 9, 2017 at 4:09 PM, Eric Dumazet wrote: > On Tue, 2017-05-09 at 15:54 -0700, Eric Dumazet wrote: >> On Tue, 2017-05-09 at 15:52 -0700, Eric Dumazet wrote: >> > On Tue, 2017-05-09 at 15:07 -0700, Cong Wang wrote: >> > > On Tue, May 9, 2017 at 1:56 PM, Cong Wang >> > > wrote: >> > > >

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 15:54 -0700, Eric Dumazet wrote: > On Tue, 2017-05-09 at 15:52 -0700, Eric Dumazet wrote: > > On Tue, 2017-05-09 at 15:07 -0700, Cong Wang wrote: > > > On Tue, May 9, 2017 at 1:56 PM, Cong Wang > > > wrote: > > > > Wait... if we transfer dst->dev to loopback_dev because we d

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 15:52 -0700, Eric Dumazet wrote: > On Tue, 2017-05-09 at 15:07 -0700, Cong Wang wrote: > > On Tue, May 9, 2017 at 1:56 PM, Cong Wang wrote: > > > Wait... if we transfer dst->dev to loopback_dev because we don't > > > want to block unregister path, then we might have a similar

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 15:07 -0700, Cong Wang wrote: > On Tue, May 9, 2017 at 1:56 PM, Cong Wang wrote: > > Wait... if we transfer dst->dev to loopback_dev because we don't > > want to block unregister path, then we might have a similar problem > > for rt->fi too, fib_info is still referenced by ds

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Cong Wang
On Tue, May 9, 2017 at 1:56 PM, Cong Wang wrote: > Wait... if we transfer dst->dev to loopback_dev because we don't > want to block unregister path, then we might have a similar problem > for rt->fi too, fib_info is still referenced by dst, so these nh_dev's still > hold the dev references... > I

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Cong Wang
On Tue, May 9, 2017 at 9:56 AM, Eric Dumazet wrote: > On Tue, 2017-05-09 at 09:44 -0700, Cong Wang wrote: > >> >> Eric, how did you produce it? >> I guess it's because of nh_dev which is the only netdevice pointer inside >> fib_info. Let me take a deeper look. >> > > Nothing particular, I am using

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Eric Dumazet
On Tue, 2017-05-09 at 09:44 -0700, Cong Wang wrote: > > Eric, how did you produce it? > I guess it's because of nh_dev which is the only netdevice pointer inside > fib_info. Let me take a deeper look. > Nothing particular, I am using kexec to boot new kernels, and all my attempts with your patc

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-09 Thread Cong Wang
On Mon, May 8, 2017 at 7:18 PM, Eric Dumazet wrote: > On Mon, 2017-05-08 at 21:22 -0400, David Miller wrote: >> From: Eric Dumazet >> Date: Mon, 08 May 2017 17:01:20 -0700 >> >> > On Mon, 2017-05-08 at 14:35 -0400, David Miller wrote: >> >> From: Cong Wang >> >> Date: Thu, 4 May 2017 14:54:17 -

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-08 Thread David Miller
From: Eric Dumazet Date: Mon, 08 May 2017 19:18:22 -0700 > Keeping fib (and their reference to netdev) is apparently too much, > we probably need to implement a refcount on the metrics themselves, > being stand alone objects. I think I see the problem, when we flush the dst cache on device unreg

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-08 Thread Eric Dumazet
On Mon, 2017-05-08 at 21:22 -0400, David Miller wrote: > From: Eric Dumazet > Date: Mon, 08 May 2017 17:01:20 -0700 > > > On Mon, 2017-05-08 at 14:35 -0400, David Miller wrote: > >> From: Cong Wang > >> Date: Thu, 4 May 2017 14:54:17 -0700 > >> > >> > IPv4 dst could use fi->fib_metrics to stor

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-08 Thread David Miller
From: Eric Dumazet Date: Mon, 08 May 2017 17:01:20 -0700 > On Mon, 2017-05-08 at 14:35 -0400, David Miller wrote: >> From: Cong Wang >> Date: Thu, 4 May 2017 14:54:17 -0700 >> >> > IPv4 dst could use fi->fib_metrics to store metrics but fib_info >> > itself is refcnt'ed, so without taking a re

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-08 Thread Eric Dumazet
On Mon, 2017-05-08 at 14:35 -0400, David Miller wrote: > From: Cong Wang > Date: Thu, 4 May 2017 14:54:17 -0700 > > > IPv4 dst could use fi->fib_metrics to store metrics but fib_info > > itself is refcnt'ed, so without taking a refcnt fi and > > fi->fib_metrics could be freed while dst metrics s

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-08 Thread David Miller
From: Cong Wang Date: Thu, 4 May 2017 14:54:17 -0700 > IPv4 dst could use fi->fib_metrics to store metrics but fib_info > itself is refcnt'ed, so without taking a refcnt fi and > fi->fib_metrics could be freed while dst metrics still points to > it. This triggers use-after-free as reported by An

Re: [Patch net] ipv4: restore rt->fi for reference counting

2017-05-07 Thread Eric Dumazet
On Thu, 2017-05-04 at 14:54 -0700, Cong Wang wrote: > IPv4 dst could use fi->fib_metrics to store metrics but fib_info > itself is refcnt'ed, so without taking a refcnt fi and > fi->fib_metrics could be freed while dst metrics still points to > it. This triggers use-after-free as reported by Andrey

[Patch net] ipv4: restore rt->fi for reference counting

2017-05-04 Thread Cong Wang
IPv4 dst could use fi->fib_metrics to store metrics but fib_info itself is refcnt'ed, so without taking a refcnt fi and fi->fib_metrics could be freed while dst metrics still points to it. This triggers use-after-free as reported by Andrey twice. This patch reverts commit 2860583fe840 ("ipv4: Kill