On Tue, 2017-05-09 at 09:44 -0700, Cong Wang wrote:
>
> Eric, how did you produce it?
> I guess it's because of nh_dev which is the only netdevice pointer inside
> fib_info. Let me take a deeper look.
>

Nothing particular, I am using kexec to boot new kernels, and all my
attempts with your patch included demonstrated the issue.

eth0 is a bonding device, it might matter, I do not know.

We also have some tunnels, but unfortunately I cannot provide a setup
that you could use on, say, a VM.

I can send you the .config if this can help.

> >>
> >> I am assuming you are quite confident it is this change?
> >
> > At least, reverting the patch resolves the issue for me.
> >
> > Keeping fib (and their reference to netdev) is apparently too much,
> > we probably need to implement a refcount on the metrics themselves,
> > being stand alone objects.
>
> I don't disagree, just that it may need to change too much code which
> goes beyond a stable candidate.

Well, your choice, but dealing with a full blown fib and its
dependencies looks fragile to me.
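
To make the "refcount on the metrics themselves" idea concrete, here is a
rough userspace sketch, not kernel code and not a proposed patch. All names
(metrics_blob, metrics_alloc, metrics_get, metrics_put, METRICS_MAX) are
hypothetical, and C11 atomics stand in for whatever refcount helper the real
change would use. The point is only that a holder pins the metrics object
alone, without keeping a fib_info (and its netdev reference) alive.

```c
/* Userspace illustration only; names and sizes are assumptions. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define METRICS_MAX 16	/* placeholder size, not the kernel's value */

/* Stand-alone metrics object carrying its own reference count. */
struct metrics_blob {
	atomic_int refcnt;
	uint32_t values[METRICS_MAX];
};

/* Allocate a metrics object holding one reference for the caller. */
static struct metrics_blob *metrics_alloc(const uint32_t *src)
{
	struct metrics_blob *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcnt, 1);
	memcpy(m->values, src, sizeof(m->values));
	return m;
}

/* Take an extra reference, e.g. when a second user shares the metrics. */
static struct metrics_blob *metrics_get(struct metrics_blob *m)
{
	atomic_fetch_add(&m->refcnt, 1);
	return m;
}

/* Drop a reference; returns 1 if this was the last one and m was freed. */
static int metrics_put(struct metrics_blob *m)
{
	int old = atomic_fetch_sub(&m->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(m);
		return 1;
	}
	return 0;
}
```

With this shape, the creator can drop its reference while another user still
reads the values, and the object is freed only on the final metrics_put().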