Tue, Oct 11, 2016 at 04:18:52AM CEST, dsah...@gmail.com wrote: >Jiri / Veaceslav: > >As author's of the adjacency tracking code in dev.c I am hoping you can help >with suggested patches for a couple of problems. The start point needs to >include commit 93409033ae65 which resolved a different problem from what I am >seeing now. > >At the moment I have 2 cases both for this topology: > +--------+ > | myvrf | > +--------+ > | | > | +---------+ > | | macvlan | > | +---------+ > | | > +----------+ > | bridge | > +----------+ > | > +--------+ > | bond0 | > +--------+ > | > +--------+ > | eth3 | > +--------+ > > >Base set of commands for both cases: > >ip link add bond1 type bond >ip link set bond1 up >ip link set eth3 down >ip link set eth3 master bond1 >ip link set eth3 up > >ip link add bridge type bridge >ip link set bridge up >ip link add macvlan link bridge type macvlan >ip link set macvlan up > >ip link add myvrf type vrf table 1234 >ip link set myvrf up > >ip link set bridge master myvrf > > >############################################################ ># case 1 > >ip link set macvlan master myvrf >ip link set bond1 master bridge > >ip link delete myvrf > >dmesg has a splat triggered in __netdev_adjacent_dev_remove() where you >currently see the BUG(). If you convert that to a WARN_ON (which it should be, >no need to panic on the remove path) it will show you 4 missing adjacencies: >eth3 - myvrf, mvrf - eth3, bond1 - myvrf and myvrf - bond1. All of those are >because the dev_link function does not link macvlan lower devices to myvrf >when it is enslaved. (Enable the debugging to see that those messages are >missing.) > > > >############################################################ ># case 2 > >This case just flips the ordering of the enslavements: > >ip link set bond1 master bridge >ip link set macvlan master myvrf > >Then run: >ip link delete bond1 >ip link delete myvrf > >The last command hangs because myvrf has a reference that has not been >released. If you do not have commit 93409033ae65 the delete of bond1 hangs for >the same reason. For this case, the debug messages show that the macvlan lower >devices (eth3 and bond1) are connected to myvrf on the enslavement, but the >link delete the path only removes one of them hence the unreleased refcnt on >myvrf. > > >In the end it seems that the code for the dependency graph is not making the >complete mesh which causes problems on the tear down. I have attempted a few >changes that so far fix 1 problem and uncover a different one. Hence the >request for help from the author's of this code.
Agreed. We need to fix the code to work with duplicates so the graph is complete. > >It seems like the complete mesh is not really needed, but cscope shows >spectrum, ixgbe and bonding all using the for_each upper and lower device >macros. > >Suggestions? Well other possibility is to traverse the tree recursively. But that is exactly why the colided lists of all uppers/lowers were introduced to avoid this. > >David