Hi, 2016-02-16 18:48 GMT+01:00 Ondrej Zajicek <santi...@crfreenet.org>:
> On Wed, Feb 10, 2016 at 04:38:03PM +0100, Mikhail Sennikovskii wrote: > > Hello Fellows, > > > > as a follow-up for the recent discussion about BIRD support of IPv6 ECMP > on Linux > > ( http://trubka.network.cz/pipermail/bird-users/2016-January/010163.html > ), > > I've created a patch, which addresses this problem. > > Hi > > Thanks for the patch, i have several comments: > > 1) In the patch, the incoming direction (collection from kernel) is > handled in unix/krt.c, while outgoing direction (krt_replace_rte()) is > handled in linux/netlink.c . It would make more sense to handle both > directions in the same place. As BIRD uses internally representation > with one route and mpnh structure, i would prefer to move collection > also to linux/netlink.c, so generic code in unix/krt.c would not > care about OS-specific interface (and does not need any changes). > Linux code would just collect next hops and then call krt_got_route() > for the whole multipath route. > Agreed, will fix. 2) Function krt_replace_rte() should be smarter than just flushing the > whole old multipath rte. It should compare old and new next hops (we > could suppose they are sorted) and either add missing I'm not sure for now whether it's possible to make this work, as netlink returns EEXIST whenever one tries to add a route for the network, already present in the routing table. This is why removing and re-installing seemed the best solution for me. or remove > surplussing next hops by calling nl_send_route(). > Just removing should work, yes. > 3) When collecting received routes, you should avoid creating list > of routes for the same net, and then merging such list by rt_merge_list(), > you could simply collect next hops (struct mpnh *) for the current route. > I can do it this way. > > 4) Also rt_merge_list() uses mpnh_merge_rta() which uses rte_update_pool. > That is probably not safe outside of nest update code. The linux netlink > code should have its own linpool (static and shared, because netlink socket > is also shared), so you could allocate mpnh structures from it. The linpool > should be flushed when the route is collected and propagated to > krt_got_route(). > So you don't need to do rta_lookup() for temporary next hops. > Right, will do. > Note that such linpool could be also used for IPv4 multipath next hops in > nl_parse_multipath() instead of nh_buffer. > I can add this fix as well, probably as a separate patch. > Note that the patch still does not support the learn mode for IPv6 ECMP. > > The support for it can be added later. > > Well, the learn mode is not really a problem. With the modifications > mentioned above, learning would work automatically (it is handled by > unix/krt.c, regardless of how route is generated). > > What is problematic is handling asynchronous notifications (because you > don't know inside netlink code what are other next hops and whether you > will get another one related to the same net). > Sorry, I guess I just mixed up with bird terminology. I thought the learn mode is directly related to handling these asynchronous notifications and adjusting the bird routes set accordingly. Regards, Mikhail > > -- > Elen sila lumenn' omentielvo > > Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org) > OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) > "To err is human -- to blame it on a computer is even more so." >