On Sat, Jan 31, 2015 at 03:06:37PM +0100, Ondrej Zajicek wrote:
> On Sat, Jan 31, 2015 at 02:47:51PM +0100, Baptiste Jonglez wrote:
> > This took several minutes to complete, and there certainly isn't so much
> > IPv6 routes in the kernel: routes appear several times in the output of
> > "ip -6 r". Running this command multiple times yields very different
> > results each time.
> >
> > Thus, I don't think the bug is in Bird. Could it be some kind of race
> > condition with netlink? I haven't been able to find any reference to this
> > bug, either in the kernel or in iproute2. For reference, this is on a
> > Debian wheezy system, but I can reproduce the duplicate routes in "ip -6 r"
> > on Debian jessie as well.
>
> Interesting, what are the kernel versions in these Wheezy and Jessie systems?
> Does the problem (with 'ip -6 r') appears also when BIRD is not running?

So far, I saw this behaviour in Wheezy (3.2.0-4-amd64 + iproute
20120521-3+b3), and Jessie (3.14.4-1 + iproute2 3.16.0-2). In both cases,
Bird was running.

I tried shutting down bird6 (with "persist" in the kernel protocol, so that
routes stay in the kernel), and the problem seemed to persist:

    $ ip -6 r | wc -l
    26720
    $ ip -6 r | wc -l
    21794
    $ ip -6 r | wc -l
    50321
    $ ip -6 r | wc -l
    37602
    $ ip -6 r | wc -l
    59011

However, the amplitude is much smaller (when Bird is running and talking to
the kernel, doing the same thing yields millions of routes).

> I wonder what factors are specific to this problem. I remember there were
> a similar report or two few years ago, but these reports are too uncommon
> to be an universal problem in IPv6 Linux forwarding.

I actually found a way to reproduce this without Bird, just by inserting
static routes. It seems that dumping the routing table while some routes
are being inserted is enough to trigger the bug.

Simple way to see the issue: run this in one shell

    watch -n 0.2 "ip -6 r | wc -l"

and run this in another, root, shell:

    for i in {0000..3999}; do ip -6 r add unreachable 2001:db8:$i::/48 proto 57; sleep 0.005; done; sleep 30; ip -6 r flush proto 57

Then look at the output of watch: at first, the number of routes grows
steadily, and then, at some point, it starts jumping up and down.

It should be possible to write a small program that automatically tests for
the presence of the bug (for instance by checking that the number of routes
reported by "ip -6 r" never decreases while routes are being added).

So far, using the little shell script above, I saw the issue with:

- Wheezy (3.2.0-4-amd64 + iproute 20120521-3+b3)

The bug didn't show up with:

- Wheezy with a multipath TCP kernel (3.14.27 + iproute 201407242100)
- Archlinux (3.18.2 + iproute2 3.17.0)
- Archlinux (3.14.27-1-mptcp + iproute2 3.18.0-1)
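
A rough sketch of such a test program, in Python rather than shell. It is untested against an affected kernel. The function names (dump_routes, duplicated, first_decrease, monitor) and the default sample count and interval are made up for illustration. It samples "ip -6 route" repeatedly, flags dumps in which the same route line appears more than once, and reports the first point where the route count goes down. Treating identical output lines as duplicates is an assumption. It matches the symptom described above, but it has not been checked against every output format of ip.

```python
import subprocess
import time
from collections import Counter


def dump_routes():
    """Return the non-empty lines printed by "ip -6 route"."""
    out = subprocess.run(
        ["ip", "-6", "route"], capture_output=True, text=True, check=True
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def duplicated(lines):
    """Return the route lines that occur more than once in a single dump."""
    return sorted(line for line, n in Counter(lines).items() if n > 1)


def first_decrease(counts):
    """Return the index of the first count lower than its predecessor, or None."""
    for i in range(1, len(counts)):
        if counts[i] < counts[i - 1]:
            return i
    return None


def monitor(samples=100, interval=0.2):
    """Sample the routing table and report duplicates and decreasing counts.

    Only meaningful while routes are being added and none are removed.
    """
    counts = []
    for _ in range(samples):
        lines = dump_routes()
        counts.append(len(lines))
        dups = duplicated(lines)
        if dups:
            print(f"{len(dups)} duplicated route(s) in one dump, e.g. {dups[0]}")
        time.sleep(interval)
    i = first_decrease(counts)
    if i is not None:
        print(f"count dropped from {counts[i - 1]} to {counts[i]} at sample {i}")
    return counts
```

To try it, start the route insertion loop in a root shell and call monitor() from another shell while it runs. Sampling should stop before the final "ip -6 r flush proto 57", since the flush legitimately lowers the count.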