On Sat, Jan 08, 2022 at 12:03:52AM +0100, Tomas Hlavacek wrote:
> Hi!
>
> The large table that BIRD pulled from the kernel was a FNHE table
> where Linux collects PMTU records for *all* destination IPs that are
> routed to the tunnel (which does not seem to be right and I will
> discuss it in LKML shortly). These records have (default) 600s
> expiration time and in my scenario I happen to receive some
> backscatter traffic that in most cases gets ICMP or TCP reset
> responses that could ultimately create millions of these records in a
> few minutes.
>
> The reason why this problem occurred only in Linux ~5.2+ lies in the
> patch
> https://patchwork.ozlabs.org/project/netdev/patch/8d3b68cd37fb5fddc470904cdd6793fcf480c6c1.1561131177.git.sbri...@redhat.com/
> that changed the semantics of netlink dump requests. Now the kernel
> dumps the FIB Next Hop Exceptions table (previously known as route
> cache) alongside the RT unless the requester sets sockopt
> NETLINK_GET_STRICT_CHK and clears the flag RTM_F_CLONED in the dump
> request. BIRD does not apply the filters so the kernel dumps
> everything. But iproute2 and other programs that use netlink utilize
> the filters, so no similar performance issue occurs unless I
> explicitly dump the FNHE table (ip route show cache).
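
For reference, the filtered request described above can be sketched at the
wire level. This is only an illustration in Python, not BIRD code (BIRD is
written in C) and not the experimental fix mentioned later in this thread.
The function names are hypothetical; the numeric constants are copied by
hand from the Linux UAPI headers (linux/netlink.h, linux/rtnetlink.h) and
should be checked against the headers on the target system.

```python
import struct

# Constants copied by hand from linux/netlink.h and linux/rtnetlink.h
# (assumption: verify against the headers of the kernel in use).
NLM_F_REQUEST = 0x001
NLM_F_DUMP = 0x300            # NLM_F_ROOT | NLM_F_MATCH
RTM_GETROUTE = 26
RTM_F_CLONED = 0x200          # marks route cache / FNHE entries
SOL_NETLINK = 270
NETLINK_GET_STRICT_CHK = 12   # socket option, available since Linux 4.20
AF_INET = 2

# struct nlmsghdr: len, type, flags, seq, pid
NLMSGHDR = struct.Struct("=IHHII")
# struct rtmsg: family, dst_len, src_len, tos, table, protocol, scope,
# type, flags
RTMSG = struct.Struct("=BBBBBBBBI")


def build_route_dump_request(family=AF_INET, seq=1, want_cache=False):
    """Return the bytes of an RTM_GETROUTE dump request (hypothetical helper).

    With want_cache=False, RTM_F_CLONED is left clear, which (on a socket
    with strict checking enabled) asks for the routing table only.
    """
    rtm_flags = RTM_F_CLONED if want_cache else 0
    body = RTMSG.pack(family, 0, 0, 0, 0, 0, 0, 0, rtm_flags)
    header = NLMSGHDR.pack(
        NLMSGHDR.size + len(body),      # total message length
        RTM_GETROUTE,
        NLM_F_REQUEST | NLM_F_DUMP,
        seq,
        0,                              # pid 0: addressed to the kernel
    )
    return header + body


def open_strict_route_socket():
    """Open a NETLINK_ROUTE socket with strict checking on (Linux only)."""
    import socket
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                         socket.NETLINK_ROUTE)
    sock.setsockopt(SOL_NETLINK, NETLINK_GET_STRICT_CHK, 1)
    sock.bind((0, 0))
    return sock
```

On a Linux host one would call open_strict_route_socket(), send the result
of build_route_dump_request() on it, and read replies until NLMSG_DONE.
Without the sockopt, the flags field in the request is not used as a filter,
which matches the behaviour described in the quoted text.
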

Hi

Thanks, that seems like a plausible explanation. Being spammed by PMTU
cache entries when requesting route table dumps is a creative
interpretation of a stable API commitment :-(

> I believe that many different types of Linux tunnels create the PMTU
> records for all packets transmitted over the tunnel as well. And it
> works like that for a long time - the code that creates the route
> cache (at that time, now it is FNHE table) records has been introduced
> in Linux 3.10
> (https://elixir.bootlin.com/linux/v3.10/source/net/ipv4/ip_tunnel.c#L591).

If I understand it correctly, these PMTU records can also be a result of
regular TCP communication from/to the router even if there are no
tunnels?

> Regardless of what may or may not happen on the kernel side I think
> that implementing the netlink filter in BIRD to avoid the described
> situation makes sense. I am almost certain that my experimental fix
> breaks other things (most likely OSPF) but I would be glad to help
> make it right.

How could OSPF be affected by filters on the netlink socket?

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."