Package: iproute2
Version: 4.20.0-2
Severity: important
Control: found -1 5.4.0-1

The ipnetns.c:netns_add() function sets up the /var/run/netns mount
point in a way that is vulnerable to a race condition if multiple
processes enter the routine at the same time.

If the race condition is triggered, some kind of mount point recursion
explosion seems to happen, messing up the entire system in
"interesting" ways. For example, /proc/self/mountinfo ends up with tons
of duplicate entries, and the mountinfo file itself becomes so large
that the entire system tends to slow down. Also, subsequent "netns add"
commands might fail with the following error message (note that this
doesn't always happen):

  mount --bind /run/netns /run/netns failed: No space left on device

Since it is a race condition, the issue is hard to reproduce on its
own, but it is possible to force it to happen by using strace to inject
an artificial delay in the mount() system call. See below.

I observed this race condition multiple times on a real production
system during the boot process. This particular system sets up network
namespaces using systemd units. Because systemd is designed to start
units in parallel, and due to cold caches, multiple units running
"netns add" end up synchronizing with each other, making it quite
likely that the race condition will be triggered.

STEPS TO REPRODUCE

Do NOT follow this procedure on a system you care about. This procedure
WILL mess up your system and likely require you to reboot!

1. Start from a fresh system that has not run "netns add" since boot
   (or just unmount /var/run/netns manually). I can reproduce it on
   Debian Buster (iproute2 4.20.0-2) as well as latest Sid (5.4.0-1).

2. Run the following bash script:

---
for i in {0..9}
do
	strace -e trace=mount -e inject=mount:delay_exit=1000000 ip netns add "testnetns$i" 2>&1 | tee "$i.log" &
done
wait
---

3. Look at /proc/self/mountinfo. Hilarity ensues. If you increase the
   count in the script you might even get to see some "mount failed: No
   space left on device" errors.

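To put a number on the duplicate entries seen in step 3, something like
the following might help. This is only a sketch: count_mounts is a
hypothetical helper, not part of iproute2, and it assumes the standard
mountinfo layout where the mount point is the fifth field.

```shell
#!/bin/sh
# count_mounts FILE MOUNTPOINT
# Hypothetical helper: print how many entries in a mountinfo-format FILE
# have MOUNTPOINT as their mount point (assumed to be field 5).
count_mounts() {
    awk -v mp="$2" '$5 == mp { n++ } END { print n + 0 }' "$1"
}

# Example (read-only, safe to run):
#   count_mounts /proc/self/mountinfo /run/netns
```

A count that keeps growing as more "netns add" commands run would
suggest the race was hit.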
WORKAROUND

Make sure that the first "netns add" command that runs after boot
cannot run in parallel with any other "netns add" command. flock(1)
might be useful here. I guess setting up the /var/run/netns mount point
manually during boot, before any "netns add" command runs, might also
work.
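
A minimal sketch of the flock(1) idea, under a few assumptions: the
lock file path and the netns_add_locked wrapper name are made up for
illustration, and IP_CMD exists only so the wrapper can be dry-run
without touching the system. Every caller has to go through the same
lock file for this to serialize anything.

```shell
#!/bin/sh
# Hypothetical lock file path; any path shared by all callers should do.
NETNS_LOCKFILE="${NETNS_LOCKFILE:-/run/lock/netns-add.lock}"
# Overridable only to allow a dry run (e.g. IP_CMD=echo).
IP_CMD="${IP_CMD:-ip}"

# netns_add_locked NAME
# Run "ip netns add NAME" while holding an exclusive lock, so that
# concurrent callers run one at a time instead of in parallel.
netns_add_locked() {
    # flock(1) creates the lock file if needed, takes an exclusive lock,
    # runs the command, and releases the lock when the command exits.
    flock "$NETNS_LOCKFILE" "$IP_CMD" netns add "$1"
}
```

This serializes every "netns add", not just the first one after boot,
which is more than strictly needed but simpler to reason about.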