Re: ifupdown behaviour with IPv6 DAD failure (Was: proposal: Hybrid network stack for Trixie)

Philipp Kern Mon, 23 Sep 2024 08:49:18 -0700

On 23.09.24 13:39, Daniel Gröber wrote:

On Mon, Sep 23, 2024 at 12:25:15PM +0200, Chris Hofstaedtler wrote:

* Pierre-Elliott Bécue <p...@debian.org> [240923 11:34]:

I like ifupdown. It's simple and just works.


I find this quite funny, given a recent discussion about IPv6 DAD
issues with ifupdown on #debian-admin.


The "discussion" was about ifup@eth0 being in a failed state on a
particular server due to a DAD failure and someone having to manually
intervene.


I find my ghost being invoked here.

Chris, what behaviour do you expect here? Below I'm going to assume what
you're getting at is that we should continue to retry DAD.

To me going to a stable failure state seems desirable. Continuing to re-try
for IPs could cause instability in the face of legitimate address
conflicts: when the owning machine reboots the conflicting machine would
now win the IP due to continuous retrying. The change in owner would cause
disruption to services entirely unrelated to the machine that was just
rebooted.

DAD did not fail, it timed out after 60 sleeps of 0.1, aka 6s. The kernel
subsequently succeeded in configuring the network. The script in question
was added in response to [1] and [2] to have a pause during boot to give
the kernel time to resolve the situation before continuing the bootup. So
it left the race around because there's not that much it can do better as
a script-based setup without much state.
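To make the "60 sleeps of 0.1" arithmetic concrete, here is a minimal sketch of such a bounded polling loop. It is not the ifupdown script itself: the function name `wait_for_dad` and the injectable `is_tentative` and `sleep` callables are hypothetical, chosen only so the loop can be exercised without touching a real interface.

```python
import time
from typing import Callable


def wait_for_dad(is_tentative: Callable[[], bool],
                 attempts: int = 60,
                 delay: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the address leaves the tentative state.

    `is_tentative` is a hypothetical callback that reports whether the
    address is still tentative. Returns True if DAD settled within the
    budget and False on timeout. A False result only means the budget of
    attempts * delay seconds ran out; it is not proof that DAD failed.
    """
    for _ in range(attempts):
        if not is_tentative():
            return True
        sleep(delay)
    return False
```

With the defaults the loop gives up after 60 * 0.1 = 6 seconds. An address that becomes usable shortly after that is still reported as a timeout, which is the race described above.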

Unfortunately there's zero information from ifup@eth0 in the process as to
when that happened. Which adds to the frustrating debugging stories when
you can't get enough intel about what happened after the fact. (Which, to
be fair, also probably needs env vars to be set with systemd-networkd to
increase the debug level.) As far as I can see processes started listening
on the IP in question (that... again... wasn't logged because it's eaten
by the script) a second afterwards.

So no, it did not enter a stable state. It let the kernel do its thing,
which was to actually enable the address. I don't know why it takes Linux
that long to run DAD and what the assumptions around that are. But if you
listen on netlink you learn when that happens, don't need to poll, and
could send events once that happens.
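As a rough illustration of the netlink approach, the sketch below subscribes to IPv6 address notifications on Linux and classifies each RTM_NEWADDR message by its DAD-related flags. The numeric constants are copied from the Linux rtnetlink/if_addr headers; the function names `classify` and `watch` are hypothetical, and the sketch assumes the kernel emits an RTM_NEWADDR notification when an address leaves the tentative state.

```python
import socket
import struct

# Values from the Linux uapi headers (rtnetlink.h, if_addr.h).
RTM_NEWADDR = 20
RTMGRP_IPV6_IFADDR = 0x100
IFA_F_DADFAILED = 0x08
IFA_F_TENTATIVE = 0x40

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFADDRMSG = struct.Struct("=BBBBI")  # family, prefixlen, flags, scope, index


def classify(msg: bytes):
    """Return (ifindex, state) for an IPv6 RTM_NEWADDR message, else None."""
    if len(msg) < NLMSGHDR.size + IFADDRMSG.size:
        return None
    _length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(msg, 0)
    if msg_type != RTM_NEWADDR:
        return None
    family, _plen, flags, _scope, ifindex = IFADDRMSG.unpack_from(
        msg, NLMSGHDR.size)
    if family != socket.AF_INET6:
        return None
    if flags & IFA_F_DADFAILED:
        return ifindex, "dadfailed"
    if flags & IFA_F_TENTATIVE:
        return ifindex, "tentative"
    return ifindex, "usable"


def watch():
    """Print address state changes as the kernel reports them (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                         socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_IPV6_IFADDR))
    while True:
        data = sock.recv(65535)
        offset = 0
        # One datagram may carry several 4-byte-aligned netlink messages.
        while offset + NLMSGHDR.size <= len(data):
            (length,) = struct.unpack_from("=I", data, offset)
            if length < NLMSGHDR.size:
                break
            event = classify(data[offset:offset + length])
            if event is not None:
                print(event)
            offset += (length + 3) & ~3
```

Calling `watch()` blocks and reports each transition as it arrives, so a consumer could log the moment the address becomes usable instead of polling for it with a fixed budget.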

To be ultimately fair to ifupdown: There was probably not much of a
winning move here. The annoying bit was the systemd service that was still
in a failed state even though the failure condition resolved itself <1s
later.


Kind regards
Philipp Kern

[1] https://www.agwa.name/blog/post/beware_the_ipv6_dad_race_condition
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=705996
