Three years later (!), the problem still exists (on stretch).

Tracked it down to a conflict between the udev trigger script
(/lib/udev/ifupdown-hotplug) and the networking init script
(function ifup_hotplug()):

The udev trigger starts before the networking init script and calls
ifup on the interface, which keeps it locked until the call finishes.
Since the DHCP client keeps retrying for a while, the interface stays
locked for that whole time.

A few seconds later the networking init script comes along and
explicitly calls ifup on the very same interface, which is still locked.

That call stalls until one of the parallel-running ifup instances gives up.
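
To illustrate the effect (just an analogy using flock(1); the actual
lock ifup takes and its location differ between ifupdown versions):

    # stand-in for the udev-triggered ifup whose dhclient keeps
    # retrying: it grabs a lock and sits on it
    flock /tmp/iface.lock -c 'sleep 60' &

    # stand-in for the second ifup from ifup_hotplug(): it blocks here
    # until the first holder gives up the lock - exactly the boot stall
    flock /tmp/iface.lock -c 'echo finally got the lock'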

This is clearly an ugly bug that needs to be resolved - the current
logic is obviously wrong: running both the udev trigger and the
explicit ifup call against the same interface just doesn't work properly.

Here's a quick workaround (just tested on one box):

--- networking.orig     2016-09-16 00:00:00.000000000 +0200
+++ networking  2019-09-04 14:27:56.873709156 +0200
@@ -99,6 +99,17 @@
 }

 ifup_hotplug () {
+    ## This conflicts with /lib/udev/ifupdown-hotplug and causes
+    ## boot to hang for several minutes when an interface is configured
+    ## for DHCP and doesn't get any DHCP response.
+    ## In that case the udev trigger will retry for several minutes
+    ## and keep the interface locked, while the ifup_hotplug()
+    ## function tries to configure the interface again, causing the
+    ## bootup to stall while waiting for the lock.
+    ##
+    ## IMHO this function is completely wrong when the udev trigger
+    ## already does the probing.
+    return 0
     if [ -d /sys/class/net ]
     then
            ifaces=$(for iface in $(ifquery --list --allow=hotplug)


A cleaner solution would be to explicitly skip already-locked
interfaces here, along the lines of the sketch below.
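
Something like this inside ifup_hotplug() - only a rough sketch, not
the actual init script body; the pgrep-based check is a hypothetical
stand-in for "another ifup already holds the lock on this interface",
since the real lock location differs between ifupdown versions:

    ifup_hotplug () {
        [ -d /sys/class/net ] || return 0
        ifaces=$(for iface in $(ifquery --list --allow=hotplug 2>/dev/null)
                 do
                     # skip interfaces another ifup (e.g. the one started
                     # by the udev trigger) is already working on
                     if ! pgrep -f "ifup.*$iface" >/dev/null 2>&1
                     then
                         echo "$iface"
                     fi
                 done)
        [ -n "$ifaces" ] && ifup $ifaces || true
    }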


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287
