The more I look at this, the more I'm convinced *most* of the real problem lies in the ipconfig tool. Yes, various kernel changes seem to make it alternate between working and not working under these circumstances (which is bizarre), but unless something is specifically interfering with inter-process communication, ipconfig appears to be ignoring valid dhcp responses depending solely on whether you tell it to use "all" interfaces or name a specific interface.
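One way to avoid the "all interfaces" case altogether would be to call ipconfig once per interface and stop at the first one that gets a lease. A minimal sketch, not tested in a real initrd: the function name try_dhcp_each and the SYS_NET / IPCONFIG override variables are hypothetical (they are not part of initramfs-tools), the interface list is assumed to come from /sys/class/net, and the ipconfig invocation is simply the one that worked when run by hand in the bug description.

```shell
#!/bin/sh
# Sketch only: try DHCP on each interface in turn instead of all at once.
# SYS_NET and IPCONFIG are hypothetical override hooks for testing,
# not variables that initramfs-tools defines.

try_dhcp_each() {
    sys_net="${SYS_NET:-/sys/class/net}"
    ipconfig_cmd="${IPCONFIG:-ipconfig}"

    for path in "$sys_net"/*; do
        # Skip an unmatched glob and plain files (only directories count).
        [ -d "$path" ] || continue
        dev="${path##*/}"
        # Loopback never needs DHCP.
        [ "$dev" = "lo" ] && continue
        # Same invocation that succeeded manually in the bug description.
        if "$ipconfig_cmd" -t dhcp -d "$dev"; then
            echo "$dev"     # report which interface was configured
            return 0
        fi
    done
    return 1                # no interface could be configured
}
```

Calling try_dhcp_each prints the name of the first interface that ipconfig configured and returns 0, or returns 1 if none succeeded. Interfaces are tried in glob (alphabetical) order, so the boot is only delayed when the relevant interface is not the first.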
A small modification could be made to initramfs-tools to have it iterate over the interfaces in the system one at a time. It would marginally slow down the boot should the relevant interface not be the first, but it would get rid of this bug entirely. Alternatively, the initrd environment could be modified to use dhclient instead of ipconfig. dhclient appears to be in the initrd and works perfectly fine when called in a generic fashion, though the other initramfs-tools scripts seem to be aware that ipconfig didn't complete successfully, which I haven't looked into.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been (re?)introduced that is breaking dhcp booting in the initrd environment. This is stopping instances that use iscsi storage from being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails, at which point a simple ipconfig -t dhcp -d "ens2f0" works.
  The console output is slightly garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[ 728.379793] ixgbe 0000:13:00.0 ens2f0: changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[ 728.980448] ixgbe 0000:13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56 broadcast: 10.0.1.255 netmask: 255.255.255.0
   gateway: 10.0.1.1 [ 729.148410] ixgbe 0000:13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
   dns0 : 169.254.169.254 dns1 : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename : /ipxe.efi

  tcpdumps show that dhcp requests from the host are being received and responses are being sent, but the host does not accept them.

  When the ipconfig command is issued manually, an identical dhcp request and response exchange happens, only this time it is accepted. The messages don't appear to be sent or received incorrectly; ipconfig just silently ignores them.

  I was seeing this behaviour earlier this year, which I was able to fix by specifying "ip=dhcp" as a kernel parameter. About a month ago that was identified as causing us other problems (long story) and we dropped it, at which point we discovered the original bug was no longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the problem.

  I've compared the two initrds and effectively the only thing that has changed between them is the kernel components.
  Offending commit:

  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  The offending commit submission:
  https://lkml.org/lkml/2016/10/5/308

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp