We've been doing a lot more testing and debugging and I'd like to share our findings:
1) Unfortunately, it turns out this change does not fix the issue of interfaces not coming up correctly for a bond with a (static) network configuration. The race condition seems to be gone, so at least there are no more hangs between bonds and their vlan children. All the interfaces also report UP, both when running ifup and after a reboot. However:
- Running "ifup <slavename>" does bring up the bond (and its vlans) in a working state.
- Running "ifup -a" or rebooting does not actually work, causing "network not available" errors and "Destination Host Unreachable" when pinging other machines. Executing "ifdown -a; ifup -a" shows that ifupdown tries to bring up the bond BEFORE the slaves instead of the other way around. Even though the bond and its slaves report UP after the 60s timeout, they don't actually function.
- We're not seeing any issues with bonds that do not have a network configuration of their own.

2) The networking script stack / concept seems fundamentally flawed in three areas:
2.A) Bonds rely on the slaves having "bond-master" and on being started by bringing up the slaves. There is no support for the master having "bond-slaves" and for starting a bond by just bringing up the bond directly.
2.B) Bringing up a specific interface automatically brings up its child vlans. This does not make a lot of sense. The other way around does (e.g. in order to bring up a vlan we need to bring up its raw device), but why would the ifupdown scripts assume that I want to bring up all of its vlans when I bring up an interface that (also) serves as a raw device? In that case I would probably run "ifup -a"!
2.C) A vlan running on top of a bond cannot be brought up directly, because /sys/class/net/<bondname>/ does not exist. This results in the following:
> # ifup bo-adm.2
> Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
> cat: /sys/class/net/bo-adm/mtu: No such file or directory
> Device "bo-adm" does not exist.
> bo-adm does not exist, unable to create bo-adm.2
> run-parts: /etc/network/if-pre-up.d/vlan exited with return code 1
> Failed to bring up bo-adm.2.

3) Our new workaround for boot has become this very intrusive systemd service:
> [Unit]
> Wants=network-online.target
> After=network-online.target
>
> [Install]
> WantedBy=multi-user.target
>
> [Service]
> Type=oneshot
> ExecStartPre=/sbin/ifdown bo-adm
> ExecStart=/sbin/ifup enp0s3
> ExecStart=/sbin/ifup enp0s10
> ExecStop=/sbin/ifdown bo-adm
> RemainAfterExit=yes
> TimeoutStartSec=5min

-- 
https://bugs.launchpad.net/bugs/1701023

Title:
  (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion

Status in ifupdown package in Ubuntu:
  In Progress
Status in vlan package in Ubuntu:
  In Progress
Status in ifupdown source package in Trusty:
  In Progress
Status in vlan source package in Trusty:
  In Progress
Status in ifupdown source package in Xenial:
  In Progress
Status in vlan source package in Xenial:
  In Progress
Status in ifupdown source package in Artful:
  In Progress
Status in vlan source package in Artful:
  In Progress
Status in ifupdown source package in Bionic:
  In Progress
Status in vlan source package in Bionic:
  In Progress
Status in ifupdown package in Debian:
  Fix Released
Status in vlan package in Debian:
  New

Bug description:
  When upgrading from version 1.9-3ubuntu10.1, a previously working machine can't successfully reboot completely.
  ifup is hanging indefinitely, with this process structure (from "pstree -a 1299"):

  ifup,1299 -a
    └─run-parts,1501 /etc/network/if-pre-up.d
        └─bridge,1502 /etc/network/if-pre-up.d/bridge
            └─bridge,1508 /etc/network/if-pre-up.d/bridge
                └─vlan,1511 /etc/network/if-pre-up.d/vlan
                    └─ifup,1532 eth0

  <begin content of /etc/network/interfaces>
  auto lo
  iface lo inet loopback

  auto eth0
  iface eth0 inet static
      address 192.168.10.65
      netmask 255.255.255.192
      gateway 192.168.10.66

  auto eth0.11
      address 192.168.11.1
      netmask 255.255.255.0

  auto br1134
  iface br1134 inet manual
      bridge_ports eth0.1134
      bridge_stp off
      bridge_fd 0
  <end content of /etc/network/interfaces>

  The underlying interface eth0.1134 is not explicitly defined, but was previously auto-created during "ifup -a" execution. This apparently fails now. Reverting back to the 10.1 version re-establishes old behavior.