We've been doing a lot more testing and debugging and I'd like to share
our findings:

1) Unfortunately it turns out this change does not fix the issue of interfaces 
not coming up correctly for a bond with a (static) network configuration. The 
race condition seems to be removed so at least there are no more hangs between 
bonds and their vlan children. All the interfaces also say they are UP both 
when running ifup and after reboot. However:
- Running "ifup <slavename>" does bring up the bond (and its vlans) in a 
working state.
- Running "ifup -a" or rebooting does not actually work: we get "network not 
available" errors and "Destination Host Unreachable" when pinging other 
machines. Executing "ifdown -a; ifup -a" shows that ifupdown tries to bring up 
the bond BEFORE the slaves instead of the other way around. Even though the 
bond and its slaves say they are UP after the 60s timeout, they don't actually 
function.
- We're not seeing any issues with bonds that do not have a network 
configuration of their own.
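
Since the interfaces report UP without passing traffic, it may help to check
whether the kernel has actually attached the slaves to the bond. The sketch
below reads the bonding driver's sysfs file /sys/class/net/<bond>/bonding/slaves.
The function names and the SYSFS_NET variable are hypothetical helpers, not
part of ifupdown or ifenslave, and this is only a diagnostic aid, not a fix.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; not part of ifupdown or ifenslave.

# Directory where the kernel exposes network devices (overridable for testing).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the slaves the kernel has attached to a bond, one per line.
# Returns non-zero if the bond (or its bonding directory) does not exist.
bond_slaves() {
    slaves_file="$SYSFS_NET/$1/bonding/slaves"
    [ -r "$slaves_file" ] || return 1
    tr ' ' '\n' < "$slaves_file" | sed '/^$/d'
}

# Succeed only if every expected slave is actually enslaved to the bond.
# Usage: bond_has_slaves <bond> <slave>...
bond_has_slaves() {
    bond=$1
    shift
    attached=$(bond_slaves "$bond") || return 1
    for slave in "$@"; do
        # -F: treat the name literally, -x: match the whole line
        printf '%s\n' "$attached" | grep -qxF "$slave" || return 1
    done
}
```

For example, running "bond_has_slaves bo-adm enp0s3 enp0s10" after "ifup -a"
and again after "ifup enp0s3" would show whether the two cases differ in which
slaves end up enslaved.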

2) The networking script stack / concept seems fundamentally flawed in
three areas:

2.A) bonds rely on the slaves having "bond-master" and are started by
bringing up the slaves; they do not support the master having
"bond-slaves" and being started by bringing up the bond directly.

2.B) bringing a specific interface up automatically brings up its child
vlans. This does not make a lot of sense. The other way around does -
e.g. in order to bring up a vlan we need to bring up its raw device -
but why would the ifupdown scripts assume that I want to bring up all of
its vlans when I bring up an interface that (also) serves as a raw
device? In that case I would probably run "ifup -a"!

2.C) a vlan running on top of a bond cannot be brought up directly due to 
/sys/class/net/<bondname>/ not existing. This results in the following:
>  # ifup bo-adm.2
>  Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
>  cat: /sys/class/net/bo-adm/mtu: No such file or directory
>  Device "bo-adm" does not exist.
>  bo-adm does not exist, unable to create bo-adm.2
>  run-parts: /etc/network/if-pre-up.d/vlan exited with return code 1
>  Failed to bring up bo-adm.2.
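
The output above fails at the point where the vlan pre-up script looks for the
raw device under /sys/class/net/. A guard along the following lines can make
that precondition explicit before calling ifup on the vlan. This is a minimal
sketch that assumes the vlan is named <rawdevice>.<id>, as in bo-adm.2; the
function names and the SYSFS_NET variable are hypothetical and not part of
ifupdown or the vlan package.

```shell
#!/bin/sh
# Hypothetical helpers; not part of ifupdown or the vlan package.

# Directory where the kernel exposes network devices (overridable for testing).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the raw device of a vlan named <raw>.<id>, e.g. bo-adm.2 -> bo-adm.
raw_device_of() {
    printf '%s\n' "${1%.*}"
}

# Succeed if the named device currently exists in sysfs.
device_exists() {
    [ -e "$SYSFS_NET/$1" ]
}

# Report whether a vlan's raw device is present; return non-zero if it is not.
check_vlan_raw_device() {
    raw=$(raw_device_of "$1")
    if device_exists "$raw"; then
        echo "$raw exists, $1 can be created"
    else
        echo "$raw does not exist yet, bring up its slaves first" >&2
        return 1
    fi
}
```

With this, "check_vlan_raw_device bo-adm.2" fails as long as the bond has not
been created by bringing up its slaves.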

3) Our new workaround for boot has become this very intrusive systemd service:
> [Unit]
> Wants=network-online.target
> After=network-online.target
> 
> [Install]
> WantedBy=multi-user.target
> 
> [Service]
> Type=oneshot
> ExecStartPre=/sbin/ifdown bo-adm
> ExecStart=/sbin/ifup enp0s3
> ExecStart=/sbin/ifup enp0s10
> ExecStop=/sbin/ifdown bo-adm
> RemainAfterExit=yes
> TimeoutStartSec=5min
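
The same sequence as the unit above (take the bond down, then bring up each
slave) can be tried by hand with a small wrapper like the one below. It is
only a sketch: the function name and the RUN variable are hypothetical, and
setting RUN=echo turns it into a dry run that prints the commands instead of
executing them.

```shell
#!/bin/sh
# Hypothetical wrapper mirroring the unit's ExecStartPre/ExecStart order.

# Set RUN=echo for a dry run; leave it empty to really execute the commands.
RUN="${RUN:-}"

# Usage: restart_bond_via_slaves <bond> <slave>...
restart_bond_via_slaves() {
    bond=$1
    shift
    # Take the bond down first, as ExecStartPre does.
    $RUN /sbin/ifdown "$bond" || return 1
    # Bring up each slave; the slaves are what start the bond.
    for slave in "$@"; do
        $RUN /sbin/ifup "$slave" || return 1
    done
}
```

For our setup that would be "restart_bond_via_slaves bo-adm enp0s3 enp0s10".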

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1701023

Title:
  (on trusty) version 1.9-3ubuntu10.4 regression blocking boot
  completion

Status in ifupdown package in Ubuntu:
  In Progress
Status in vlan package in Ubuntu:
  In Progress
Status in ifupdown source package in Trusty:
  In Progress
Status in vlan source package in Trusty:
  In Progress
Status in ifupdown source package in Xenial:
  In Progress
Status in vlan source package in Xenial:
  In Progress
Status in ifupdown source package in Artful:
  In Progress
Status in vlan source package in Artful:
  In Progress
Status in ifupdown source package in Bionic:
  In Progress
Status in vlan source package in Bionic:
  In Progress
Status in ifupdown package in Debian:
  Fix Released
Status in vlan package in Debian:
  New

Bug description:
  When upgrading from version 1.9-3ubuntu10.1, a previously working
  machine can't successfully reboot completely.

  ifup is hanging indefinitely, with this process structure (from
  "pstree -a 1299"):

  ifup,1299 -a
    └─run-parts,1501 /etc/network/if-pre-up.d
        └─bridge,1502 /etc/network/if-pre-up.d/bridge
            └─bridge,1508 /etc/network/if-pre-up.d/bridge
                └─vlan,1511 /etc/network/if-pre-up.d/vlan
                    └─ifup,1532 eth0

  
  <begin content of /etc/network/interfaces>
  auto lo
  iface lo inet loopback

  auto eth0
  iface eth0 inet static
    address 192.168.10.65
    netmask 255.255.255.192
    gateway 192.168.10.66

  auto eth0.11
    address 192.168.11.1
    netmask 255.255.255.0

  auto br1134
  iface br1134 inet manual
    bridge_ports eth0.1134
    bridge_stp off
    bridge_fd 0
  <end content of /etc/network/interfaces>

  The underlying interface eth0.1134 is not explicitly defined, but was
  previously auto-created during "ifup -a" execution. This apparently
  fails now.

  Reverting back to the 10.1 version re-establishes old behavior.
