On Thu, May 13, 2021 at 4:12 PM Steve Langasek <steve.langa...@ubuntu.com> wrote:
>
> Hi there,
>
> On Wed, May 12, 2021 at 05:52:07PM +1000, Christopher James Halse Rogers wrote:
> > There's an nfs-utils SRU¹ hanging around waiting for a policy decision on
> > use of the After=network-online.target systemd unit dependency. I'm not an
> > expert here, but it looks like part of my SRU rotation today is starting the
> > discussion on this so we can resolve it one way or another!
>
> > I am not an expert in this area, but as I understand it, the tradeoff here
> > is:
> > 1. Without a dependency on After=network-online.target there is no guarantee
> > that the network interface(s) will be usable at the time the nfs-utils unit
> > triggers, and nfs-utils will fail if the relevant network interface is not
> > usable, or
> > 2. With a dependency on After=network-online.target nfs-utils will reliably
> > start, but if there are any interfaces which are configured but do not come
> > up this will result in the boot hanging until the timeout is hit.
>
> > In mitigation of (2), there are apparently a number of default packages
> > which already have a dependency on After=network-online.target, so boot
> > hanging if interfaces are down is the status quo?
>
> From one of the comments in the bug report, I gathered that systemd upstream
> (who, specifically?) was taking a position that distributions should not use
> After=network-online.target. I think this is entirely unhelpful; the target
> exists for this purpose, it is not required for systemd internally to get
> the system up but exists only for other services to depend on.
>
> There are risks of services not starting on boot because the network-online
> target is not reached.
> However, that is not the same thing as a "hung
> boot", because other services will still start on their own, and things like
> gdm and tty don't depend on network-online.target, *unless* you're in a
> situation where you've introduced a dependency between the filesystem and
> network-online. This is possible when we're talking about nfs, because the
> same system might both export nfs filesystems and mount them from localhost.
> But I'm not sure it should block this specific change.
>
> > The obvious thing to do here would be to follow Debian, but as far as I can
> > tell there is not currently a Debian policy about this - the best I can find
> > is an ancient draft of a best-practises-guide² suggesting that packages
> > SHOULD handle networking dynamically, but if they do not MUST have a
> > dependency on After=network-online.target
>
> > As far as I understand it, handling networking dynamically requires upstream
> > code changes (although maybe fairly simple code changes?).
>
> It does require upstream code changes; not always simple. And it's not
> always *correct* to make upstream code changes instead of simply starting
> the service when the system is "online"; you can find a number of examples
> in Ubuntu of services that it only makes sense to start once your network is
> "up" - e.g. apt-daily.service, update-notifier, whoopsie, ...
>
> There are issues with the network-online target, to be sure. There is not a
> clear definition of the target, and there have definitely been
> implementation bugs in what does/does not block the target. I've had
> discussions with the Foundations Team in the past about this but it has yet
> to result in a specification.
>
> My working definition of what network-online.target SHOULD mean is:
>
> - at least one interface is up, with routes
> - all interfaces which are 'optional: no' (netplan sense) are up
>   - including completion of ipv6 RA and ipv4 link-local if enabled on the
>     interface
> - there is a default route for at least one configured address family
> - attempts to discover default routes for other configured address families
>   have completed (success or fail)
> - DNS is configured
>
> Things that must not block the network-online target:
> - interfaces that are marked 'optional: yes'
> - address sources that are listed in 'optional-addresses' for an interface
> - default route for an address family for which no interfaces have
>   addresses
>
> At least historically, neither networkd nor NetworkManager has fulfilled
> this definition. It would be nice to get there, but the first step is
> having some agreed definition such as the above so that we can treat
> deviations as bugs.
>
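To make the working definition above easier to argue about, here is a minimal sketch of it as a predicate, in Python. Everything in it is an assumption made for illustration: the Interface fields, the network_online() name and the way state is modelled are hypothetical, not an existing netplan, networkd or NetworkManager API. In particular, it only looks at address families on interfaces that are actually up, which is one possible reading of the "must not block" list.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class Interface:
    # Hypothetical model of per-interface state; not a real netplan/networkd type.
    name: str
    optional: bool = False          # netplan 'optional: yes'
    up: bool = False                # link is up, with routes
    autoconf_done: bool = True      # ipv6 RA / ipv4 link-local finished, if enabled
    families: Set[str] = field(default_factory=set)        # families with addresses
    default_routes: Set[str] = field(default_factory=set)  # families with a default route
    discovery_done: Set[str] = field(default_factory=set)  # route discovery finished (ok or failed)


def network_online(interfaces: Iterable[Interface], dns_configured: bool) -> bool:
    ifaces = list(interfaces)
    live = [i for i in ifaces if i.up]

    # At least one interface is up, with routes.
    if not live:
        return False

    # Every non-optional interface is up, including RA / link-local completion.
    # Optional interfaces never block.
    for i in ifaces:
        if not i.optional and not (i.up and i.autoconf_done):
            return False

    # Only families that some live interface has addresses for are considered,
    # so a family with no addresses anywhere cannot block.
    families = set().union(*(i.families for i in live))
    routed = set().union(*(i.default_routes for i in live))
    settled = routed.union(*(i.discovery_done for i in live))

    # A default route exists for at least one configured address family.
    if not (families & routed):
        return False

    # Default-route discovery for the remaining families has completed.
    if not families <= settled:
        return False

    return dns_configured
```

With this model, a down interface marked optional does not change the result, while a down non-optional interface makes the predicate false regardless of the others.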
It would be nice if netplan.io could implement that, either synthetically
(i.e. by generating a service unit on the fly that calls
systemd-networkd-wait-online with extra arguments specifying all the
non-optional interfaces), or by creating a new binary, "netplan-wait-online",
which would be wanted by network-online.target and perform all of the above.

> > It seems unlikely that, whatever we decide, we'll immediately do a full
> > sweep of the archive and fix everything, so it looks like our choice is
> > between:
>
> > 1. The long-term goal is to have no After=network-online.target dependencies
> > in default boot (stretch goal: in main). Whenever we run into a
> > package-fails-if-network-is-not-yet-up bug, we patch the code and submit
> > upstream. Over time we audit existing users of After=network-online.target
> > and patch them for dynamic networking, as time permits.
>
> > 2. We don't expect to be able to reach no After=network-online.target
> > dependencies in the default boot, so it's not a priority to avoid them.
> > Whenever we run into a package-fails-if-network-is-not-yet-up bug, we add an
> > After=network-online.target dependency.
>
> 3. We expect to reach network-online.target in the common case, but accept
> that there are systems for which it will ordinarily not be reached on boot
> (i.e. offline systems). Services which depend on network-online.target
> should be those which it is reasonable to not start if the system is not
> connected to the Internet. This includes systems that are connected to a
> local network, but have no default route.
>

So from my point of view, a short-term fix such as adding
After=network-online.target, or even

    [Unit]
    After=systemd-resolved.service

    [Service]
    ExecStartPre=-/lib/systemd/systemd-networkd-wait-online --any --timeout 30

is fine to be SRUed. However, I still have the same question: what if network
connectivity drops and gets re-established? Should we bounce
network-online.target (i.e. restart it)?
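As a rough illustration of the synthetic approach mentioned above, the sketch below builds a systemd-networkd-wait-online command line that names every non-optional interface, in Python. The function name and the input mapping are hypothetical, and falling back to --any when nothing is marked non-optional is my assumption rather than anything netplan does today. The binary path and the --any and --timeout options are taken from the snippet above; --interface is the tool's existing per-interface option.

```python
from typing import Dict, List

# Path as used in the ExecStartPre= line above.
WAIT_ONLINE = "/lib/systemd/systemd-networkd-wait-online"


def wait_online_command(interfaces: Dict[str, bool], timeout: int = 30) -> List[str]:
    """Build a wait-online invocation.

    `interfaces` maps an interface name to its netplan 'optional' flag
    (hypothetical input shape, chosen for this sketch).
    """
    required = sorted(name for name, optional in interfaces.items() if not optional)
    cmd = [WAIT_ONLINE, f"--timeout={timeout}"]
    if required:
        # Wait for every non-optional interface explicitly.
        cmd += [f"--interface={name}" for name in required]
    else:
        # Assumption: with no required interfaces, any one link being online is enough.
        cmd.append("--any")
    return cmd
```

For example, wait_online_command({"eth0": False, "wlan0": True}) yields a command that waits for eth0 only, which a generated unit could place in its ExecStart= line.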
We can declare that units be restarted when network-online.target is
restarted, if they are otherwise incapable of dynamically detecting network
loss and resumption themselves.

>
> If we use this as the standard, it's easy to see that *in principle*
> nfs-utils shouldn't depend on there being a route to the global Internet.
> It does, however, at least give us a framework for understanding the
> behavior, and for users to modify the behavior if they have different
> requirements.
>
> None of this makes it any safer for an SRU, since at the end of the day if
> users have such a config that is impacted if you set
> After=network-online.target for nfs-utils, it would still be a regression.
>
> --
> Steve Langasek                   Give me a lever long enough and a Free OS
> Debian Developer                   to set it on, and I can move the world.
> Ubuntu Developer                                   https://www.debian.org/
> slanga...@ubuntu.com                                     vor...@debian.org
> --
> ubuntu-devel mailing list
> ubuntu-devel@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel

--
Regards,

Dimitri.