Can you please enable debug logs on systemd-networkd, and attach logs from a time where the bug occurs? E.g.,

$ mkdir -p /etc/systemd/system/systemd-networkd.service.d
$ cat > /etc/systemd/system/systemd-networkd.service.d/debug.conf << EOF
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
EOF
$ reboot
[...]
$ journalctl -u systemd-networkd.service -b > systemd-networkd.service.log

And attach the result? This should give more detail on how/when networkd
is attempting to create routes.

** Changed in: systemd (Ubuntu)
       Status: New => Incomplete

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2089342

Title:
  systemd-networkd randomly fails to set route because "Nexthop has
  invalid gateway"

Status in systemd package in Ubuntu:
  Incomplete

Bug description:
  I have a VPS upgraded to Ubuntu 22.04 with a rather peculiar network
  setup. My /etc/netplan/90-vz-ens3.yaml is

  network:
    ethernets:
      ens3:
        addresses:
        - 80.209.x.y/32
        - 10.209.x.y/8
        - 2a02:7b40:x:y::1/128
        dhcp4: false
        dhcp6: false
        routes:
        - scope: link
          to: 169.254.0.1
          via: 0.0.0.0
        - to: default
          via: 169.254.0.1
        - to: default
          via: fe80::ffff:1:1
    version: 2

  After a reboot there's a 50% chance that the VPS will fail to bring up
  the default route and will be unreachable over the network, making me
  reach for the emergency VNC console provided by the VPS provider.

  When this happens, journalctl will say

  Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Re-configuring with /run/systemd/network/10-netplan-ens3.network
  Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: DHCPv6 lease lost
  Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Could not set route: Nexthop has invalid gateway. Network is unreachable
  Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Failed

  and the routing table looks like this:

  10.0.0.0/8 dev ens3 proto kernel scope link src 10.209.225.198
  169.254.0.1 dev ens3 proto static scope link

  When I VNC in, I can run `sudo netplan apply` and _usually_ the network
  gets fixed. More specifically, the default route gets added and ip r
  reports

  default via 169.254.0.1 dev ens3
  10.0.0.0/8 dev ens3 proto kernel scope link src 10.209.225.198
  169.254.0.1 dev ens3 proto static scope link

  Today this happened again and I checked the netplan-generated
  /run/systemd/network/10-netplan-ens3.network. It looked correct to me:

  [Match]
  Name=ens3

  [Network]
  LinkLocalAddressing=ipv6
  Address=80.209.x.y/32
  Address=10.209.x.y/8
  Address=2a02:7b40:x:y::1/128

  [Route]
  Destination=169.254.0.1
  Gateway=0.0.0.0
  Scope=link

  [Route]
  Destination=0.0.0.0/0
  Gateway=169.254.0.1

  [Route]
  Destination=::/0
  Gateway=fe80::ffff:1:1

  so this looks like an issue with systemd-networkd.

  This bug is very similar to bug 2073869, except that _sometimes_ things
  work and the VPS can reboot without losing network. It might be a race
  condition in systemd-networkd, applying the routes not in the right
  order (AFAIU you need the 169.254.0.1 route to exist for the default
  route to be possible?).

  It looks rather similar to upstream bug
  https://github.com/systemd/systemd/issues/28358 except there's no
  suspend involved.

  Background information, probably irrelevant to the bug:

  The VPS is a KVM VM.
  Its network configuration is confusing; there's cloud-init that does
  something (and prints the routes to the journal on boot), and then I
  see

  Nov 22 04:31:48 egle.gedmin.as qemu-ga[842]: info: guest-exec called: "prl_nettool set --hostname egle.gedmin.as --ip 00:00:50:xx:xx:xx 80.209.x.y/255.255.255.255 10.209.x.y/255.0.0.0 2a02:7b40:x:y::1/128 --gateway 00:00:50:xx:xx:xx 169.254.0.1 fe80::ffff:1:1 --dns 00:00:50:xx:xx:xx 79.98.25.143 79.98.29.143"

  which runs /usr/sbin/prl_nettool (part of the prl-nettoolp package,
  installed locally, not from the Ubuntu archive), which runs a bunch of
  shell in /usr/lib/parallels-tools/tools/scripts/, which in turn execute
  a Python script /usr/lib/parallels-tools/tools/scripts/netplan-cfg.py
  that overwrites /etc/netplan/90-vz-ens3.yaml on every boot and then
  calls netplan apply.

  This makes it a bit more complicated (but not impossible) to apply
  workarounds like `on-link: true` suggested in bug 2073869.

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: systemd 249.11-0ubuntu3.12
  ProcVersionSignature: Ubuntu 5.15.0-126.136-generic 5.15.167
  Uname: Linux 5.15.0-126-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.6
  Architecture: amd64
  CasperMD5CheckResult: unknown
  CloudArchitecture: x86_64
  CloudID: nocloud
  CloudName: unknown
  CloudPlatform: nocloud
  CloudSubPlatform: seed-dir (/var/lib/cloud/seed/nocloud-net)
  Date: Fri Nov 22 08:34:52 2024
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  Lsusb-t: /: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
  MachineType: Virtuozzo KVM
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-126-generic root=UUID=38220ddd-8acc-476c-a7a6-7eec445d7a28 ro maybe-ubiquity
  SourcePackage: systemd
  UpgradeStatus: Upgraded to jammy on 2024-09-11 (71 days ago)
  dmi.bios.date: 04/01/2014
  dmi.bios.release: 0.0
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: 1.11.0-2.vz7.2
  dmi.chassis.type: 1
  dmi.chassis.vendor: Virtuozzo
  dmi.chassis.version: Virtuozzo 7.6.0 PC (i440FX + PIIX, 1996)
  dmi.modalias: dmi:bvnSeaBIOS:bvr1.11.0-2.vz7.2:bd04/01/2014:br0.0:svnVirtuozzo:pnKVM:pvrVirtuozzo7.6.0PC(i440FX+PIIX,1996):cvnVirtuozzo:ct1:cvrVirtuozzo7.6.0PC(i440FX+PIIX,1996):sku:
  dmi.product.family: Virtuozzo Linux
  dmi.product.name: KVM
  dmi.product.version: Virtuozzo 7.6.0 PC (i440FX + PIIX, 1996)
  dmi.sys.vendor: Virtuozzo

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2089342/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp