Public bug reported:

I have a VPS upgraded to Ubuntu 22.04 with a rather peculiar network
setup.  My /etc/netplan/90-vz-ens3.yaml is

    network:
      ethernets:
        ens3:
          addresses:
          - 80.209.x.y/32
          - 10.209.x.y/8
          - 2a02:7b40:x:y::1/128
          dhcp4: false
          dhcp6: false
          routes:
          - scope: link
            to: 169.254.0.1
            via: 0.0.0.0
          - to: default
            via: 169.254.0.1
          - to: default
            via: fe80::ffff:1:1
      version: 2

After a reboot there's a 50% chance that the VPS will fail to bring up the 
default route and will be unreachable over the network, making me reach for the 
emergency VNC console provided by the VPS provider.
When this happens, journalctl will say

    Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Re-configuring with /run/systemd/network/10-netplan-ens3.network
    Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: DHCPv6 lease lost
    Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Could not set route: Nexthop has invalid gateway. Network is unreachable
    Nov 10 04:31:56 egle.gedmin.as systemd-networkd[652]: ens3: Failed

and the routing table looks like this:

    10.0.0.0/8 dev ens3 proto kernel scope link src 10.209.225.198 
    169.254.0.1 dev ens3 proto static scope link 

When I VNC in, I can run `sudo netplan apply` and _usually_ the network
gets fixed.  More specifically, the default route gets added and `ip r`
reports

    default via 169.254.0.1 dev ens3 
    10.0.0.0/8 dev ens3 proto kernel scope link src 10.209.225.198 
    169.254.0.1 dev ens3 proto static scope link 

Today this happened again and I checked the netplan-generated
/run/systemd/network/10-netplan-ens3.network.  It looked correct to me:

    [Match]
    Name=ens3

    [Network]
    LinkLocalAddressing=ipv6
    Address=80.209.x.y/32
    Address=10.209.x.y/8
    Address=2a02:7b40:x:y::1/128

    [Route]
    Destination=169.254.0.1
    Gateway=0.0.0.0
    Scope=link

    [Route]
    Destination=0.0.0.0/0
    Gateway=169.254.0.1

    [Route]
    Destination=::/0
    Gateway=fe80::ffff:1:1

so this looks like an issue with systemd-networkd.

This bug is very similar to bug 2073869, except that _sometimes_ things
work and the VPS can reboot without losing network.  It might be a race
condition in systemd-networkd that applies the routes in the wrong
order (AFAIU the 169.254.0.1 route has to exist before the default
route via that gateway can be added).

It looks rather similar to upstream bug
https://github.com/systemd/systemd/issues/28358 except there's no
suspend involved.


Background information, probably irrelevant to the bug:

The VPS is a KVM VM.  Its network configuration is confusing; there's
cloud-init that does something (and prints the routes to the journal on
boot), and then I see

    Nov 22 04:31:48 egle.gedmin.as qemu-ga[842]: info: guest-exec called: "prl_nettool set --hostname egle.gedmin.as --ip 00:00:50:xx:xx:xx 80.209.x.y/255.255.255.255 10.209.x.y/255.0.0.0 2a02:7b40:x:y::1/128 --gateway 00:00:50:xx:xx:xx 169.254.0.1 fe80::ffff:1:1  --dns 00:00:50:xx:xx:xx 79.98.25.143 79.98.29.143"

which runs /usr/sbin/prl_nettool (part of the prl-nettoolp package,
installed locally, not from the Ubuntu archive), which runs a bunch of
shell scripts in /usr/lib/parallels-tools/tools/scripts/, which in turn
execute a Python script
/usr/lib/parallels-tools/tools/scripts/netplan-cfg.py that overwrites
/etc/netplan/90-vz-ens3.yaml on every boot and then calls netplan apply.

This makes it a bit more complicated (but not impossible) to apply
workarounds like `on-link: true` suggested in bug 2073869.
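
For reference, a minimal sketch of what that workaround might look like
in the routes section of /etc/netplan/90-vz-ens3.yaml, assuming
`on-link: true` is added only to the IPv4 default route.  I have not
tested this on the affected VPS, and netplan-cfg.py would overwrite the
change on the next boot:

          routes:
          - scope: link
            to: 169.254.0.1
            via: 0.0.0.0
          - to: default
            via: 169.254.0.1
            # assumption: marks the gateway as directly reachable on ens3,
            # so the default route would not depend on the 169.254.0.1
            # route having been installed first
            on-link: true
          - to: default
            via: fe80::ffff:1:1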

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: systemd 249.11-0ubuntu3.12
ProcVersionSignature: Ubuntu 5.15.0-126.136-generic 5.15.167
Uname: Linux 5.15.0-126-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.6
Architecture: amd64
CasperMD5CheckResult: unknown
CloudArchitecture: x86_64
CloudID: nocloud
CloudName: unknown
CloudPlatform: nocloud
CloudSubPlatform: seed-dir (/var/lib/cloud/seed/nocloud-net)
Date: Fri Nov 22 08:34:52 2024
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Lsusb-t: /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
MachineType: Virtuozzo KVM
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-126-generic root=UUID=38220ddd-8acc-476c-a7a6-7eec445d7a28 ro maybe-ubiquity
SourcePackage: systemd
UpgradeStatus: Upgraded to jammy on 2024-09-11 (71 days ago)
dmi.bios.date: 04/01/2014
dmi.bios.release: 0.0
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.11.0-2.vz7.2
dmi.chassis.type: 1
dmi.chassis.vendor: Virtuozzo
dmi.chassis.version: Virtuozzo 7.6.0 PC (i440FX + PIIX, 1996)
dmi.modalias: dmi:bvnSeaBIOS:bvr1.11.0-2.vz7.2:bd04/01/2014:br0.0:svnVirtuozzo:pnKVM:pvrVirtuozzo7.6.0PC(i440FX+PIIX,1996):cvnVirtuozzo:ct1:cvrVirtuozzo7.6.0PC(i440FX+PIIX,1996):sku:
dmi.product.family: Virtuozzo Linux
dmi.product.name: KVM
dmi.product.version: Virtuozzo 7.6.0 PC (i440FX + PIIX, 1996)
dmi.sys.vendor: Virtuozzo

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug jammy uec-images

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2089342

Title:
  systemd-networkd randomly fails to set route because "Nexthop has
  invalid gateway"

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2089342/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
