Public bug reported:

On an NVIDIA DGX-2 system, we configured a Linux bridge (br0) on the host's physical NIC interface with a static IP address (see the netplan file below). We are using Ubuntu 18.04.2-based BaseOS and guest images.
- All KVM guests are launched with a virtual network interface attached to br0. All VMs obtain an IP address via DHCP, and the network interface works fine for a few hours (possibly up to 24 hours).
- After that, the VMs lose their IP address, and the VM's syslog shows the message "Feb 26 17:16:41 test-1g0 systemd-networkd[3479]: enp6s0: DHCP lease lost".
- At this point, we tried to create new VMs using br0, and none of them obtained an IP address.
- We then checked the KVM host and the status of the bridge but did not see any errors. We tried to unconfigure br0 by removing the bridge configuration from the host netplan and running "sudo netplan apply", but br0 is still there. The bridge appears to be in a weird state, and its driver cannot be unloaded.

Guest

lab@dgx-server-vm:~$ ssh nvidia@192.168.123.138
The authenticity of host '192.168.123.138 (192.168.123.138)' can't be established.
ECDSA key fingerprint is SHA256:k8XpnGH7yle76z46CX16pflYVeYcKoG6kWCymIkv0kk.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.123.138' (ECDSA) to the list of known hosts.
nvidia@192.168.123.138's password:

[ASCII-art login banner]

Welcome to Ubuntu 18.04.2 LTS (4.15.0-45-generic)
Welcome to NVIDIA DGX KVM VM Server Version 4.0.5 (GNU/Linux 4.15.0-45-generic x86_64)

 * Documentation: https://help.ubuntu.com
 * Management: https://landscape.canonical.com
 * Support: https://ubuntu.com/advantage

System information as of: Wed Feb 27 12:20:21 PST 2019

  System load:  0.00                  IP Address:
  Memory usage: 0.0% (59.36G avail)   System uptime: 21:04 hours
  Usage on /:   8% (44G free)         Swap usage:    0.0%
  Local Users:  1                     Processes:     158

System information as of Wed Feb 27 12:20:22 PST 2019

  System load:  0.0               Processes:              155
  Usage of /:   6.7% of 48.96GB   Users logged in:        1
  Memory usage: 0%                IP address for enp1s0:  192.168.123.138
  Swap usage:   0%                IP address for docker0: 172.17.0.1

 * Canonical Livepatch is available for installation.
   - Reduce system reboots and improve kernel security. Activate at:
     https://ubuntu.com/livepatch

15 packages can be updated.
9 updates are security updates.
Last login: Wed Feb 27 12:05:09 2019
nvidia@test-1g0:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
        ether 02:42:5c:b9:6f:94 txqueuelen 0 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.123.138 netmask 255.255.255.0 broadcast 192.168.123.255
        inet6 fe80::5054:ff:feb9:b8a1 prefixlen 64 scopeid 0x20<link>
        ether 52:54:00:b9:b8:a1 txqueuelen 1000 (Ethernet)
        RX packets 38879 bytes 2449778 (2.4 MB)
        RX errors 0 dropped 1 overruns 0 frame 0
        TX packets 977 bytes 132770 (132.7 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp6s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::5055:ff:fe78:faa9 prefixlen 64 scopeid 0x20<link>
        ether 52:55:00:78:fa:a9 txqueuelen 1000 (Ethernet)
        RX packets 93842 bytes 7637062 (7.6 MB)
        RX errors 0 dropped 27 overruns 0 frame 0
        TX packets 1874 bytes 442869 (442.8 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 562 bytes 52271 (52.2 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 562 bytes 52271 (52.2 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

nvidia@test-1g0:~$ uptime
 12:20:35 up 21:04, 2 users, load average: 0.00, 0.00, 0.00
nvidia@test-1g0:~$ date
Wed Feb 27 12:20:44 PST 2019
nvidia@test-1g0:~$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.123.0   0.0.0.0         255.255.255.0   U     0      0        0 enp1s0
nvidia@test-1g0:~$ dmesg | grep -i DHCP
nvidia@test-1g0:~$ cat /var/log/syslog | grep -i dhcp
Feb 26 15:15:21 test-1g0 systemd-networkd[569]: enp1s0: DHCPv4 address 192.168.123.138/24 via 192.168.123.1
Feb 26 15:16:20 test-1g0 systemd-networkd[538]: enp1s0: DHCPv4 address 192.168.123.138/24 via 192.168.123.1
Feb 26 15:16:20 test-1g0 systemd-networkd[538]: enp6s0: DHCPv4 address 172.18.232.32/25 via 172.18.232.1
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp1s0: DHCPv4 address 192.168.123.138/24 via 192.168.123.1
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp6s0: DHCPv4 address 172.18.232.32/25 via 172.18.232.1
Feb 26 17:16:41 test-1g0 systemd-networkd[3479]: enp6s0: DHCP lease lost
nvidia@test-1g0:~$ sudo networkctl status enp6s0
[sudo] password for nvidia:
● 3: enp6s0
       Link File: /lib/systemd/network/99-default.link
    Network File: /run/systemd/network/10-netplan-virtionetworks.network
            Type: ether
           State: degraded (configured)
            Path: pci-0000:06:00.0
          Driver: virtio_net
          Vendor: Red Hat, Inc.
           Model: Virtio network device
      HW Address: 52:55:00:78:fa:a9
         Address: fe80::5055:ff:fe78:faa9
nvidia@test-1g0:~$ systemctl status systemd-networkd.service
● systemd-networkd.service - Network Service
   Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled-runtime; vendor preset: enabled)
   Active: active (running) since Tue 2019-02-26 15:16:42 PST; 21h ago
     Docs: man:systemd-networkd.service(8)
 Main PID: 3479 (systemd-network)
   Status: "Processing requests..."
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/systemd-networkd.service
           └─3479 /lib/systemd/systemd-networkd

Feb 26 15:16:42 test-1g0 systemd[1]: Started Network Service.
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: lo: Link is not managed by us
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp1s0: Link is not managed by us
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: docker0: Link is not managed by us
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: lo: Link is not managed by us
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: docker0: Link is not managed by us
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp1s0: DHCPv4 address 192.168.123.138/24 via 192.168.123.1
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp6s0: DHCPv4 address 172.18.232.32/25 via 172.18.232.1
Feb 26 15:16:42 test-1g0 systemd-networkd[3479]: enp6s0: Configured
Feb 26 17:16:41 test-1g0 systemd-networkd[3479]: enp6s0: DHCP lease lost

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1817998

Title:
  KVM Guest - DHCP lease lost (Ubuntu 18.04)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1817998/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs