Hello Gert,
On Wed, 11 Aug 2021 at 18:31, Gert Doering <g...@greenie.muc.de> wrote: > > Hi Lukas, > > small world :-) Yes, indeed :P > On Wed, Aug 11, 2021 at 04:06:38PM +0200, Lukas Tribus wrote: > > in an environment with 2 openvpn instances (one TCP and one UDP), > > previous unattended-upgrades of the openvpn package in Ubuntu 18.04 > > caused outages, because the UDP based instance was unable to add > > routes to the kernel: > > > > /sbin/ip route add 192.168.20.0/24 via 10.20.30.2 > > ERROR: Linux route add command failed: external program exited with > > error status: 2 > > /sbin/ip route add 10.20.30.0/24 via 10.20.30.2 > > ERROR: Linux route add command failed: external program exited with > > error status: 2 > > As Antonio says, it would be good to have a bit more log output > (with --verb 3 or 4) here. > > What does "ip route" show before/after starting the UDP instance? It's not yet reproducible, it happened only twice in 6 months. UDP instance covers a handful of remote clients, each with their own private /24 subnet, and the TCP instance covers a remote client, with another private /24 subnet. This happened during an unattended-upgrade from openvpn 2.4.4-2ubuntu1.5 to 2.4.4-2ubuntu1.6 on Ubuntu 18.04: TCP instance, old PID 947, new PID 8162->8167 UDP instance, old PID 1024, new PID 8187->8190 (I guess openvpn forks once at startup) root@prodhost-1:/var/log# grep openvpn syslog | grep "Aug 5 06:22" | grep -v Learn Aug 5 06:22:38 prodhost-1 unattended-upgrade: Packages that will be upgraded: linux-headers-generic linux-headers-virtual linux-image-extra-virtual linux-image-generic linux-image-virtual linux-libc-dev linux-virtual openvpn Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[947]: /sbin/ip route del 192.168.22.0/24 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[947]: openvpn_execve: unable to fork: Resource temporarily unavailable (errno=11) Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[947]: Exiting due to fatal error Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8162]: OpenVPN 2.4.4 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Jul 19 2021 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8162]: library versions: OpenSSL 1.1.1 11 Sep 2018, LZO 2.08 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: Diffie-Hellman initialized with 2048 bit key Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: ROUTE_GATEWAY 192.168.232.1 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: TUN/TAP device tun0 opened Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: TUN/TAP TX queue length set to 100 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: do_ifconfig, tt->did_ifconfig_ipv6_setup=0 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip link set dev tun0 up mtu 1500 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[1024]: event_wait : Interrupted system call (code=4) Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[1024]: /sbin/ip route del 192.168.20.0/24 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[1024]: openvpn_execve: unable to fork: Resource temporarily unavailable (errno=11) Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[1024]: Exiting due to fatal error Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip addr add dev tun0 local 10.20.60.1 peer 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.22.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.23.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.24.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.25.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.26.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8187]: OpenVPN 2.4.4 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Jul 19 2021 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8187]: library versions: OpenSSL 1.1.1 11 Sep 2018, LZO 2.08 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: Diffie-Hellman initialized with 2048 bit key Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: Deprecated TLS cipher name 'AES256-SHA', please use IANA name 'TLS-RSA-WITH-AES-256-CBC-SHA' Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.27.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: ROUTE_GATEWAY 192.168.232.1 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: TUN/TAP device tun2 opened Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: TUN/TAP TX queue length set to 100 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: do_ifconfig, tt->did_ifconfig_ipv6_setup=0 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: /sbin/ip link set dev tun2 up mtu 1500 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: /sbin/ip addr add dev tun2 local 10.20.30.1 peer 10.20.30.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.28.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.29.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: /sbin/ip route add 192.168.20.0/24 via 10.20.30.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: ERROR: Linux route add command failed: external program exited with error status: 2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: /sbin/ip route add 10.20.30.0/24 via 10.20.30.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: ERROR: Linux route add command failed: external program exited with error status: 2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 10.34.194.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: Could not determine IPv4/IPv6 protocol. Using AF_INET Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: Socket Buffers: R=[212992->212992] S=[212992->212992] Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: UDPv4 link local (bound): [AF_INET][undef]:1194 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: UDPv4 link remote: [AF_UNSPEC] Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: GID set to nogroup Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: UID set to nobody Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: MULTI: multi_init called, r=256 v=256 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: IFCONFIG POOL: base=10.20.30.4 size=62, ipv6=0 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: IFCONFIG POOL LIST Aug 5 06:22:47 prodhost-1 ovpn-openvpn_ubiquiti[8190]: Initialization Sequence Completed Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 192.168.51.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: /sbin/ip route add 10.20.60.0/24 via 10.20.60.2 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: Could not determine IPv4/IPv6 protocol. Using AF_INET Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: Socket Buffers: R=[131072->131072] S=[16384->16384] Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: Listening for incoming TCP connection on [AF_INET][undef]:1195 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: TCPv4_SERVER link local (bound): [AF_INET][undef]:1195 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: TCPv4_SERVER link remote: [AF_UNSPEC] Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: GID set to nogroup Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: UID set to nobody Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: MULTI: multi_init called, r=256 v=256 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: IFCONFIG POOL: base=10.20.60.4 size=62, ipv6=0 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: IFCONFIG POOL LIST Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: MULTI: TCP INIT maxclients=1024 maxevents=1028 Aug 5 06:22:47 prodhost-1 ovpn-openvpn_mikrotik[8167]: Initialization Sequence Completed As you can see, both the UDP instance as well as the TCP instance try to insert routes at the same moment, the TCP instance is slightly younger (a few ms maybe), and the route insertion fails. I think it would be best to run all the ip commands through "strace -ttAo instancename.log" to have an idea about the syscalls between iproute2 and kernel. Would this be possible with the "--iproute cmd" configuration, how would that configuration look like exactly? I mean how can I refer to subnets and gateways within the --iproute cmd config? > > iproute2 return code 2 indicates a kernel error. > > > > The openvpn instances downgrade privileges to nogroup/nobody, so the > > removal of the routes also fail, but when removing the tun interface, > > the routes will vanish anyway. > > Permission errors should be correctly logged as such, and downgrading > should only happen after ifconfig/route setup. Yes I can confirm. This causes some expected errors in the log, but I don't see any reason why this would have anything to do with the issue. > > I'm wondering if somebody has seen issues like this. Of course the > > error comes from the kernel, this could be some race condition due to > > two processes inserting routes at the same moment or something, but I > > have to find a way to do this reliably. > > I've never heard about race conditions of this sort. > > Are you trying to insert the *same* /24? Or does each instance have > their own subnets? The 2 instances use distinct subnet's. > > I'm also wondering about error handling. Failing to add routes means > > we have a non-working openvpn instance, but the ip route return code > > is only logged; it does not trigger a fatal exit of openvpn itself. So > > the parent process/process supervisor cannot possibly be aware of any > > problems. > > This is intentional, to some extent, to gracefully handle client side > issues ("the server pushes 192.168.1.0/24, but this is used as client > side LAN, so cannot be installed"). > > On the server side, maybe we should fail hard in this case... but it's > the same code. And the code is 15+ years old, so nobody around today > really knows for sure why certain design decisions have been made back > then... > > (I inherited that mess, I just wanted to add IPv6 support :-) ) Ah, it's all clear to me now ;) thanks, lukas _______________________________________________ Openvpn-users mailing list Openvpn-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openvpn-users