When testing upgrading OpenVPN from 2.3 to 2.4, we ran into an issue where every time the same client reconnects the MTU for the tunnel used shrinks, eventually leading to packets being dropped by OpenVPN.
Reproduced the problem with a 2 node setup, one as the server and the other one as a client. Each time the client is restarted the packet sizes shrink until it breaks. Configuration for the server: /etc/openvpn/server.conf port 1194 proto udp dev tun ca /etc/openvpn/easy-rsa/keys/ca.crt cert /etc/openvpn/easy-rsa/keys/server.crt key /etc/openvpn/easy-rsa/keys/server.key dh none server 192.168.1.0 255.255.255.0 topology subnet client-config-dir /etc/openvpn/ccd client-to-client keepalive 10 120 user nobody group nogroup persist-key persist-tun status /var/run/openvpn-status.log log /var/log/openvpn/openvpn.log verb 5 Configuration for the client: /etc/openvpn/client.conf client dev tun proto udp float remote 172.28.128.4 1194 user nobody group nogroup persist-key persist-tun ca /etc/openvpn/easy-rsa/keys/172.28.128.5-ca.crt cert /etc/openvpn/easy-rsa/keys/172.28.128.5.crt key /etc/openvpn/easy-rsa/keys/172.28.128.5.key log /var/log/openvpn/openvpn.log ns-cert-type server verb 5 What happens during the first time the client connects from the perspective of the logs is the following (server log): May 05 10:50:59 test-primary systemd[1]: Started OpenVPN connection to server. May 05 10:50:59 test-primary ovpn-server[17464]: Failed to extract curve from certificate (UNDEF), using secp384r1 instead. May 05 10:50:59 test-primary ovpn-server[17464]: ECDH curve secp384r1 added May 05 10:50:59 test-primary ovpn-server[17464]: TLS-Auth MTU parms [ L:1621 D:1212 EF:38 EB:0 ET:0 EL:3 ] May 05 10:50:59 test-primary ovpn-server[17464]: TUN/TAP device tun0 opened May 05 10:50:59 test-primary ovpn-server[17464]: TUN/TAP TX queue length set to 100 May 05 10:50:59 test-primary ovpn-server[17464]: do_ifconfig, tt->did_ifconfig_ipv6_setup=0 May 05 10:50:59 test-primary ovpn-server[17464]: /sbin/ip link set dev tun0 up mtu 1500 May 05 10:50:59 test-primary ovpn-server[17464]: /sbin/ip addr add dev tun0 192.168.1.1/24 broadcast 192.168.1.255 May 05 10:50:59 test-primary ovpn-server[17464]: Data Channel MTU parms [ L:1621 D:1450 EF:121 EB:406 ET:0 EL:3 ] May 05 10:50:59 test-primary ovpn-server[17464]: Could not determine IPv4/IPv6 protocol. Using AF_INET May 05 10:50:59 test-primary ovpn-server[17464]: Socket Buffers: R=[212992->212992] S=[212992->212992] May 05 10:50:59 test-primary ovpn-server[17464]: UDPv4 link local (bound): [AF_INET][undef]:1194 May 05 10:50:59 test-primary ovpn-server[17464]: UDPv4 link remote: [AF_UNSPEC] May 05 10:50:59 test-primary ovpn-server[17464]: GID set to nogroup May 05 10:50:59 test-primary ovpn-server[17464]: UID set to nobody May 05 10:50:59 test-primary ovpn-server[17464]: MULTI: multi_init called, r=256 v=256 May 05 10:50:59 test-primary ovpn-server[17464]: IFCONFIG POOL: base=192.168.1.2 size=252, ipv6=0 May 05 10:50:59 test-primary ovpn-server[17464]: Initialization Sequence Completed May 05 10:52:27 test-primary ovpn-server[17464]: MULTI: multi_create_instance called May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Re-using SSL/TLS context May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Control Channel MTU parms [ L:1621 D:1212 EF:38 EB:0 ET:0 EL:3 ] May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Data Channel MTU parms [ L:1621 D:1450 EF:121 EB:406 ET:0 EL:3 ] May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Local Options String (VER=V4): 'V4,dev-type tun,link-mtu 1541,tun-mtu 1500,proto UDPv4,cipher BF-CBC,auth SHA1,keysize 128,key-method 2,tls-server' May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Expected Remote Options String (VER=V4): 'V4,dev-type tun,link-mtu 1541,tun-mtu 1500,proto UDPv4,cipher BF-CBC,auth SHA1,keysize 128,key-method 2,tls-client' May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 TLS: Initial packet from [AF_INET]172.28.128.5:1194, sid=59ea927a 09799019 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 VERIFY OK: depth=1, C=US, ST=CA, L=San Francisco, O=GitHub, OU=Enterprise, CN=GitHub Enterprise VPN, name=GitHub Enterprise VPN, emailAddress=enterpr...@github.com May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 VERIFY OK: depth=0, C=US, ST=CA, L=San Francisco, O=GitHub, OU=Enterprise, CN=172.28.128.5, name=GitHub Enterprise VPN, emailAddress=enterpr...@github.com May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_VER=2.4.0 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_PLAT=linux May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_PROTO=2 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_NCP=2 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_LZ4=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_LZ4v2=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_LZO=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_COMP_STUB=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_COMP_STUBv2=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 peer info: IV_TCPNL=1 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 ECDHE-RSA-AES256-GCM-SHA384, 2048 bit RSA May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5:1194 [172.28.128.5] Peer Connection Initiated with [AF_INET]172.28.128.5:1194 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 OPTIONS IMPORT: reading client specific options from: /etc/openvpn/ccd/172.28.128.5 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 MULTI: Learn: 192.168.1.2 -> 172.28.128.5/172.28.128.5:1194 May 05 10:52:27 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 MULTI: primary virtual IP for 172.28.128.5/172.28.128.5:1194: 192.168.1.2 May 05 10:52:28 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 PUSH: Received control message: 'PUSH_REQUEST' May 05 10:52:28 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 SENT CONTROL [172.28.128.5]: 'PUSH_REPLY,route-gateway 192.168.1.1,topology subnet,ping 10,ping-restart 120,ifconfig 192.168.1.2 255.255.255.0,peer-id 0,cipher AES-256-GCM' (status=1) May 05 10:52:28 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel MTU parms [ L:1549 D:1450 EF:49 EB:406 ET:0 EL:3 ] May 05 10:52:28 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel Encrypt: Cipher 'AES-256-GCM' initialized with 256 bit key May 05 10:52:28 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel Decrypt: Cipher 'AES-256-GCM' initialized with 256 bit key May 05 10:52:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 MULTI: bad source address from client [172.28.128.5], packet dropped What can be seen here is that the client initially connects and sets up the TLS control channel. The interesting MTU related bits are the first occurrence of "Data Channel MTU parms [ L:1621 D:1450 EF:121 EB:406 ET:0 EL:3 ]" from the initial setup which comes from https://github.com/OpenVPN/openvpn/blob/7a1b6a0dd706a81897457b0456a951c0b30bbcfb/src/openvpn/init.c#L4055. There's another entry for it with "Data Channel MTU parms [ L:1549 D:1450 EF:49 EB:406 ET:0 EL:3 ]" that comes from https://github.com/OpenVPN/openvpn/blob/f3705dd1e711ee9f8546b841e4b18e9e9a224975/src/openvpn/ssl.c#L1986. So there's an adjustment that happens twice, once for for the client initialization and once for the TLS connection setup. When a client reconnects, the following is shown in the logs: May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 TLS: new session incoming connection from [AF_INET]172.28.128.5:1194 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 VERIFY OK: depth=1, C=US, ST=CA, L=San Francisco, O=GitHub, OU=Enterprise, CN=GitHub Enterprise VPN, name=GitHub Enterprise VPN, emailAddress=enterpr...@github.com May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 VERIFY OK: depth=0, C=US, ST=CA, L=San Francisco, O=GitHub, OU=Enterprise, CN=172.28.128.5, name=GitHub Enterprise VPN, emailAddress=enterpr...@github.com May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_VER=2.4.0 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_PLAT=linux May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_PROTO=2 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_NCP=2 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_LZ4=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_LZ4v2=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_LZO=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_COMP_STUB=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_COMP_STUBv2=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 peer info: IV_TCPNL=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 TLS: move_session: dest=TM_ACTIVE src=TM_UNTRUSTED reinit_src=1 May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 TLS: tls_multi_process: untrusted session promoted to semi-trusted May 05 10:57:29 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 ECDHE-RSA-AES256-GCM-SHA384, 2048 bit RSA May 05 10:57:30 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 PUSH: Received control message: 'PUSH_REQUEST' May 05 10:57:30 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 SENT CONTROL [172.28.128.5]: 'PUSH_REPLY,route-gateway 192.168.1.1,topology subnet,ping 10,ping-restart 120,ifconfig 192.168.1.2 255.255.255.0,peer-id 0,cipher AES-256-GCM' (status=1) May 05 10:57:30 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel MTU parms [ L:1477 D:1450 EF:-23 EB:406 ET:0 EL:3 ] May 05 10:57:30 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel Encrypt: Cipher 'AES-256-GCM' initialized with 256 bit key May 05 10:57:30 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel Decrypt: Cipher 'AES-256-GCM' initialized with 256 bit key What can be seen here is that there's no client initialization since the connection setup is reused. This can also be seen from the "TLS: tls_multi_process: untrusted session promoted to semi-trusted" bit in the logs. This shows that we're reusing the TLS connection. What still happens though is another MTU data channel change in "Data Channel MTU parms [ L:1477 D:1450 EF:-23 EB:406 ET:0 EL:3 ]" happens. This means the channel is shrinking. This happens on each restart of a client: May 05 11:00:06 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel MTU parms [ L:1405 D:1405 EF:-95 EB:406 ET:0 EL:3 ] And after another restart: May 05 11:00:39 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 Data Channel MTU parms [ L:1333 D:1333 EF:-167 EB:406 ET:0 EL:3 ] After a few restarts, OpenVPN packets start getting dropped with the following error message: May 05 11:00:41 test-primary ovpn-server[17464]: 172.28.128.5/172.28.128.5:1194 TCP/UDP packet too large on write to [AF_INET]172.28.128.5:1194 (tried=1419,max=1333) This means not all traffic through the tunnel works anymore. This then leads to strange hard to debug issues where only bigger packets get dropped but smaller ones work. This issue is also reported in https://community.openvpn.net/openvpn/ticket/879. I've attached a patch and also opened a pull request on GitHub at https://github.com/OpenVPN/openvpn/pull/87 with the patch also attached to this email. I have no idea if this is the right approach or if there's another way to know if the connection is reused or not and if the frame calculation should be skipped or not. -- Cheers, Dirkjan
0001-Don-t-resize-the-MTU-windows-on-a-reused-connection.patch
Description: Binary data
------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________ Openvpn-devel mailing list Openvpn-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openvpn-devel