Xen based VPS / OpenBSD 6.2 / OpenVPN 2.4.4 => Slow download speed after upgrade
================================================================================
Dear OpenBSD Community,

we are operating an OpenVPN server on OpenBSD. A few days ago we upgraded to OpenBSD 6.2 and we are now seeing very slow speeds (<10KB/s) when trying to download via the VPN tunnel from the internet (WAN). We did not have this problem before.

From the documented test cases below (specifically case 2) it does not look like a VPN performance problem (e.g. MTU or encryption performance related). We can also exclude bandwidth throttling by the VPS provider and the ISP.

* Did something essential change in `pf`? [4]
* Or is the problem related to OpenBSD's Xen drivers?

Could someone help us track down the bottleneck?
Any help and hints are very much appreciated.

Thank you kindly

Berry

PS: for a better viewing experience you may compile this email body with `asciidoc`

== Environment

=== Server

* OpenBSD 6.2 / amd64 (-release) + syspatch
* OpenVPN 2.4.4
* On Virtual Private Server / Xen version "4.9.0" by Xen Project [0]
* Detected CPU: Intel(R) Xeon(R) CPU E5-2620
* Detected network device: xnf0
* Firewall configuration: /etc/pf.conf [1]
* System Message Buffer [2]

=== Clients

* OpenBSD 6.2 with OpenVPN 2.4.4
* GNU/Linux Gentoo with OpenVPN 2.4.4
* LineageOS 14.1 with OpenVPN for Android 0.6.73

== Detailed Problem Description / Test Results

Please note: the following documented tests all used one and the same client / network connection:

* GNU/Linux Gentoo with OpenVPN 2.4.4
* Connected to router via wifi on internet connection with max 50Mbit/s download

To rule out problems with the client's local network settings, tests with other client setups on other networks were also performed and showed identical results. For brevity they are not documented here.
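A note on units: curl's progress meter reports bytes per second (with 1024-based `k`/`M` suffixes), while iperf reports bits per second. The following is a minimal sketch for putting the curl averages from the cases below on the same scale as the iperf result. It assumes a POSIX shell with `awk`; the helper name `to_mbit` is made up for this example and is not part of curl or iperf.

```shell
#!/bin/sh
# Hypothetical helper: convert a curl-style average rate (bytes/s, with an
# optional 1024-based "k" or "M" suffix) into Mbit/s (10^6 bits per second).
to_mbit() {
    echo "$1" | awk '{
        v = $1 + 0                          # numeric prefix, e.g. "9309k" -> 9309
        if ($1 ~ /k$/) v *= 1024            # kilobytes -> bytes
        else if ($1 ~ /M$/) v *= 1024 * 1024  # megabytes -> bytes
        printf "%.2f\n", v * 8 / 1000000    # bytes/s -> Mbit/s
    }'
}

to_mbit 9309k   # Case 1 average:  prints 76.26
to_mbit 5102    # Case 3a average: prints 0.04
to_mbit 1113k   # Case 3b average: prints 9.12
```

On that scale the tunnel itself (Case 2, about 15 Mbit/s per iperf) is several hundred times faster than the download routed through it in Case 3a.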
=== Case 1: Server <==> WAN (ok)

* When on the server, downloading a file from WAN
* Scenario: downloaded 100MB file from http://fra36-speedtest-1.tele2.net/ with curl
* Average Download Speed: ~ 9MB/s (about 75Mbit/s; curl reports bytes per second)
* Testresult:

----
$ curl http://fra36-speedtest-1.tele2.net/100MB.zip > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100M  100  100M    0     0  9309k      0  0:00:11  0:00:11 --:--:-- 10.9M
----

=== Case 2: Client <= VPN => Server (ok)

* When on the client, downloading a file from server via VPN tunnel
* Scenario: standard download test with `iperf`
* Average Download Speed: ~ 15Mbit/s
* Testresult:

----
# iperf -s
---
Server listening on TCP port 5001
TCP window size: 16.0 KByte (default)
---
[ 4] local 10.8.0.1 port 5001 connected with 10.8.0.4 port 34998
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.2 sec 18.5 MBytes 15.2 Mbits/sec

# iperf -c 10.8.0.1
---
Client connecting to 10.8.0.1, TCP port 5001
TCP window size: 45.0 KByte (default)
---
[ 3] local 10.8.0.4 port 34998 connected with 10.8.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 18.5 MBytes 15.5 Mbits/sec
----

=== Case 3a: Client <= VPN => Server <==> WAN (broken)

* When on the client, downloading a file from WAN via VPN tunnel
* Scenario: downloaded 100MB file from http://fra36-speedtest-1.tele2.net/ with curl
* Average Download Speed: ~ 5KB/s
* Testresult:

----
curl http://fra36-speedtest-1.tele2.net/100MB.zip > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  100M    0  149k    0     0   5102      0  5:42:32  0:00:30  5:42:02  4933
----

=== Case 3b: Client <==> WAN (ok)

* When on the client, downloading a file from WAN directly
* Scenario: downloaded 100MB file from http://fra36-speedtest-1.tele2.net/ with curl
* Average Download Speed: ~ 1100KB/s
* Testresult:

----
curl http://fra36-speedtest-1.tele2.net/100MB.zip > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100M  100  100M    0     0  1113k      0  0:01:32  0:01:32 --:--:-- 1196k
----

== Previous working system

Before the upgrade to OpenBSD 6.2 we had a working system with the following setup:

* OpenBSD 6.1 / i386
* OpenVPN 2.4.1
* firewall settings were the same [1]

The fact that we had installed i386 instead of amd64 was unintentional.

We had to change the virtual machine (QEMU) network interface from Realtek to Virtio to get good performance on the external network interface. Hence the working system's external interface was operating on `vio`. The following system message buffer still lists the inefficient `re` device.

* System Message Buffer [3]

== Appendix

* [0] https://www.xenproject.org/
* [1] Firewall configuration: /etc/pf.conf

----
ext_if="xnf0"
vpn_if="tun0"
vpn_ip="10.8.0.1"
vpn_sn="10.8.0.0/24"
server="10.8.0.99"
ssh_port="22"
vpn_port="1094"
iperf_port="5001"
server_tcp_ip4_ports="{ 25, 53, 80, 443, 465, 587, 993, 5222, 5269, 9999 }"
server_udp_ip4_ports="{ 53, 5353, 67 }"

# Runtime Options
set block-policy return
set loginterface egress
set skip on lo

#block log all
match in all scrub (no-df max-mss 1440 random-id)

# forwarding from WAN through tunnel to client
pass in quick on $ext_if proto { tcp } from any to ($ext_if) port $server_tcp_ip4_ports rdr-to $server
pass in quick on $ext_if proto { udp } from any to ($ext_if) port $server_udp_ip4_ports rdr-to $server

# route outwards from tunnel
pass out quick on $ext_if from $vpn_sn to any nat-to ($ext_if)

# incoming
pass in quick on $ext_if proto { tcp } from any to ($ext_if) port { $ssh_port $iperf_port } flags S/SA synproxy state
pass in quick on $ext_if proto { udp } from any to ($ext_if) port { $ssh_port $vpn_port $iperf_port }
block drop in quick on $ext_if all

# out to WAN
pass out quick on $ext_if from ($ext_if) to any modulate state
block drop out quick on $ext_if all
----

* [2] System message buffer 6.2:

----
OpenBSD 6.2 (GENERIC) #0: Thu Oct 12 19:16:36 CEST 2017
    r...@syspatch-62-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 2122313728 (2023MB)
avail mem = 2051125248 (1956MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xfc001000 (11 entries)
bios0: vendor Xen version "4.9.0" date 09/10/2017
bios0: Xen HVM domU
acpi0 at bios0: rev 2
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC HPET WAET SSDT SSDT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
ioapic0 at mainbus0: apid 1 pa 0xfec00000, version 11, 48 pins
, remapped to apid 1
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 2100.27 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,FSGSBASE,SMEP,ERMS
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
acpihpet0 at acpi0: 62500000 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
"PNP0F13" at acpi0 not configured
"PNP0700" at acpi0 not configured
"ACPI0007" at acpi0 not configured
pvbus0 at mainbus0: Xen 4.9
xen0 at pvbus0: features 0x2705, 32 grant table frames, event channel 1
xbf0 at xen0 backend 0 channel 5: disk
scsibus1 at xbf0: 2 targets
sd0 at scsibus1 targ 0 lun 0: <Xen, phy xvda 51712, 0000> SCSI3 0/direct fixed
sd0: 51200MB, 512 bytes/sector, 104857600 sectors
xbf1 at xen0 backend 0 channel 6: cdrom
scsibus2 at xbf1: 2 targets
cd0 at scsibus2 targ 0 lun 0: <Xen, qdisk xvdc 5174, 0000> SCSI3 5/cdrom fixed
"vkbd" at xen0: device/vkbd/0 not configured
xnf0 at xen0 backend 0 channel 7: address 00:50:56:34:10:49
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus3 at atapiscsi0: 2 targets
cd1 at scsibus3 targ 0 lun 0: <QEMU, QEMU DVD-ROM, 2.5+> ATAPI 5/cdrom removable
cd1(pciide0:1:0): using PIO mode 4, DMA mode 2
uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 1 int 23
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: SMBus disabled
xspd0 at pci0 dev 2 function 0 "XenSource Platform Device" rev 0x01
vga1 at pci0 dev 3 function 0 "Cirrus Logic CL-GD5446" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
usb0 at uhci0: USB revision 1.0
uhub0 at usb0 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
uhidev0 at uhub0 port 1 configuration 1 interface 0 "QEMU QEMU USB Tablet" rev 2.00/0.00 addr 2
uhidev0: iclass 3/0
ums0 at uhidev0: 3 buttons, Z dir
wsmouse1 at ums0 mux 0
vscsi0 at root
scsibus4 at vscsi0: 256 targets
softraid0 at root
scsibus5 at softraid0: 256 targets
root on sd0a (244889b124e5edd0.a) swap on sd0b dump on sd0b
fd0 at fdc0 drive 1: density unknown
----

* [3] Working system message buffer before upgrade from 6.1 to 6.2

----
OpenBSD 6.1 (GENERIC) #291: Sat Apr 1 13:49:08 MDT 2017
    dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz ("GenuineIntel" 686-class) 2.11 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,NXE,PAGE1GB,LONG,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,LAHF,FSGSBASE,SMEP,ERMS
real mem = 2138583040 (2039MB)
avail mem = 2084909056 (1988MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 06/23/99, BIOS32 rev. 0 @ 0xfd578, SMBIOS rev. 2.4 @ 0xfc001000 (11 entries)
bios0: vendor Xen version "4.9.0" date 09/10/2017
bios0: Xen HVM domU
acpi0 at bios0: rev 2
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC HPET WAET SSDT SSDT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
ioapic0 at mainbus0: apid 1 pa 0xfec00000, version 11, 48 pins
cpu0 at mainbus0: apid 0 (boot processor)
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 100MHz
acpihpet0 at acpi0: 62500000 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
"PNP0F13" at acpi0 not configured
"PNP0303" at acpi0 not configured
"PNP0700" at acpi0 not configured
"PNP0501" at acpi0 not configured
"ACPI0007" at acpi0 not configured
bios0: ROM list: 0xc0000/0x9600 0xc9800/0xe00 0xec000/0x4000!
pvbus0 at mainbus0: Xen 4.9
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <QEMU HARDDISK>
wd0: 16-sector PIO, LBA48, 51200MB, 104857600 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <QEMU, QEMU DVD-ROM, 2.5+> ATAPI 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 1 int 23
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: SMBus disabled
"XenSource Platform Device" rev 0x01 at pci0 dev 2 function 0 not configured
vga1 at pci0 dev 3 function 0 "Cirrus Logic CL-GD5446" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
re0 at pci0 dev 4 function 0 "Realtek 8139" rev 0x20: RTL8139C+ (0x7480), apic 1 int 32, address 00:50:56:34:10:49
rlphy0 at re0 phy 0: RTL internal PHY
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 1: density unknown
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
usb0 at uhci0: USB revision 1.0
uhub0 at usb0 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
nvram: invalid checksum
uhidev0 at uhub0 port 1 configuration 1 interface 0 "QEMU QEMU USB Tablet" rev 2.00/0.00 addr 2
uhidev0: iclass 3/0
ums0 at uhidev0: 3 buttons, Z dir
wsmouse1 at ums0 mux 0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on wd0a (244889b124e5edd0.a) swap on wd0b dump on wd0b
clock: unknown CMOS layout
----

* [4] https://www.openbsd.org/62.html - search for "Generic network stack improvements"