Hi Rafał,
On 19/04/04 (Thu) 21:57:15, Rafał Miłecki wrote:
Hello,
I'd like to report a regression that goes back to 2015. I know it's damn
late, but the good thing is that the regression is still easy to reproduce,
verify & revert.
Long story short, starting with commit 66e5133f19e9 ("vlan: Add GRO support
for non hardware accelerated vlan"), which first hit kernel 4.2, NAT
performance of my router dropped by 30% - 40%.
My hardware is a BCM47094 SoC (dual core ARM) with an integrated network
controller and an external BCM53012 switch.
Relevant setup:
* SoC network controller is wired to the hardware switch
* Switch passes 802.1q frames with VID 1 to four LAN ports
* Switch passes 802.1q frames with VID 2 to WAN port
* Linux does NAT for LAN (eth0.1) to WAN (eth0.2)
* Linux uses pfifo and "echo 2 > rps_cpus"
* Ryzen 5 PRO 2500U (x86_64) laptop connected to a LAN port
* Intel i7-2670QM laptop connected to a WAN port
* Speed of LAN to WAN measured using iperf & TCP over 10 minutes
1) 5.1.0-rc3
[ 6] 0.0-600.0 sec 39.9 GBytes 572 Mbits/sec
2) 5.1.0-rc3 + rtcache patch
[ 6] 0.0-600.0 sec 40.0 GBytes 572 Mbits/sec
3) 5.1.0-rc3 + disable GRO support
[ 6] 0.0-300.4 sec 27.5 GBytes 786 Mbits/sec
4) 5.1.0-rc3 + rtcache patch + disable GRO support
[ 6] 0.0-600.0 sec 65.6 GBytes 939 Mbits/sec
Did you test this by disabling GRO with ethtool -K?
Or is this the result with your revert patch?
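For reference, a minimal sketch of the runtime toggle I have in mind. It
assumes the parent device is eth0 (taken from the eth0.1 / eth0.2 setup
above); the gro_off helper and the DRY variable are made up for this
illustration. By default it only prints the commands. Note that this turns
GRO off for the whole device, which is broader than skipping the VLAN
offload registration, so the numbers may not match the patch exactly.

```shell
#!/bin/sh
# gro_off DEV: disable GRO on DEV, then list its offload settings.
# Dry run by default: commands are printed, not executed.
# Run with DRY= (empty) as root to actually execute them.
gro_off() {
    dev="${1:-eth0}"                        # assumption: parent device is eth0
    ${DRY-echo} ethtool -K "$dev" gro off   # -K changes offload settings
    ${DRY-echo} ethtool -k "$dev"           # -k shows current offload settings
}

gro_off eth0
```

After running it for real, the "generic-receive-offload" line in the
ethtool -k output should read "off".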
It's late at night in Japan, so I think I will try to reproduce it tomorrow.
Thanks.
5) 4.1.15 + rtcache patch
934 Mb/s
6) 4.3.4 + rtcache patch
565 Mb/s
As you can see, I can achieve a big performance gain by disabling/reverting
GRO support. Getting up to 65% faster NAT makes a huge difference and ideally
I'd like to get that with upstream Linux code.
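To make the percentages concrete, here is a small sketch that recomputes
them from the throughput figures quoted above. The pct_change helper is
made up for this illustration; the inputs are the Mbits/sec values from
results 2), 4), 5) and 6).

```shell
#!/bin/sh
# pct_change OLD NEW: relative change from OLD to NEW, in percent.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 572 939   # 5.1.0-rc3 + rtcache: GRO on -> GRO off, prints 64.2
pct_change 565 934   # 4.3.4 -> 4.1.15 (both + rtcache), prints 65.3
pct_change 934 565   # 4.1.15 -> 4.3.4 (both + rtcache), prints -39.5
```

The first two lines correspond to the "up to 65% faster" figure, the last
one to the upper end of the 30% - 40% drop.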
Could someone help me and check the reported commit/code, please? Is there
any other info I can provide or anything I can test for you?
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -545,6 +545,8 @@ static int __init vlan_offload_init(void)
 {
 	unsigned int i;
+	return -ENOTSUPP;
+
 	for (i = 0; i < ARRAY_SIZE(vlan_packet_offloads); i++)
 		dev_add_offload(&vlan_packet_offloads[i]);