Re: IP fragmentation performance and don't fragment bug when forwarding

Risto Pajula Sun, 02 Dec 2018 13:33:02 -0800

Hello.

You can most likely ignore the "DF Bit, mtu bug when forwarding" case. There aren't actually big IP packets on the wire; instead there is a burst of packets on the wire, which are combined by GRO, and thus dropping them should not happen. Sorry about the invalid bug report.

However, the poor latency from the internal network to the internet still remains, with GRO both enabled and disabled. I will try to study further...



BR.
Risto


On 2.12.2018 14:01, Risto Pajula wrote:

Hello.
I have encountered a weird performance problem in Linux IP fragmentation when using video streaming services behind NAT. I have also studied a possible bug in the DF bit (don't fragment) handling when forwarding IP packets.
First the system setup description:

[host1]-int lan-(eth1)[linux router](eth0)-extlan-[fibre router]-internet

where:
host1: a Netgem N7800 "cable box" for online video streaming services provided by the local telco (can access Netflix, HBO Nordic, "live TV", etc.)
linux router: Linux computer with a dual-core Intel Celeron G1840, currently running Linux kernel 4.20.0-rc2 and openSUSE Leap 15.0
eth1: the Linux router's internal (NAT) interface, 192.168.0.1/24 network, mtu set to 1500, RTL8169sb/8110sb
eth0: the Linux router's internet-facing interface, public ip address, mtu set to 1500, RTL8168evl/8111evl
fibre router: Alcatel Lucent fibre router (I-241G-Q), directly connected to the eth0 of the Linux router.
Now, when using the Netgem N7800 with online video services (Netflix, HBO Nordic, etc.), the Linux router receives very BIG IP packets on eth0, up to ~20kB. This seems to lead to the following problems in the Linux IP stack.
IP fragmentation performance:
When the Linux router receives these large IP packets on eth0, everything works, but they seem to cause a very large latency degradation from the internal network to the internet when the IP fragmentation is performed. The ping latency from the internal network to the internet increases from a stable 15-20 ms up to 700-800 ms, AND so does the ping from the internal network to the Linux router's eth1 (192.168.0.). However, the uplink works perfectly: the ping from the Linux router to the internet stays stable while streaming the online services. It seems that the IP fragmentation is somehow blocking the eth1 reception or transmission for a very long time (which it shouldn't). I'm able to test and debug the issue further, but advice regarding where to look would be appreciated.
DF Bit, mtu bug when forwarding:
I have started to study the above-mentioned problem and have found a possible bug in the DF bit and mtu handling in IP forwarding. The BIG packets received from streaming services all have the "DF bit" set, and the question is whether we should be forwarding them at all, as that would result in them being fragmented. Apparently we currently are... I have traced this down to the function ip_exceeds_mtu() in ip_forward.c, and the following patch seems to fix that.
--- net/ipv4/ip_forward.c.orig  2018-12-02 11:09:32.764320780 +0200
+++ net/ipv4/ip_forward.c       2018-12-02 12:53:25.031232347 +0200
@@ -49,7 +49,7 @@ static bool ip_exceeds_mtu(const struct
                return false;

        /* original fragment exceeds mtu and DF is set */
-       if (unlikely(IPCB(skb)->frag_max_size > mtu))
+        if (unlikely(skb->len > mtu))
                return true;

        if (skb->ignore_df)
This seems to work (in some ways): after the change, IP packets that are too large for the internal network get dropped and we send "ICMP Destination unreachable, The datagram is too big" messages to the originator (as we should?). However, it seems that not all services really like this... Netflix behaves as expected and the ping is stable from the internal network to the internet, but for example HBO Nordic will not work anymore (too little buffering? Retransmissions not working?). So it seems the original issue should also be fixed (and the fragmentation should be allowed?).
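To make the difference between the original and patched check concrete, here is a rough user-space model of the decision. It is only a sketch under assumptions: `struct model_skb`, `make_gro_aggregate()` and `segs_fit_mtu()` are hypothetical stand-ins, not kernel API. The order of the checks follows the diff context above, plus a per-segment test that upstream ip_exceeds_mtu() is believed to perform for GSO/GRO packets; that should be verified against the actual tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff fields the check looks at. */
struct model_skb {
    unsigned int len;           /* total length, like skb->len */
    bool df;                    /* DF bit set in the IP header */
    unsigned int frag_max_size; /* largest original fragment, 0 if none */
    bool ignore_df;             /* like skb->ignore_df */
    unsigned int gso_size;      /* payload bytes per segment, 0 if not GRO */
    unsigned int hdr_len;       /* IP + TCP header bytes per segment */
};

/* Build a DF-marked aggregate of nsegs full-size segments (illustrative). */
static struct model_skb make_gro_aggregate(unsigned int nsegs,
                                           unsigned int mss,
                                           unsigned int hdr_len)
{
    assert(nsegs > 0 && mss > 0);
    struct model_skb skb = {
        .len = hdr_len + nsegs * mss,
        .df = true,
        .frag_max_size = 0,
        .ignore_df = false,
        .gso_size = mss,
        .hdr_len = hdr_len,
    };
    return skb;
}

/* Assumption: an aggregate is fine if every resegmented packet fits. */
static bool segs_fit_mtu(const struct model_skb *skb, unsigned int mtu)
{
    return skb->gso_size != 0 && skb->hdr_len + skb->gso_size <= mtu;
}

/* Model of the unpatched check. */
static bool exceeds_mtu_original(const struct model_skb *skb, unsigned int mtu)
{
    if (skb->len <= mtu)
        return false;
    if (!skb->df)
        return false;
    /* original fragment exceeds mtu and DF is set */
    if (skb->frag_max_size > mtu)
        return true;
    if (skb->ignore_df)
        return false;
    if (segs_fit_mtu(skb, mtu))
        return false;
    return true;
}

/* Model of the check with the patch above applied. */
static bool exceeds_mtu_patched(const struct model_skb *skb, unsigned int mtu)
{
    if (skb->len <= mtu)
        return false;
    if (!skb->df)
        return false;
    /* patched line: at this point len > mtu always holds, so this always
     * fires and the ignore_df / per-segment checks are never reached */
    if (skb->len > mtu)
        return true;
    return false;
}
```

In this model, an aggregate of 14 segments of 1448 payload bytes with 52 header bytes each (20324 bytes in total, roughly the ~20kB seen on eth0) is forwarded by the original check at mtu 1500, because each segment fits, but is rejected by the patched check. That would be consistent with the patched router answering with "datagram is too big" for traffic that never exceeded the mtu on the wire.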
Any advice would be appreciated. Thanks!

PS. Watching TV was not this intensive 20 years ago :)
