Hello,

Thanks for the suggestion. Running pmtu.sh with kernel versions 4.19, 4.20 and 
even 5.2.13 made no difference. All tests were successful every time.

However, my external ping tests are still failing with the newer kernels. I also 
ran the script after triggering my problem, to make sure all possible side 
effects were in place.

Please keep in mind that even while the ICMP requests are stalling, other 
connections, e.g. ssh or tracepath, are still going through. I would expect 
all connection types to be affected if this were an MTU problem. Am I wrong?
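
The MTU question can be made concrete with a little arithmetic. The following is
only a minimal sketch, assuming a 1500-byte Ethernet MTU, IPv4 with a 20-byte
header and no options, and that the 3000 bytes are the ICMP payload (none of
these values are stated in this thread). fragment_sizes is a hypothetical
helper name.

```python
def fragment_sizes(icmp_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Return the IP payload size of each fragment for one ICMP echo request.

    Assumed values (not taken from the thread): 1500-byte MTU,
    20-byte IPv4 header without options, 8-byte ICMP echo header.
    """
    total = icmp_payload + icmp_header         # bytes carried above the IP header
    per_fragment = (mtu - ip_header) // 8 * 8  # non-final fragments carry a multiple of 8 bytes
    sizes = []
    while total > per_fragment:
        sizes.append(per_fragment)
        total -= per_fragment
    sizes.append(total)                        # final (or only) fragment
    return sizes


print(fragment_sizes(3000))  # large ping: [1480, 1480, 48]
print(fragment_sizes(56))    # default-size ping payload: [64]
```

Under those assumptions the 3000-byte ping has to be split into three
fragments, while a default-size ping fits into a single packet and TCP traffic
such as ssh is normally segmented to fit the MTU. A fragmentation-related
problem would therefore not necessarily hit every connection type, although
this alone would not explain why a default-size ping also stalls afterwards.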

Any suggestions for more tests to isolate the cause? 

Best regards,
--
Thomas Bartschies
CVK IT Systeme

-----Original Message-----
From: David Ahern [mailto:dsah...@gmail.com] 
Sent: Friday, 13 September 2019 19:13
To: Bartschies, Thomas <thomas.bartsch...@cvk.de>; 'netdev@vger.kernel.org' 
<netdev@vger.kernel.org>
Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation

On 9/13/19 9:59 AM, Bartschies, Thomas wrote:
> Hello all,
> 
> since kernel 4.20 we're observing a strange behaviour when sending big ICMP 
> packets. An example is a packet size of 3000 bytes.
> The packets should be forwarded by a Linux gateway (firewall) that has multiple 
> interfaces and also acts as a VPN gateway.
> 
> Test steps:
> 1. Disabled all iptables rules
> 2. Enabled the VPN IPSec Policies.
> 3. Start a ping with packet size (e.g. 3000 bytes) from a client in 
>    the DMZ passing the machine targeting another LAN machine
> 4. Ping works
> 5. Enable a VPN policy by sending pings from the gateway to a 
>    tunnel target. System tries to create the tunnel
> 6. Ping from 3. immediately stalls. No error messages. Just stops.
> 7. Stop Ping from 3. Start another without packet size parameter. Stalls also.
> 
> Result:
> Connections from the client to other services on the LAN machine still 
> work. Tracepath works. Only ICMP requests do not pass the gateway 
> anymore. tcpdump sees them on the incoming interface, but not on the outgoing 
> LAN interface. ICMP requests to any other target IP address in the LAN still 
> work, until one uses a bigger packet size. Then these alternative connections 
> stall as well.
> 
> Flushing the policy table has no effect. Flushing the conntrack table has no 
> effect. Setting rp_filter to loose (2) has no effect.
> Flushing the route cache has no effect.
> 
> Only a reboot of the gateway restores normal behavior.
> 
> What can be the cause? Is this a networking bug?
> 

Some of these will most likely fail for other reasons, but can you run 
'tools/testing/selftests/net/pmtu.sh'[1] on 4.19 and then 4.20 and compare 
results? Hopefully it will shed some light on the problem and can be used to 
bisect to a commit that caused the regression.


[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/pmtu.sh
