On Fri, Jan 19, 2018 at 03:45:46PM +0100, Tobias Hommel wrote:
>
> I tried to strip down the system configuration and was able to reproduce the
> problem with a minimal configuration:
> * ipsets are not used anymore
> * no firewall markings are used any longer
> * iptables are "completely empty", i.e. all policies set to ACCEPT and there
>   is no rule in any table
> * no additional routing policies (ip rule) except the default ones
> * only main routing table is used
> * using a "minimal" kernel config:
>   * run `make defconfig`
>   * add basic things (ESP, IGB driver, some crypto algorithms)
>   * add options required to boot up the system (TPM crypt, some device mapper
>     options, overlayfs)
>
> I attached the minimal config (minimal.config) and the defconfig for reference
> (minimal.defconfig).
>
> The setup is really simple now, the gateway is forwarding HTTP connections
> between eth1(IPSec tunnels) and eth0 without any firewall, NAT, whatsoever.

Thanks a lot for your debugging effort!

>
> The only thing I can think of are the rather aggressive roadwarrior clients.
> There are 750 roadwarriors that are controlled by a script which starts and
> stops the IPSec connection.

I still can't reproduce it with my tests. This is probably
some race triggered by your aggressive roadwarrior
setup, which I don't have.

> I tried 4.15-rc8 and have the same problem here (see attached
> kernel-4.15-rc8.log). SMP affinity for IRQs has changed in 4.15 and
> something's

There is one patch that could influence this
which is not in v4.15-rc8:

commit 76a4201191814a0061cb5c861fafb9ecaa764846
("xfrm: Fix a race in the xdst pcpu cache.")

It is included in v4.15-rc9. If this does not fix
your problem, I'm out of ideas. In that case I have
to ask you to do a bisection to find the offending commit.
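
For reference, a rough sketch of how the check and the bisection could look, assuming a clone of the mainline kernel tree. The helper name commit_in_rev and the GOOD variable are only illustrative; GOOD stands for the last kernel version known to work on the gateway, which is not named in this mail.

```shell
#!/bin/sh
# Sketch only: run inside a clone of the mainline kernel tree.

# Succeeds (exit 0) if commit $1 is an ancestor of revision $2,
# i.e. the kernel at revision $2 already contains that commit.
# commit_in_rev is a hypothetical helper name, not an existing tool.
commit_in_rev() {
    git merge-base --is-ancestor "$1" "$2"
}

# Check whether the xfrm fix is part of a given tag:
#   commit_in_rev 76a4201191814a0061cb5c861fafb9ecaa764846 v4.15-rc9 \
#       && echo "fix included"
#
# Bisection outline (GOOD is a placeholder, set it to the last known-good tag):
#   git bisect start
#   git bisect bad v4.15-rc8
#   git bisect good "$GOOD"
#   # build, boot and test each kernel git checks out, then report the
#   # result with "git bisect good" or "git bisect bad" until git names
#   # the first bad commit
#   git bisect reset
```

Each bisection step needs a full build and a run with the roadwarrior script, since the problem seems to depend on that load.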