On Wed, Jan 24, 2018 at 10:59:21AM +0100, Steffen Klassert wrote:
> On Fri, Jan 19, 2018 at 03:45:46PM +0100, Tobias Hommel wrote:
> > 
> > I tried to strip down the system configuration and was able to reproduce the
> > problem with a minimal configuration:
> > * ipsets are not used anymore
> > * no firewall markings are used any longer
> > * iptables are "completely empty", i.e. all policies set to ACCEPT and
> >   there is no rule in any table
> > * no additional routing policies (ip rule) except the default ones
> > * only main routing table is used
> > * using a "minimal" kernel config:
> >  * run `make defconfig`
> >  * add basic things (ESP, IGB driver, some crypto algorithms)
> >  * add options required to boot up the system (TPM crypt, some device mapper
> >    options, overlayfs)
> > 
> > I attached the minimal config (minimal.config) and the defconfig for
> > reference (minimal.defconfig).
> > 
> > The setup is really simple now, the gateway is forwarding HTTP connections
> > between eth1(IPSec tunnels) and eth0 without any firewall, NAT, whatsoever.
> 
> Thanks a lot for your debugging effort!
> 
> > 
> > The only thing I can think of are the rather aggressive roadwarrior clients.
> > There are 750 roadwarriors that are controlled by a script which starts and
> > stops the IPSec connection.
> 
> I still can't reproduce it with my tests. This is probably some race
> triggered due to your aggressive roadwarrior setup which I don't have.
> 
> > I tried 4.15-rc8 and have the same problem here (see attached
> > kernel-4.15-rc8.log). SMP affinity for IRQs has changed in 4.15 and something's
> 
> There is one patch that could influence this which is not in v4.15-rc8:
> 
> commit 76a4201191814a0061cb5c861fafb9ecaa764846
> ("xfrm: Fix a race in the xdst pcpu cache.")
> 
> It is included in v4.15-rc9.
I already tested that patch on 4.14 some weeks ago, when it appeared on the
mailing list, without any luck.

> 
> If this does not fix your problem, I'm out of ideas. In this case
> I have to ask to do a bisection to find the offending commit.
> 
I'll do a bisect session then. It will take some time, though, as the hardware
is currently occupied with other tests. I'll keep you up to date on the
results.
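
For reference, the bisection boils down to a binary search over the commit
range. The following is only a rough sketch of that idea, not what git does
internally: first_bad() and is_bad() are hypothetical names, the commit list
is a placeholder, and it assumes a linear history in which every commit after
the offending one also shows the problem.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() reports the problem.

    Assumptions (hypothetical, for illustration only):
      * commits[0] is known good and commits[-1] is known bad,
      * once a commit is bad, all later commits are bad as well.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good index, hi: known bad index
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # problem already present, look earlier
        else:
            lo = mid  # still fine, look later
    return commits[hi]
```

In practice is_bad() would stand for "build that kernel, boot it, run the
roadwarrior load and check the log", which is why each step is slow; the
number of steps grows only with the logarithm of the number of commits.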
