Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in xfrm_lookup

Tobias Hommel Tue, 12 Jun 2018 11:39:13 -0700

On Fri, Jun 08, 2018 at 10:41:37AM +0200, Kristian Evensen wrote:
> Hi,
> 
> On Wed, Jun 6, 2018 at 6:03 PM, Tobias Hommel <netdev-l...@genoetigt.de> 
> wrote:
> > Sorry no progress until now, I currently do not get time to have a deeper 
> > look
> > into that. We're back to 4.1.6 right now.
> 
> Thanks for letting me know. In the project I am currently involved in,
> we unfortunately don't have the option of reverting the kernel, so we
> are finding ways to live with the error. We have been looking into the
> error a bit more, and have made the following observations:
> 
> * First of all, as discussed earlier in the thread, the error is
> triggered by dst_orig being NULL. Our current work-around is just to
> return from xfrm_lookup if dst_orig is NULL and this seems to work
> fine, the error doesn't happen that often (in our use-cases at least).
> * The machine we use for testing (and where we first saw the error) is
> used as initiator.
The machine where I encountered the bug is a "roadwarrior gateway", so it only
serves as a responder.


> * When we compare the logs from Strongswan with the ones from the
> kernel, it seems that the error is typically triggered when a tunnels
> is teared down/about to come up. We need quite a lot of tunnels for
> the error to trigger, usually around 30+. I guess this might point to
> some race or some condition not being met when packets are
> sent/received.
> * We see the error much more frequently when hardware encryption is enabled.
> * Yesterday, we upgraded the kernel from 4.14.34 to 4.14.48, and the
> error happens much less frequently. I see that 4.14.48 includes
> several IPsec fixes (for example the previously mentioned ("xfrm: Fix
> a race in the xdst pcpu cache.")).
> 
> BR,
> Kristian

Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in xfrm_lookup

Reply via email to