Forcibly disabling RSS with the deferred IPsec input patch seems to have fixed
the issue.  Given the wide-ranging deleterious effects with RSS on versus a bit
of theoretical maximum IPsec bandwidth lost with it off, we'll take the
bandwidth hit for now. ;)
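
Assuming the RSS in question is the stack-wide kernel option (options RSS)
rather than the NICs' hardware queues -- that's an assumption about the setup,
adjust as needed -- a minimal sketch of building a kernel without it (PFW is
just a placeholder config name, amd64 assumed):

  # /usr/src/sys/amd64/conf/PFW
  include GENERIC
  ident   PFW
  # no "options RSS" line, so the stack falls back to the single-queue path

  cd /usr/src
  make -j8 buildkernel KERNCONF=PFW
  make installkernel KERNCONF=PFW

After the reboot, "sysctl net.inet.rss" reporting an unknown OID is a quick
sanity check that stack-wide RSS really is out of the running kernel.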

Are there any significant concerns with running the deferred IPsec input
patch?  From my analysis of the code, I think the absolute worst case might be
a reordered packet or two, but that's always possible with IPsec over UDP
transport anyway, AFAIK.

----- Original Message -----
> From: "Timothy Pearson" <tpear...@raptorengineeringinc.com>
> To: "freebsd-net" <freebsd-net@FreeBSD.org>
> Sent: Saturday, January 18, 2025 7:24:57 PM
> Subject: Re: FreeBSD 13: IPSec netisr overload causes unrelated packet loss

> Quick update -- tried the deferred IPsec input patch [1], no change.
> 
> A few tunables I forgot to include as well:
> net.route.netisr_maxqlen: 256
> net.isr.numthreads: 32
> net.isr.maxprot: 16
> net.isr.defaultqlimit: 256
> net.isr.maxqlimit: 10240
> net.isr.bindthreads: 1
> net.isr.maxthreads: 32
> net.isr.dispatch: direct
> 
> [1] https://www.mail-archive.com/freebsd-net@freebsd.org/msg64742.html
> 
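Side note for anyone reproducing this: most of the net.isr.* limits above are
boot-time tunables (numthreads and maxprot are just read-only reports), so
they have to be set in /boot/loader.conf rather than /etc/sysctl.conf.  A
minimal sketch using the same values:

  # /boot/loader.conf
  net.isr.maxthreads="32"
  net.isr.bindthreads="1"
  net.isr.defaultqlimit="256"
  net.isr.maxqlimit="10240"
  # net.isr.dispatch is the exception; it can also be changed at runtime
  # with sysctl(8).
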
> ----- Original Message -----
>> From: "Timothy Pearson" <tpear...@raptorengineeringinc.com>
>> To: "freebsd-net" <freebsd-net@FreeBSD.org>
>> Sent: Saturday, January 18, 2025 4:16:29 PM
>> Subject: FreeBSD 13: IPSec netisr overload causes unrelated packet loss
> 
>> Hi all,
>> 
>> I've been pulling my hair out over a rather interesting problem that I've
>> traced into an interaction between IPSec and the rest of the network stack.
>> I'm not sure if this is a bug or if there's a tunable I'm missing
>> somewhere, so here goes...
>> 
>> We have a pf-based multi-CPU firewall running FreeBSD 13.x with multiple
>> subnets directly attached, one per NIC, as well as multiple IPSec tunnels
>> to remote sites alongside a UDP multicast proxy system (this becomes
>> important later).  For the most part the setup works very well; however, we
>> have discovered through extensive trial and error / debugging that we can
>> induce major packet loss on the firewall host itself simply by flooding the
>> system with small IPSec packets (high PPS, low bandwidth).
>> 
>> The aforementioned (custom) multicast UDP proxy is an excellent canary for
>> the problem, as it checks for and reports any dropped packets in the
>> receive data stream.  Normally, there are no dropped packets even with
>> saturated links on any of the local interfaces or when *sending* high
>> packet rates over IPsec.  As soon as high packet rates are *received* over
>> IPsec, the following happens:
>> 
>> 1.) netisr on one core only goes to 100% interrupt load
>> 2.) net.inet.ip.intr_queue_drops starts incrementing rapidly
>> 3.) The multicast receiver, which only receives traffic from one of the
>> *local* interfaces (not any of the IPsec tunnels), begins to see packet
>> loss despite more than adequate buffers in place, with no buffer overflows
>> in the UDP stack / application buffering.  The packets are simply never
>> received by the kernel UDP stack.
>> 4.) Other applications (e.g. NTP) start to see sporadic packet loss as
>> well, again on local traffic, not over IPsec.
>> 
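For anyone trying to reproduce this, the stock observability tools are enough
to watch all of the above happen; nothing here is specific to our setup:

  # one netisr kernel thread pegged on a single CPU shows up here:
  top -SHP
  # per-protocol netisr workstream statistics, including queue drops:
  netstat -Q
  # the counter from item 2 above:
  sysctl net.inet.ip.intr_queue_drops
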
>> As soon as the IPSec receive traffic is lowered enough to get the netisr
>> interrupt load below 100% on the one CPU core, everything recovers and
>> functions normally.  Note this has to be done by lowering the IPSec
>> transmit rate on the remote system; I have not discovered any way to
>> "protect" the receiver from this kind of overload.
>> 
>> While I would expect packet loss in an overloaded IPSec link scenario like
>> this just due to the decryption not keeping up, I would also expect that
>> loss to be confined to the IPSec tunnel.  It should not spider out into the
>> rest of the system and start affecting all of the other applications and
>> routing/firewalling on the box -- this is what made it miserable to debug,
>> as the IPSec link was originally only hitting the PPS limits described
>> above sporadically during overnight batch processing.  Now that I know
>> what's going on, I can provoke it easily with iperf3 in UDP mode.  On the
>> boxes we are using, the limit seems to be around 50 kPPS before we hit 100%
>> netisr CPU load -- this limit is *much* lower with async crypto turned off.
>> 
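For reference, a sketch of the kind of iperf3 run that should reproduce this;
the address is a placeholder and the sizes/rates are only illustrative
(200-byte datagrams at 80 Mbit/s works out to roughly 50 kPPS):

  # server on a host behind the local (affected) firewall:
  iperf3 -s
  # client at the remote site, so the flood arrives over the IPsec tunnel:
  iperf3 -c 192.0.2.10 -u -l 200 -b 80M -t 60
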
>> Important tunables already set:
>> 
>> net.inet.ipsec.async_crypto=1 (turning this off just makes the symptoms
>> appear at lower PPS rates)
>> net.isr.dispatch=direct (deferred or hybrid does nothing to change the
>> symptoms)
>> net.inet.ip.intr_queue_maxlen=4096
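
All three of the above are ordinary runtime sysctls, so for persistence across
reboots they can simply live in /etc/sysctl.conf; a minimal sketch with the
same values:

  # /etc/sysctl.conf
  net.inet.ipsec.async_crypto=1
  net.isr.dispatch=direct
  net.inet.ip.intr_queue_maxlen=4096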
>> 
>> Thoughts are welcome... if there's any way to stop the "spread" of the
>> loss, I'm all ears.  It seems that somehow the IPSec traffic (perhaps by
>> nature of its lengthy decryption process) is able to grab an unfair share
>> of netisr queue 0, and that interferes with the other traffic.  If there
>> were a way to move the IPSec decryption to another netisr queue, that might
>> fix the problem, but I don't see any tunables to do so.
>> 
>> Thanks!
