On 10/31/14 09:39, Stuart Henderson wrote:
On 2014-10-30, Federico Giannici <giann...@neomedia.it> wrote:
Hi.
We noticed that on our firewall (OpenBSD 5.5-stable amd64, dmesg
follows) there is a large number of network livelocks
(kern.netlivelocks). We graphed them and noticed that they reach as
many as 2000 per minute! They are related to the amount of traffic, but
not in a linear way.
We'd like to know whether this is expected and "normal", or whether we
should worry about them and find out what's wrong.
The PC is a firewall with a large number of queues and up to 500 Mbps of
traffic.
Thanks.
Here is the dmesg. NMFW is GENERIC (no MP) and the only change is
HZ=1000 (for queue accuracy).
This is expected if you increase HZ without changing how livelock
detection works. It sets a timer on every clock tick; when that timer
fires, it checks how many ticks have elapsed, and if more than one has,
a livelock is detected. This triggers livelock avoidance, which will
slow down your network traffic, so yes, you do want to pay attention to
them. See sys/net/if.c.
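The check described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the code in sys/net/if.c: the struct, the function names and the microsecond delay parameter are all hypothetical, and the model assumes the timer is rearmed for exactly one tick later. It shows why a fixed scheduling delay that is invisible at HZ=100 (10 ms ticks) can count as a livelock at HZ=1000 (1 ms ticks).

```c
#include <stdbool.h>

/* Hypothetical model of the detection state; not the kernel's own struct. */
struct livelock_model {
    int ticks;          /* clock tick counter, advanced HZ times per second */
    int last_seen;      /* value of ticks when the timer last ran */
    unsigned livelocks; /* counter analogous to kern.netlivelocks */
};

/* One clock interrupt: advance the tick counter. */
static void clock_tick(struct livelock_model *m)
{
    m->ticks++;
}

/*
 * Timer callback, assumed to be rearmed to fire one tick later.
 * If more than one tick has elapsed since it last ran, the timer
 * was delayed, and that is counted as a livelock.
 */
static bool timer_fire(struct livelock_model *m)
{
    bool late = (m->ticks - m->last_seen) > 1;

    if (late)
        m->livelocks++;
    m->last_seen = m->ticks;
    return late;
}

/*
 * Simplified arithmetic: a timer due in one tick that is delayed by
 * delay_us microseconds sees 1 + floor(delay_us * hz / 1e6) ticks.
 */
static bool delay_counts_as_livelock(long delay_us, long hz)
{
    long elapsed = 1 + (delay_us * hz) / 1000000L;

    return elapsed > 1;
}
```

In this model a 3 ms delay gives delay_counts_as_livelock(3000, 100) == false but delay_counts_as_livelock(3000, 1000) == true, since 3 ms is less than one 10 ms tick but spans three 1 ms ticks.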
Thank you for your reply.
Unfortunately I don't have enough expertise in kernel programming to
really understand where the problem is.
However, are you saying that the kernel doesn't correctly handle the
case where HZ != 100?
So, is this a kernel bug?
Also note, if you're graphing by calling sysctl(8) that may be locking
the kernel for long enough to trigger livelock detection!
I read the sysctl value only once per five minutes, so this is not a
problem.
Thanks.