On Sat, 10 Nov 2007, Ed Mandy wrote:

If kern.ipc.nmbclusters is set to 25600, the system will hard freeze when "vmstat -z" shows the number of clusters reaches 25600. If kern.ipc.nmbclusters is set to 0 (or 102400), the system will hard freeze when "vmstat -z" shows the number of clusters is around 66000. When it freezes, the number of Kbytes allocated to network (as shown by "netstat -m") is roughly 160,000 (160MB).

For a while, we thought that there may be a limit of 65536 mbuf clusters, so we tested building the kernel with MCLSHIFT=12, which makes each mbcluster 4096-bytes. With this configuration, nmbclusters only reached about 33000 before the system froze. The number of Kbytes allocated to network (as shown by "netstat -m") still maxed out at around 160,000.

Now, it seems that we are running into some other memory limitation that occurs when our network allocation gets close to 160MB. We have tried tuning parameters such as KVA_PAGES, vm.kmem_size, vm.kmem_size_max, etc., though we are unsure whether the changes we made there helped in any way.

This is all being done on Celeron 2.8GHz machines with 3+ GB of RAM running FreeBSD 5.3. We are very much tied to this platform at the moment, and upgrading is not a realistic option for us. We would like to tune the systems so that they do not lock up. We can currently work around the problem (by using smaller buffers and such), but it is at the expense of network throughput, which is less than ideal.

Are there any other parameters that would help us to allocate more memory to the kernel networking? What other options should we look into?
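
As a quick sanity check on the figures quoted above, the cluster counts can be converted to memory. This is only a rough sketch: it assumes the default MCLSHIFT of 11 (2048-byte clusters) for the unmodified kernel, and it counts cluster memory only, whereas the "netstat -m" total also includes mbufs and other network allocations. The helper name cluster_kib is made up for illustration.

```python
# Rough arithmetic on the cluster counts reported in the message above.
# Assumption: the stock kernel uses MCLSHIFT=11, i.e. 2048-byte clusters.
KIB = 1024


def cluster_kib(count, cluster_bytes):
    """Memory held by `count` clusters of `cluster_bytes` each, in KiB."""
    return count * cluster_bytes // KIB


default_run = cluster_kib(66000, 1 << 11)  # freeze near 66000 clusters of 2048 bytes
big_run = cluster_kib(33000, 1 << 12)      # freeze near 33000 clusters of 4096 bytes
small_limit = cluster_kib(25600, 1 << 11)  # freeze at nmbclusters=25600

print(default_run, big_run, small_limit)
```

Under those assumptions, the 2048-byte and 4096-byte runs both stop at about the same amount of cluster memory, which is what one would expect if the ceiling is on bytes rather than on the number of clusters. The nmbclusters=25600 case stops at far less memory, so it looks like a separate condition (simply exhausting the configured limit).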

I'd like to diagnose "hard freeze" a little more to understand what's going on. Hopefully this won't be too disruptive for your environment while you're doing it.

First off, can you tell me how you're accessing the system to run diagnostic tools, monitor it, etc? Remember that if you run out of clusters, you may experience network deadlocks that prevent SSH sessions from operating (since there may be no memory for them to operate), so direct console access may be required to effectively monitor the system when in an extreme state of low memory in the network stack. Could you tell me if you are using a serial console or the video console? (Or firewire, I suppose?)

FreeBSD 5.3 was the first release to include an MPSAFE network stack, and there were a number of optionally compiled features that could disable MPSAFE networking, resulting in the Giant lock being held over network operations. Could you tell me what the value of the sysctl debug.mpsafenet is?

When the system appears to hard hang, does it recover if, say, left five minutes? What if you unplug the network cable and leave it five minutes?

Does the numlock key on the console work? If you leave the console logged in and running an application (such as "sleep 100000") and the system hangs, what do you see if you hit Ctrl-T?

If you compile options BREAK_TO_DEBUGGER into the kernel and generate a serial break / hit ctrl-alt-esc, are you able to get into the debugger? If you type in "trace", what do you get? (There is a chapter of the developer's handbook that talks about using the kernel debugger, FYI). With 5.3, we found that using a serial console to get to the debugger was a lot more reliable than the video console -- this is in part because a significant amount of the kernel (especially file systems and the video console) still runs under Giant, so a thread hanging while holding Giant can prevent a console break from getting to the debugger. My advice would be to use a serial console anyway, if possible, when debugging, as it means you can use a second machine to copy and paste DDB output into a file to e-mail out later. After about the third line of a kernel stack trace, copying addresses out by hand becomes pretty painful :-).

Unfortunately, I have to say that my first advice would be to upgrade -- not just because a lot of work has been done relating to network stack performance and stability since 5.3, but also because the debugging tools have gotten a lot better since then. For example, in more recent versions the kernel debugging includes memory monitoring tools, commands to more readily extract debugging information, etc. 5.3 is a solid and functional release, but when it comes to debugging problems of this sort, being on a more recent release means you're more likely to see the problem already fixed, and even if not, it will be easier for us to fix it. I understand that may simply not be possible, but if you have that flexibility, it's good advice.

A general comment on configuration: increasing the maximum memory allocated to the network stack can indeed increase your KVA usage significantly. You might well find that tuning KVA up is required to run with very high memory configurations for the network stack, so your intuitions about tuning that up aren't bad. However, when you run out of KVA, the result is usually a panic (since the kernel basically has to halt), so if you're not seeing a panic then you're probably not yet hitting the limit.

Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net