Re: FreeBSD unstable on Dell 1750 using SMP?

Dan Charrois Thu, 24 Nov 2005 15:48:08 -0800

Hi Kris, Rutger, and others that have commented on this thread.

I'm happy to hear that I'm not the only one experiencing problems like this. I posted a similar question a month or so ago about a PowerEdge 2850 using SMP (dual Xeons) and never received any responses that helped solve the problem, or even any indication that others had the same problem. As you know, troubleshooting this is quite difficult, since it can take weeks to go down, and then the "auto-reboot" doesn't result in any clues as to why in the log file - it's just suddenly started again as if someone had pulled the plug on it. I've been pulling my hair out.

My machine crashed twice in the last month or so, within two weeks of each other. Both times, it was just as a cron task was about to schedule the mysqlhotcopy script to back up some SQL databases that are being hosted on that machine, so I thought it may have something to do with that (I had it running as a root cron task, so I figured that maybe some bug in that caused things to go weird - it was running as root, after all). I changed it to run under a less privileged user and the machine hasn't died for about 2 1/2 weeks. But that's hardly a conclusive case of having solved the situation - it's probably planning on surviving just long enough to last until the point I need it the most to work. It sounds as though memory buffer allocations are going wacky or something, in which case anything could take it down given the wrong combination of events.

In any case, we're running the amd64 version of FreeBSD 5.4-RELEASE-p6 (FreeBSD 5.4-RELEASE-p6 #3: Fri Aug 5 18:18:10 MDT 2005).

A netstat -m (which I'd never tried before) yields:

18446744073709551402 mbufs in use
49/25600 mbuf clusters in use (current/max)
0/0/0 sfbufs in use (current/peak/max)
44 KBytes allocated to network
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
884 calls to protocol drain routines

Obviously, the count of mbufs currently in use on that machine is way out to lunch. And interestingly, it looks as though my max mbuf clusters in use of 25600 is identical to the other netstat -m reports from people having this problem.
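
For what it's worth, that figure sits just below 2^64, which would be consistent with a 64-bit unsigned counter that has been decremented past zero. That is only an assumption about how the number is stored, not something confirmed in this thread, and the helper name below is made up for illustration. A quick Python check of the arithmetic:

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one.

    Hypothetical helper for illustration; it assumes the counter is 64 bits wide.
    """
    value &= (1 << 64) - 1          # keep only the low 64 bits
    if value >= (1 << 63):          # top bit set means negative when signed
        return value - (1 << 64)
    return value


reported = 18446744073709551402     # "mbufs in use" from the netstat -m output above
print(as_signed_64(reported))       # prints -214
```

Read that way, the counter would be at -214 rather than at an enormous positive value.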

Another machine (an older single CPU Dell) on which I'm running the 386 version of FreeBSD 5.4-RELEASE-p5 (FreeBSD 5.4-RELEASE-p5 #1: Thu Jul 21 22:30:46 MDT 2005) has a more sane netstat -m:

130 mbufs in use
128/8896 mbuf clusters in use (current/max)
0/177/2480 sfbufs in use (current/peak/max)
288 KBytes allocated to network
0 requests for sfbufs denied
0 requests for sfbufs delayed
208493 requests for I/O initiated by sendfile
26697 calls to protocol drain routines
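
A rough way to flag this kind of output automatically is sketched below in Python. It assumes only the "N mbufs in use" line format shown in the two reports above; the function name and the 2^63 threshold are illustrative choices, not part of any FreeBSD tool.

```python
import re


def suspicious_mbuf_count(netstat_output: str, limit: int = 1 << 63):
    """Return the 'mbufs in use' count if it is implausibly large, else None.

    Hypothetical checker: 'limit' is an arbitrary cutoff chosen for this sketch.
    """
    match = re.search(r"^\s*(\d+) mbufs in use", netstat_output, re.MULTILINE)
    if match is None:
        return None                 # line not present in this output
    count = int(match.group(1))
    return count if count >= limit else None


# Excerpts copied from the two netstat -m reports in this message.
smp_report = (
    "18446744073709551402 mbufs in use\n"
    "49/25600 mbuf clusters in use (current/max)\n"
)
single_cpu_report = (
    "130 mbufs in use\n"
    "128/8896 mbuf clusters in use (current/max)\n"
)

print(suspicious_mbuf_count(smp_report) is not None)         # prints True
print(suspicious_mbuf_count(single_cpu_report) is not None)  # prints False
```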

But here's about where any troubleshooting on my own reaches its limit. I noticed that Kris mentioned it was a known problem in the stats counting for SMP machines and had been fixed, but I haven't been able to find a reference to that fix, or any indication of how to apply it. Was the bug that was fixed just an accounting error in what netstat reports, or is it something that could have taken down the machine as has been happening?

If switching to single CPU mode works, it's good to hear that I have an option if things continue to act up. But I'd really rather not have to "dumb down" the machine to one CPU when there is the potential of two. Most of the time it's not under a huge load, but periodically there are massive spikes, and that's where having two CPUs really helps.

If anyone can shed further light on a fix for this problem, it would be greatly appreciated!

Dan
--
Syzygy Research & Technology
Box 83, Legal, AB  T0G 1L0 Canada
Phone: 780-961-2213

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
