On Wed, Oct 01, 2008 at 07:41:56AM -0400, Stephen Clark wrote:
> Hello List,
>
> I am running into a strange problem that points to a resource leak. The
> problem manifests itself after one of our remote systems has been up
> around 100 days.
> The symptom is that it appears no new processes can be spawned. If I try to
> ssh to the unit, I can see the 3-way tcp handshake and then no more traffic.
> Examining log files, like cron, etc show that when this happens no more
> entries are written into the cron log. The unit is acting as a firewall,
> router and vpn appliance these functions continue to work. We have a C
> application that is periodically started out of a shell script that
> reports various information about the system, it stops reporting, while
> vpns, ospf routing, and ipfilter firewalling continue to work and write
> into their logfiles.
>
> My question is how do I monitor the various resources in the system that
> could prevent the spawning of a new process?

Periodically logging "ps -auxw" output to a file would be useful. If
processes are leaking, you should see the list grow steadily over time.
It's possible you have many zombie processes as a result of a parent
which is not reaping its children (calling waitpid(2) or its friends).

Other things that might come in useful are "fstat" and "vmstat -s".

It sounds like your C program relies heavily on system() or execl() and
fork(), which is why it's affected, while the other functions are likely
kernel-level or handled by long-running daemons that never need to spawn
a new process.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
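
For what it's worth, the periodic logging suggested above could be sketched
roughly as the following sh function. This is only an illustration, not a
tested monitoring setup: the function name snapshot, the LOGDIR variable,
its default path, and the log file name are all assumptions made up for the
example. It uses only the commands already mentioned (ps, fstat, vmstat -s)
and skips fstat and vmstat where they are not installed.

```sh
#!/bin/sh
# Hypothetical helper: append one timestamped resource snapshot to a log.
# LOGDIR and the file name snapshot.log are assumptions for this sketch.
LOGDIR="${LOGDIR:-/var/tmp/resource-snapshots}"

snapshot() {
    mkdir -p "$LOGDIR" || return 1
    out="$LOGDIR/snapshot.log"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="

        # Full process list. Watch for the list growing between snapshots
        # and for a 'Z' in the STAT column, which marks a zombie.
        ps -auxw
        echo "--- process list lines: $(ps -ax | wc -l)"

        # Open file descriptors across all processes; skipped if absent.
        if command -v fstat >/dev/null 2>&1; then
            echo "--- fstat lines: $(fstat | wc -l)"
        fi

        # System-wide VM and event counters; skipped if absent.
        if command -v vmstat >/dev/null 2>&1; then
            vmstat -s
        fi
    } >>"$out" 2>&1
    return 0
}
```

Calling snapshot from cron, say once an hour, should be enough to see a
trend over the roughly 100 days before the failure. Once the box can no
longer spawn processes the snapshots will stop too, so the useful data is
whatever was recorded leading up to that point.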