Jeremy Chadwick wrote:
On Wed, Oct 01, 2008 at 07:41:56AM -0400, Stephen Clark wrote:
Hello List,
I am running into a strange problem that points to a resource leak. The
problem manifests itself after one of our remote systems has been up
for around 100 days.
The symptom is that apparently no new processes can be spawned. If I try
to ssh to the unit, I can see the 3-way TCP handshake and then no more
traffic. Log files show the same thing: once this happens, no more
entries are written to the cron log. The unit acts as a firewall, router,
and VPN appliance, and these functions continue to work. We have a C
application, started periodically from a shell script, that reports
various information about the system. It stops reporting, while VPNs,
OSPF routing, and ipfilter firewalling continue to work and write to
their log files.
My question is how do I monitor the various resources in the system that could
prevent the spawning of a new process?
Periodically logging "ps -auxw" output to a file would be useful, as
ideally you'd gradually see the list get longer and longer over time;
it's possible you have many zombie processes as a result of a parent
which is not reaping its children (calling waitpid(2) or its friends).
Other things that might come in useful are "fstat" and "vmstat -s".
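
One way to track the zombie count over time is to tally process states
from ps output. The sketch below is only an illustration, not something
posted in this thread: count_zombies is a hypothetical helper name, and
it assumes the state column begins with Z for zombies, as documented in
ps(1).

```sh
# count_zombies: hypothetical helper. Reads one process state per line
# on stdin and prints how many of them start with Z (zombie).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}
```

It could be fed with something like "ps -ax -o stat= | count_zombies";
a count that keeps climbing between runs would suggest a parent that is
not reaping its children.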
It sounds like your C program relies heavily on system() or execl() and
fork(), which is why it's affected -- while the other programs are
likely kernel-level.
Thanks Jeremy,
I have added those commands to a periodic daily script.
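
For anyone wanting to do the same, the logging could look roughly like
the sketch below. It is an assumption-laden illustration rather than the
script actually in use here: the function name snapshot_resources, the
default directory, and the log file name are all made up for the example.
Only the three commands come from the suggestion above.

```sh
# snapshot_resources: hypothetical helper that appends a timestamped
# dump of "ps -auxw", "fstat", and "vmstat -s" to a log file.
snapshot_resources() {
    logdir=${1:-/var/log/resource-snapshots}   # assumed location
    mkdir -p "$logdir" || return 1
    out="$logdir/snapshot.log"                 # assumed file name

    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        for cmd in "ps -auxw" "fstat" "vmstat -s"; do
            tool=${cmd%% *}
            # Skip tools that are not installed instead of aborting.
            if command -v "$tool" >/dev/null 2>&1; then
                echo "--- $cmd ---"
                $cmd 2>&1 || echo "(exit status $?)"
            else
                echo "--- $cmd: not available ---"
            fi
        done
    } >> "$out"
}
```

Calling "snapshot_resources /some/dir" from the daily script would then
build up one growing file whose sections can be compared day to day.
Running it more often than daily, for example from cron, may give a
better picture of how fast things grow.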
Another thing I have noticed is that quite often the problem seems to
start at 2 a.m., right when the periodic daily script runs. But I think
that is a coincidence: we have already reached the edge of a resource
limit, and all the jobs spawned by the periodic daily scripts push us
over it.
The other thing is that, having logged into some of the systems that have
been up in the 80-day range, I don't see many zombies, if any. I wonder
whether it is an fd leak; the fstat output should point that out.
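
To make an fd leak stand out in the fstat logs, the output can be reduced
to a per-process count. This is a rough sketch under assumptions: the
helper name count_fds_per_process is invented, and it assumes the usual
fstat(1) column order (USER CMD PID FD ...) with no spaces in the command
name. Only rows whose FD column starts with a digit are counted, so the
root, wd, and text entries are ignored.

```sh
# count_fds_per_process: hypothetical helper. Reads fstat-style output
# on stdin and prints "count command pid", highest count first.
count_fds_per_process() {
    awk 'NR > 1 && $4 ~ /^[0-9]/ { n[$2 " " $3]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}
```

Something like "fstat | count_fds_per_process | head" on a freshly booted
unit and again on one that has been up for 80 days should show whether
one process holds far more descriptors than it used to.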
Steve
--
"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety." (Ben Franklin)
"The course of history shows that as a government grows, liberty
decreases." (Thomas Jefferson)
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable