--On 23 June 2005 02:50 -0700, Winston Williams wrote:
> Do any of you have any ideas for what I could try to either test out
> this fork failure theory, or other suggestions for what might be
> causing my problem?

Some possibilities come to mind:

 - too many open files (or login.conf limits set too low)
 - out of memory

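To see which limits a process actually ends up with, something like the
following might help. It is only a minimal sketch using Python's standard
resource module; the choice of limits to print is my own assumption about
what matters here, and it reports the limits of whoever runs it, so run it
as the same user (and login class) as the processes that fail.

```python
#!/usr/bin/env python3
"""Print the soft/hard resource limits of the calling process."""
import resource

# Limits that seem relevant to the fork-failure theory (an assumption,
# not an exhaustive list).
LIMITS = {
    "open files": resource.RLIMIT_NOFILE,
    "processes": resource.RLIMIT_NPROC,
    "data size": resource.RLIMIT_DATA,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the calling process."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in sorted(current_limits().items()):
        print("%-12s soft=%s hard=%s" % (name, fmt(soft), fmt(hard)))
```

If the soft limits look small compared with what the machine normally
runs, that would point towards the login.conf theory.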
Set up some monitoring for this sort of thing. Maybe have cron email you
the output of 'sysctl kern.nfiles' and 'fstat' regularly, so you can see
what happens in the run-up to the system becoming unresponsive (though
obviously, if you can't spawn processes, this will stop working). Also
keep some sessions open running 'top' and 'vmstat 30' or so, which might
help you identify what's happening.
Since you have no access to the system at the time it fails, you need
to examine what happens in the run-up and look for clues.
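
As a rough sketch of the periodic monitoring suggested above, a small
script run from cron could append the command output to a log file, so
the last snapshots are still on disk after a reboot. This is an
illustration only: the function name, the log format and the 60-second
timeout are my own choices, and it writes to a file rather than sending
mail. A failure to start a command is logged too, since that is itself
a clue if fork is what's failing.

```python
#!/usr/bin/env python3
"""Append timestamped output of a few diagnostic commands to a log file."""
import subprocess
import sys
import time

# Commands from the advice above; adjust for your system.
COMMANDS = [
    ["sysctl", "kern.nfiles"],
    ["fstat"],
]


def snapshot(commands, log_path):
    """Run each command and append its output (or the failure) to log_path."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        for cmd in commands:
            log.write("=== %s  %s ===\n" % (stamp, " ".join(cmd)))
            try:
                result = subprocess.run(
                    cmd,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.STDOUT,
                    universal_newlines=True,
                    timeout=60,
                )
                log.write(result.stdout)
                log.write("[exit status %d]\n" % result.returncode)
            except (OSError, subprocess.SubprocessError) as exc:
                # Could not start the command (or it timed out): record why.
                log.write("[could not run: %s]\n" % exc)
            log.flush()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        snapshot(COMMANDS, sys.argv[1])
    else:
        print("usage: pass the log file path as the only argument",
              file=sys.stderr)
```

Run it every few minutes from cron with the log file path as its only
argument, then read the tail of the log once the machine is back up.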