On Thu, 23 Jun 2005 02:50:06 -0700, Winston Williams wrote:

>This is a continuation of my 'sshd suddenly not responding' message from
>Tuesday.
>
>I still haven't resolved the problems on this machine.  I had to have
>someone at the data center reboot the machine so that I could get back
>in over ssh.  After they rebooted the machine, I was able to work for
>about 20 minutes before the ssh session (and sshd) died again.  I
>put /sbin/reboot in the crontab and tested it, and the machine rebooted.
>I left that in the crontab to run hourly, and I also put in another
>entry to kill and restart sshd every 30 minutes.  I also let that run
>and it worked.  I stopped qmail and I disabled pf but I left apache
>running.  
>
>After that 20 minutes or so, my ssh session died unexpectedly, and when
>I went to reconnect, the socket opens on that port but then it just sits
>forever.  It never shows the OpenSSH banner and nothing further happens.
>Apache is still running and working fine.  Here is where it gets really
>strange... The crontab for reboot does not run now, and neither does the
>crontab to restart ssh.  I know it is not rebooting because I run hping
>and it never has an interruption.  I now suspect that the machine is
>unable to fork new processes.
>
>Here are the results of some tests that I have run:
>
>1-When I connect via SSH, the socket connects but then just sits before
>any data is sent.  I suspect that the main process listens and accepts
>the connection, but then tries to fork a new process and fails.
>
>2-named is still running and seems to be working fine
>
>3-Nothing on cron seems to run at this point.  I tested the entries in
>cron by letting them run while the system was operating normally, and
>they did work when the system was operating normally, like after a fresh
>reboot for that 20 minute or so window.  After that, the reboot never
>happens and I don't think it is killing and restarting sshd either
>
>4-Apache can still do its thing.  I am assuming this is because it
>automatically starts a number of processes right away.  It has enough
>processes already running so that it does not need to fork when a new
>connection comes in.
>
>5-One other interesting thing to note is that /var/log/authlog was
>around 21,000 lines when I checked it.  The OS install is only about 5
>days old.  I moved ssh to a non-standard port to try to help reduce the
>random break-in attempts.
>
>I would really like to use OpenBSD on this machine.  If I can't figure
>it out in the next day or two, I will have to switch to another
>operating system.
>
>Do any of you have any ideas for what I could try to either test out
>this fork failure theory, or other suggestions for what might be causing
>my problem?  
>
>-- 
>Winston Williams <[EMAIL PROTECTED]>
>
>

It is not the operating system. I cannot reproduce your problem and I
have many machines running in the field on various old and new hardware
with 3.5, 3.6 and 3.7, and a labrat here that usually runs current (for
some definition of current) for days at a time.

sshd is as reliable as can be, i.e. zero problems.

Look elsewhere.

Threatening the OS with replacement is not likely to shock it into
behaving any differently.
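
If you want to test the fork theory rather than guess at it, something
like the following could be started right after a reboot and left
running. This is only a rough sketch, not a diagnosis: it assumes
Python is installed on the box, and the names try_fork and probe are
made up for illustration. It attempts one fork() per interval and
appends the result to a log file, so after the next hang the log shows
whether fork() started failing and with which errno.

```python
import errno
import os
import time


def try_fork():
    """Attempt a single fork(); return (ok, detail)."""
    try:
        pid = os.fork()
    except OSError as e:
        # EAGAIN usually points at a process limit, ENOMEM at memory.
        return False, errno.errorcode.get(e.errno, str(e.errno))
    if pid == 0:
        # Child: exit at once without running any cleanup handlers.
        os._exit(0)
    # Parent: reap the child so no zombie is left behind.
    os.waitpid(pid, 0)
    return True, "ok"


def probe(logfile, interval=60, rounds=None):
    """Log one fork attempt per interval; rounds=None means run forever."""
    n = 0
    while rounds is None or n < rounds:
        ok, detail = try_fork()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as f:
            f.write("%s fork=%s %s\n" % (stamp, ok, detail))
        n += 1
        if rounds is None or n < rounds:
            time.sleep(interval)
```

Calling probe("/tmp/forkprobe.log") from a script started in the
background would do (the path is just an example). If the log simply
stops at the time of the hang instead of recording failures, the
problem is probably not fork() itself, which is also useful to know.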

From the land "down under": Australia.
Do we look <umop apisdn> from up over?

Do NOT CC me - I am subscribed to the list.
Replies to the sender address will fail except from the list-server.
