On Wed, 16 Sep 2009 13:01:45 +0200 Janne Johansson <j...@it.su.se> wrote:
> paranoid.gand...@googlemail.com wrote:
> >
> > The OS got totally corrupted.
> > gdb, su, sudo do segfault for example.
> > 8<
> > But later my ssh died again and after that the server finally
> > broke down. Beyond the point of what fsck can handle.
> > During auto-fsck the box reboots.
> >
> > A good bug I'd say... ran into it now 2 times in less than
> > 5 hours. And I have no clue why or how I triggered it.
> > 8<
> > If there is more I could tell you please do let me know.
>
> Any of the "My computer has bad hardware" tips seem to apply nicely to
> this kind of symptoms.

That was my first assumption!

If it were bad memory, the recovery process described below would
probably have failed. If it were an HDD issue, I would probably have
faced the problems from the beginning.

After the server first rebooted we were able to log in, but then (after
a while) ssh stopped accepting connections (http still worked...),
physical login was no longer possible either, and right after this the
programs started to segfault like hell. Everything was all right minutes
before, and I used su and sudo too.

I'll have the RAM replaced anyway, but during a check it was fine.
The only faulty thing found was a PSU not delivering a steady 5V on a
5V line. It is being replaced. The hardware will get checked once some
data has been copied from the HDD.

This is no secret, but in case other people have similar problems some
day (no matter why): the server was recovered using an OpenBSD 4.5 CD.

1. Boot from the CD
2. Choose (S)hell
3. Run fsck now (rather than relying on the auto-fsck of the broken box)
4. Use the "update" command to regain valid binaries
5. Reboot and recover your data
6. Do further stuff

Regards,
Gandalf