On Tue, Jun 18, 2013 at 01:59:07PM +0100, Chris Purves wrote:
> After upgrading to wheezy, I get a system hang every one or two days
> where the system becomes completely unresponsive and I need do a
> cold boot.
>
> This is an older machine with an Athlon processor.  I'm not running
> X.  I don't see anything unusual in the logs.  The last entry in
> syslog is typically a cron job, but not always the same one.  The
> system seems to freeze without any warning.

When the system freezes, is there anything useful on the console? If
the kernel craps out, the result may not be visible in the log files
(because things can halt before buffers are flushed etc).  It may be
useful to disable screen blanking on the console for this - the kernel
may (or may not) wake up the console upon death.

(I call that the JFK syndrome: He never knew what hit him).

A couple of candidates spring to mind:

* Overheating? If the system is old, it may be full of dust and thus
  the fans may struggle. Or the bearings get worn out. Insufficient
  airflow and cooling do tend to make things go pop - except for CPUs
  which (I believe) shut themselves down due to a built-in
  self-preservation instinct courtesy of the hardware engineers.

* Struggling power supply? If the power supply is just barely
  providing enough power, random things which require more power may
  cause voltage drops that some component takes a dislike to. Although
  the system *should* draw its peak amount of power during power-on,
  peaks may also occur later.

* Bad RAM? (already covered in a different part of the thread)

* Bad capacitors? Older motherboards are more likely to suffer from
  the capacitors going "pop". A web search for "Capacitor plague" is
  probably more reliable and informative than anything I can achieve
  in this email.

> I tried downgrading the kernel back to the squeeze version (2.6) and
> it still locks up.  Before upgrading to wheezy I resized a few of the
> partitions.  Other than that, nothing else has changed and everything
> had been running fine for years.

Assuming that the resize was healthy, all should be ok. But... Since
there are no clear suspects, paranoia dictates a run of fsck on the
affected file systems. Just in case.
At least it is a harmless check if you can afford the downtime while
the file systems are unmounted.

Hope this helps
--
Karl E. Jorgensen

Archive: http://lists.debian.org/20130618153827.GB15076@hawking
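
P.S. To test the overheating theory, here is a minimal sketch of a
temperature logger. It forces every line to disk with fsync, so the
last reading should survive a hard freeze instead of dying in an
unflushed buffer. Everything here is an assumption rather than
something from your setup: the sysfs path (thermal zones may not exist
on an older Athlon board, in which case a pattern under
/sys/class/hwmon may be the one to try), the millidegree units, and
the function names, which are made up for illustration.

```python
import glob
import os
import sys
import time

# Assumed sensor location; values are expected in millidegrees Celsius.
DEFAULT_PATTERN = "/sys/class/thermal/thermal_zone*/temp"


def read_temps(pattern=DEFAULT_PATTERN):
    """Return {path: degrees Celsius} for every readable sensor file."""
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # sensor vanished or returned junk: skip it
    return temps


def log_once(logfile, pattern=DEFAULT_PATTERN):
    """Append one timestamped line of readings and force it to disk."""
    readings = " ".join(
        "%s=%.1f" % (os.path.basename(os.path.dirname(path)), temp)
        for path, temp in read_temps(pattern).items()
    )
    line = time.strftime("%Y-%m-%d %H:%M:%S") + " " + readings
    with open(logfile, "a") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())  # do not leave the line sitting in a buffer
    return line


# Usage: run with the log file path as the only argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    log_once(sys.argv[1])
```

Run from cron once a minute with a log file path as its argument, the
last line written before a freeze shows whether temperatures were
climbing at the time.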

