Hi,

This morning one of our R&D servers stopped responding (no ssh, no http), and because some tests were urgent I had to hardware-reset it. After the machine came back up, I first checked /var/log/messages:

May 30 06:25:05 arge syslogd 1.4.1#18: restart.
May 30 06:49:46 arge -- MARK --
May 30 07:09:46 arge -- MARK --
May 30 07:29:47 arge -- MARK --
May 30 07:49:47 arge -- MARK --
May 30 08:09:47 arge -- MARK --
May 30 08:29:47 arge -- MARK --
May 30 08:44:36 arge kernel: e100: eth1: e100_watchdog: link down
May 30 08:44:38 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
May 30 08:44:40 arge kernel: e100: eth1: e100_watchdog: link down
May 30 08:44:42 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
May 30 08:38:11 arge syslogd 1.4.1#18: restart.
May 30 08:38:11 arge kernel: klogd 1.4.1#18, log source = /proc/kmsg started.
May 30 08:38:11 arge kernel: Linux version 2.6.18-6-686 (Debian 2.6.18.dfsg.1-18etch5) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Sat May 24 10:24:42 UTC 2008

Because of the "kernel: e100: eth1: ..." lines, I first suspected a connection failure and tried fiddling with the network cable socket. But the logs indicate that this wasn't the problem: the link came back up within a couple of seconds each time.

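For what it's worth, the link transitions can be summarized with a few lines of Python. This is only a rough sketch: the regular expression is my own guess based solely on the e100_watchdog lines quoted above, the function name link_outages is made up for illustration, and the year is assumed because syslog does not record it.

```python
import re
from datetime import datetime

# Assumed pattern, derived only from the e100_watchdog lines quoted above.
LINK_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: e100: (\w+): "
    r"e100_watchdog: link (up|down)"
)

def link_outages(lines, year=2008):
    """Return (interface, down_time, up_time, seconds) for each down/up pair."""
    down_since = {}
    outages = []
    for line in lines:
        m = LINK_RE.match(line)
        if not m:
            continue
        stamp, iface, state = m.groups()
        # syslog timestamps carry no year, so one is assumed here.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if state == "down":
            down_since[iface] = when
        elif iface in down_since:
            start = down_since.pop(iface)
            outages.append((iface, start, when, (when - start).total_seconds()))
    return outages

sample = [
    "May 30 08:44:36 arge kernel: e100: eth1: e100_watchdog: link down",
    "May 30 08:44:38 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex",
    "May 30 08:44:40 arge kernel: e100: eth1: e100_watchdog: link down",
    "May 30 08:44:42 arge kernel: e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex",
]

for iface, start, end, secs in link_outages(sample):
    print(f"{iface}: down {start:%H:%M:%S} -> up {end:%H:%M:%S} ({secs:.0f}s)")
```

On the four eth1 lines quoted above, this reports two outages of 2 seconds each.
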
Moreover, judging by /var/log/syslog, the system seems to have been working properly right up to 08:44:36:

May 30 08:40:01 arge /USR/SBIN/CRON[6611]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
May 30 08:40:01 arge /USR/SBIN/CRON[6614]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
May 30 08:41:01 arge /USR/SBIN/CRON[6630]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
May 30 08:41:01 arge /USR/SBIN/CRON[6632]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
May 30 08:42:01 arge /USR/SBIN/CRON[6654]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
May 30 08:42:01 arge /USR/SBIN/CRON[6655]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
May 30 08:43:01 arge /USR/SBIN/CRON[7039]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
May 30 08:43:01 arge /USR/SBIN/CRON[7040]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
May 30 08:44:01 arge /USR/SBIN/CRON[7417]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
May 30 08:44:01 arge /USR/SBIN/CRON[7420]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)

I checked every log file under /var/log for entries between 08:00:00 and 08:38:00, but found nothing useful.

OTOH, look at these two consecutive lines from the /var/log/messages output:

May 30 08:45:14 arge shutdown[7450]: shutting down for system halt
May 30 08:38:11 arge syslogd 1.4.1#18: restart.

It seems that openntpd somehow failed to synchronize the hardware clock with the time it got from the NTP servers, so after the reboot the clock fell back to an earlier time. Is this expected? If not, how can I fix it?

To summarize: what else should I check to figure out the cause of this problem? (Next time such a failure occurs, I'll try to log in from a terminal.)

Regards.
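
P.S. The backward jump between the shutdown and restart entries can also be spotted mechanically. Below is a minimal Python sketch, assuming classic syslog timestamps in the first 15 characters of each line. The function name backward_jumps and the default year are my own assumptions (syslog does not record the year), so treat it as an illustration rather than a finished tool.

```python
from datetime import datetime

def backward_jumps(lines, year=2008):
    """Yield (previous, current) timestamps wherever log time goes backwards."""
    previous = None
    for line in lines:
        try:
            # First 15 characters of a classic syslog line, e.g. "May 30 08:45:14".
            # The year is an assumption, since the log does not contain it.
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if previous is not None and current < previous:
            yield previous, current
        previous = current

sample = [
    "May 30 08:45:14 arge shutdown[7450]: shutting down for system halt",
    "May 30 08:38:11 arge syslogd 1.4.1#18: restart.",
]

for before, after in backward_jumps(sample):
    seconds = (before - after).total_seconds()
    print(f"clock went back {seconds:.0f}s: {before:%H:%M:%S} -> {after:%H:%M:%S}")
```

On the two lines quoted above it reports a single jump of 423 seconds (08:45:14 back to 08:38:11). To scan a whole file, the same function can be fed an open file object, e.g. open("/var/log/messages").
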