On Tue, Jun 06, 2000 at 10:41:09AM +1000, Marc-Adrian Napoli wrote:
> Hi all, as you can guess from the subject i have a server (debian 2.0,
> pentium 200, 500mb ram, 30 gig or so) that is dying on me at random times
> early in the morning!!
>
> (quite annoying).
>
> I've gathered the following from the logs:
>
> Jun 5 06:43:42 godzilla kernel: EXT2-fs error (device 16:00): ext2_readdir: bad entry in directory #12628: rec_len % 4 != 0 - offset=0, inode=3326860705, rec_len=34410, name_len=31708
> Jun 5 06:43:42 godzilla kernel: Remounting filesystem read-only
> Jun 5 06:43:42 godzilla kernel: EXT2-fs error (device 16:00): ext2_find_entry: bad entry in directory #12628: rec_len % 4 != 0 - offset=0, inode=3326860705, rec_len=34410, name_len=31708
> Jun 5 06:43:42 godzilla kernel: Remounting filesystem read-only
> Jun 5 07:02:50 godzilla kernel: hdc: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> Jun 5 07:02:50 godzilla kernel: hdc: read_intr: error=0x40 { UncorrectableError }, LBAsect=4512574, sector=4512574
> Jun 5 07:02:50 godzilla kernel: end_request: I/O error, dev 16:00, sector 4512574
> Jun 5 07:03:00 godzilla kernel: hdc: irq timeout: status=0xd0 { Busy }
> Jun 5 07:03:01 godzilla kernel: ide1: reset: success
> Jun 5 07:03:08 godzilla kernel: hdc: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> Jun 5 07:03:08 godzilla kernel: hdc: read_intr: error=0x40 { UncorrectableError }, LBAsect=4512720, sector=4512720
> Jun 5 07:03:08 godzilla kernel: end_request: I/O error, dev 16:00, sector 4512720
> Jun 5 07:03:19 godzilla kernel: hdc: irq timeout: status=0xd0 { Busy }
> Jun 5 07:03:21 godzilla kernel: ide1: reset: success
> Jun 5 07:03:31 godzilla kernel: hdc: irq timeout: status=0xd0 { Busy }
> Jun 5 07:03:35 godzilla kernel: ide1: reset: success
> Jun 5 07:03:43 godzilla kernel: hdc: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> Jun 5 07:03:43 godzilla kernel: hdc: read_intr: error=0x01 { AddrMarkNotFound }, LBAsect=4512574, sector=4512574
>
> When the techie on call at that time put a monitor on the box he saw
> "Couldn't get free page..." all the way down the screen and couldn't get a
> prompt. (Forcing us to hard reboot the system).
>
Looks like /dev/hdc is in trouble. While this could be caused by other
stuff on the same IDE cable or on your PCI bus, if all of the messages
are pointing at the same device (they seem to be) it's the most likely
source.

Most likely causes:

- /dev/hdc is dying.
- Overheating. Especially if you have several drives, with maybe not as
  much air space between them as they might prefer, and have only a CPU
  and PSU fan. Especially if there's an extended period of disk activity
  (say, some 7am cron jobs) before things come unstuck.
- Bad or poorly fitted IDE or power cable.

I'd have a close look at the cooling for /dev/hdc and maybe give it more
room or install a drive fan, but I'd probably also source a replacement
drive in case.

John P.
--
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.mdt.net.au/~john
Debian Linux admin & support:technical services
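
If you want to check that every message really does point at the same device, something like the following rough Python sketch could tally the log. It assumes the kernel lines look like the ones quoted in this thread; the function name `summarize` and the regular expressions are only illustrative, not part of any existing tool. The kernel prints device numbers as hex major:minor, so "16:00" is major 22, minor 0, and on Linux major 22 is the second IDE interface, whose first drive (minor 0) is /dev/hdc.

```python
import re
from collections import Counter

# Illustrative patterns, assumed from the log format quoted in this thread.
DRIVE_ERROR = re.compile(r"kernel: (hd[a-z]): \w+: error=0x[0-9a-fA-F]+")
DEV_NUMBER = re.compile(r"\bdev(?:ice)? ([0-9a-fA-F]{2}):([0-9a-fA-F]{2})")
LBA_SECTOR = re.compile(r"LBAsect=(\d+)")


def summarize(lines):
    """Count drive errors per device and collect the failing sectors."""
    drives = Counter()       # e.g. {"hdc": 3}
    dev_numbers = Counter()  # e.g. {(22, 0): 4}, keyed by (major, minor)
    sectors = set()
    for line in lines:
        m = DRIVE_ERROR.search(line)
        if m:
            drives[m.group(1)] += 1
        m = DEV_NUMBER.search(line)
        if m:
            # Device numbers are printed in hex: 16:00 is major 22, minor 0.
            major, minor = int(m.group(1), 16), int(m.group(2), 16)
            dev_numbers[(major, minor)] += 1
        m = LBA_SECTOR.search(line)
        if m:
            sectors.add(int(m.group(1)))
    return drives, dev_numbers, sorted(sectors)
```

Feed it the lines from your kernel log (wherever syslog writes them on your box). If only one drive name and one major:minor pair show up, and the failing sectors sit close together, that fits a single bad drive better than a cable or bus problem shared with other devices.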