Every few days, I get the kernel error "hdX: lost interrupt" where X is usually c or g.
I'm having a hard time finding any systematic way of troubleshooting this problem.

hdg is a brand new drive and ran for a couple of weeks in another system without a blip, so I don't think the drive itself is at fault. No SMART errors are appearing on any of the drives, and I have replaced the ribbon cable connecting the drive to the controller. hdc and hdg, which both occasionally get lost interrupts, are on different controllers, and in fact on different kinds of controllers: one is a VIA vt8235 IDE UDMA133, the other is a RAID controller, Triones Technologies HPT366/368/370/370A/372.

I was using the Debian stock kernel 2.6.8-2-k7; now I'm using a custom-built vanilla 2.6.15.4. I haven't figured out whether there is a real statistical difference in the number of errors between the two. I may be getting them slightly more frequently with 2.6.15.4, but I don't have enough data points to be sure.

I also *seemed* to be getting them more frequently when I had a UPS installed. Since I took the UPS out and plugged the machine directly into a power socket, they seem to be rarer and are not accompanied by any DMA timeout errors, but again I'm not certain this is statistically significant.
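One way to get firmer numbers on frequency would be to tally the messages per drive from the kernel log. Below is a minimal sketch in Python, assuming the log lines contain the text "hdX: lost interrupt" exactly as quoted above. The function name count_lost_interrupts and the sample log lines are hypothetical, made up for illustration only.

```python
import re
from collections import Counter

# Matches the kernel message quoted above, e.g. "hdc: lost interrupt".
LOST_IRQ = re.compile(r"\b(hd[a-z]): lost interrupt")

def count_lost_interrupts(lines):
    """Return a Counter mapping drive name -> number of lost-interrupt lines."""
    counts = Counter()
    for line in lines:
        match = LOST_IRQ.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines; real input would come from the kernel log.
sample = [
    "kernel: hdc: lost interrupt",
    "kernel: hdg: lost interrupt",
    "kernel: hdc: lost interrupt",
    "kernel: eth0: link up",
]

for drive, n in sorted(count_lost_interrupts(sample).items()):
    print(f"{drive}: {n}")
```

Running the same tally over the logs kept from each kernel version, or from the periods with and without the UPS, would give per-drive counts that can be compared against the length of each period.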
/proc/interrupts says:

           CPU0
  0:   32453965          XT-PIC  timer
  1:         16          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  uhci_hcd:usb2
  8:          4          XT-PIC  rtc
 10:    3554483          XT-PIC  ide2, ide3, uhci_hcd:usb3
 11:    9589616          XT-PIC  uhci_hcd:usb1, eth0, eth1
 12:          0          XT-PIC  ehci_hcd:usb4
 14:    2235942          XT-PIC  ide0
 15:    1836402          XT-PIC  ide1
NMI:          0
LOC:   32454287
ERR:      12990
MIS:          0

/proc/ioports:

0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0cf8-0cff : PCI conf1
4000-407f : 0000:00:11.0
5000-500f : 0000:00:11.0
c000-c0ff : 0000:00:0c.0
  c000-c0ff : r8169
c400-c4ff : 0000:00:0e.0
c800-c807 : 0000:00:0f.0
  c800-c807 : ide2
cc00-cc03 : 0000:00:0f.0
  cc02-cc02 : ide2
d000-d007 : 0000:00:0f.0
  d000-d007 : ide3
d400-d403 : 0000:00:0f.0
  d402-d402 : ide3
d800-d8ff : 0000:00:0f.0
  d800-d807 : ide2
  d808-d80f : ide3
  d810-d8ff : HPT372
dc00-dc1f : 0000:00:10.0
  dc00-dc1f : uhci_hcd
e000-e01f : 0000:00:10.1
  e000-e01f : uhci_hcd
e400-e41f : 0000:00:10.2
  e400-e41f : uhci_hcd
e800-e80f : 0000:00:11.1
  e800-e807 : ide0
  e808-e80f : ide1
ec00-ecff : 0000:00:12.0
  ec00-ecff : via-rhine

I have one drive from each controller in a software RAID-5: hda, hdc, hde, and hdh.

Any suggestions for how to go about diagnosing the problem?

-- 
Adam Rosi-Kessel
http://adam.rosi-kessel.org