Re: 7.2-RELEASE-p4, IO errors & RAID1 failure

2010-06-18 Thread Pieter de Boer
Hi Matthew, I'm running 7.2-RELEASE-p4 on an i386 HP server (ML G5) in RAID1 configuration. Very recently, I've seen IO errors such as: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=20472527 reported and the RAID mirror is now offline. ad0: TIMEOUT - WRITE_DMA48 retrying (1 retry left)

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-16 Thread Pieter de Boer
Hi Jeremy, Anyway, if heavy disk/controller load appears to be causing these problems, you could have power-related issues. Possibly the combination of two disks + heavy I/O causes enough power draw that the ICH9 starts to behave oddly. Voltages which deviate too much can cause odd things to

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-15 Thread Pieter de Boer
Attached the SMART output of both disks I replaced about a month ago. It appears I replaced perfectly fine drives with the current disks with errors ;( One of the old disks is in a USB-enclosure now, so 'da0'. Let's send those attachments, then. -- Pieter smartctl 5.39 2009-12-09 r2995 [FreeBS

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-15 Thread Pieter de Boer
Hi, That could be caused by a multitude of other known things. For example, some Western Digital "Green" drives (including the Enterprise class ones) are known to perform head parking/offloading excessively, which could result in the drive spending more time doing that than actually serving ov

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-15 Thread Pieter de Boer
Hi there, what kind of disk I/O is going on. If actual I/O is very little, then something weird is going on with regards to the number of interrupts being seen on IRQ 23. mav@ might have some ideas, otherwise I'd recommend rebooting the machine and seeing if the number drops. If so, it may be

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-15 Thread Pieter de Boer
Hi Terry, I have a bunch of R300's here. From one that is using the on-board SATA and 2 drives in a gmirror setup (very similar to the OP) after 18 hours of uptime: [0:2] speedtest:~> vmstat -i interrupt total rate irq23: atapci0254116

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-15 Thread Pieter de Boer
Hi Jeremy, Lots to say about all of this. Thanks for your elaborate reply, it was very useful to see smartctl output explained a bit :) I still think there's something else in play besides disk failure. I've checked one of the drives I replaced earlier, but that one doesn't have any of the e

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-14 Thread Pieter de Boer
My question: does anyone have experience with FreeBSD on a Dell R300 or can anyone give me some help in trying to fix the timeouts? Could you please do the following: - Provide output from "vmstat -i" - Provide output from "dmesg | grep -i ata" - Install ports/sysutils/smartmontools (5.40 o

Re: Read / write timeouts on SATA disks connected to ICH9

2010-05-14 Thread Pieter de Boer
Adam Vande More wrote: May 5 03:01:37 aberdeen kernel: ad4: timeout waiting to issue command May 5 03:01:37 aberdeen kernel: ad4: error issuing WRITE_DMA48 command May 5 03:01:37 aberdeen kernel: GEOM_MIRROR: Request failed (error=5). ad4[WRITE(offset=200404975104, length=16384)] May 5 03:01
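
For readers trying to correlate the GEOM_MIRROR failure above with SMART data: the logged offset is in bytes, so it can be converted to a sector number on the component disk. What follows is a minimal sketch, not something from the thread. It assumes 512-byte sectors (the thread does not state the sector size), and the helper name parse_geom_failure is made up for illustration.

```python
import re

# Matches the GEOM_MIRROR failure line quoted in the message above.
GEOM_RE = re.compile(
    r"GEOM_MIRROR: Request failed \(error=(?P<errno>\d+)\)\. "
    r"(?P<dev>\w+)\[(?P<op>READ|WRITE)"
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
)

SECTOR_SIZE = 512  # assumption: the thread does not state the sector size


def parse_geom_failure(line, sector_size=SECTOR_SIZE):
    """Hypothetical helper: pull device, operation and sector range from a log line."""
    m = GEOM_RE.search(line)
    if m is None:
        return None
    offset = int(m.group("offset"))  # byte offset on the mirror component
    length = int(m.group("length"))  # request length in bytes
    return {
        "device": m.group("dev"),
        "op": m.group("op"),
        "errno": int(m.group("errno")),  # 5 is EIO (input/output error)
        "first_sector": offset // sector_size,
        "sector_count": length // sector_size,
    }


line = (
    "May 5 03:01:37 aberdeen kernel: GEOM_MIRROR: Request failed (error=5). "
    "ad4[WRITE(offset=200404975104, length=16384)]"
)
info = parse_geom_failure(line)
print(info)
```

Under that assumption, the failed write covers 32 sectors starting at sector 391415967 of ad4. If the errors keep landing on the same sectors, that points towards the disk; if they are scattered, the controller, cabling or power are more likely suspects.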

Read / write timeouts on SATA disks connected to ICH9

2010-05-14 Thread Pieter de Boer
Hi list, I'm running FreeBSD 8.0-RELEASE-p1 on a Dell R300 which has an ICH9 SATA controller on-board (I do not have the RAID controller). The system has 2 disks in a gmirror setup. Every now and then, probably under some load, one of the disks gets read or write timeouts like: May 5 03:01:37