On 10/24/09 9:43 AM, Richard Elling wrote:
> OK, here we see 4 I/Os pending outside of the host. The host has
> sent them on and is waiting for them to return. This means they are
> getting dropped either at the disk or somewhere between the disk
> and the controller.
>
> When this happens, the sd driver will time them out, try to clear
> the fault by reset, and retry. In other words, the resets you see
> are when the system tries to recover.
>
> Since there are many disks with 4 stuck I/Os, I would lean towards
> a common cause. What do these disks have in common? Firmware?
> Do they share a SAS expander?
I saw this with my Hitachi 500GB SATA disks (HDS725050KLA360) and LSI firmware
1.28.02.00 in IT mode, but I (almost?) always had exactly 1 "stuck" I/O. Note
that my disks were one per channel, no expanders. I have _not_ seen it since
replacing those disks. So my money is on a bug in the LSI firmware, the drive
firmware, the drive controller hardware, or some combination thereof.
Note that LSI has released firmware 1.29.00.00. Sadly, I cannot find any
documentation on what has changed. It can be downloaded from LSI at
http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas3081e-r/index.html?remote=1&locale=EN
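For the "what do these disks have in common?" question, here is a rough sketch of how one might narrow it down. It looks for attributes that every disk with stuck I/Os shares and that no healthy disk has. The disk names, firmware strings, expander labels, and the `suspects` helper are all made-up placeholders, not data or tooling from this thread; you would have to fill in the inventory by hand from whatever your system reports.

```python
# Hypothetical inventory: every name and value here is a placeholder.
# "stuck" is the number of I/Os pending outside the host for that disk.
disks = [
    {"name": "c4t0d0", "drive_fw": "FW-A", "expander": "exp0", "stuck": 4},
    {"name": "c4t1d0", "drive_fw": "FW-A", "expander": "exp0", "stuck": 4},
    {"name": "c4t2d0", "drive_fw": "FW-A", "expander": "exp1", "stuck": 0},
    {"name": "c4t3d0", "drive_fw": "FW-B", "expander": "exp1", "stuck": 0},
]


def suspects(disks, keys):
    """Return {attribute: value} pairs that all affected disks share
    and that no healthy disk has."""
    affected = [d for d in disks if d["stuck"] > 0]
    healthy = [d for d in disks if d["stuck"] == 0]
    result = {}
    for key in keys:
        values = {d[key] for d in affected}
        if len(values) != 1:
            continue  # affected disks disagree (or there are none)
        value = next(iter(values))
        if all(d[key] != value for d in healthy):
            result[key] = value
    return result


print(suspects(disks, ["drive_fw", "expander"]))
```

With the placeholder data above, the drive firmware is ruled out because a healthy disk runs the same version, which leaves the expander as the only shared attribute. A real inventory will rarely be this clean, so treat the result as a hint about where to look first, not as proof.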
--
Carson