Antonio S. Cofiño wrote:
> [...]
> The system is a supermicro motherboard X8DTH-6F in a 4U chassis
> (SC847E1-R1400LPB) and an external SAS2 JBOD (SC847E16-RJBOD1).
> It makes a system with a total of 4 backplanes (2x SAS + 2x SAS2)
> each of them connected to a 4 different HBA (2x LSI 3081E-R (1068
> chip) + 2x LSI SAS9200-8e (2008 chip)).
> This system is has a total of 81 disk (2x SAS (SEAGATE ST3146356SS)
> + 34 SATA3 (Hitachi HDS722020ALA330) + 45 SATA6 (Hitachi HDS723020BLA642))
> The issue arise when one of the disk starts to fail making long time
> accesses. After some time (minutes, but I'm not sure) all the disks,
> connected to the same HBA, start to report errors. This situation
> produce a general failure on the ZFS making the whole POOL unavailable.
> [...]

I have been there and gave up in the end [1]. I could reproduce it
(even though it took a bit longer) under most Linux versions (including
with the latest LSI drivers) with the LSI 3081E-R HBA.

Is it just mpt causing the errors, or also mpt_sas?

In a lab environment the LSI 9200 HBA behaved better: I/O only dropped
briefly and then continued on the other disks without generating errors.

I had a lengthy Oracle case on this, but none of the proposed
"workarounds" (some of them from other forums) worked for me at all.
They were:

- disabling NCQ
- adding allow-bus-device-reset=0; to /kernel/drv/sd.conf
- set zfs:zfs_vdev_max_pending=1
- set mpt:mpt_enable_msi=0
- keeping usage below 90%
- running no fmservices (temporarily did "fmadm unload disk-transport")
  or other disk access stuff (smartd?)
- changing retries-timeout via sd.conf for the disks, without any
  success; I ended up doing it via mdb

In the end I knew the bad sector of the "bad" disk, and by simply dd'ing
this sector once or twice to /dev/zero I could easily bring down the
system/pool without any load on the disk system.

General consensus from various people: don't use SATA drives on SAS
backplanes. Some SATA drives might work better, but there seems to be no
guarantee. And even for SAS-SAS, try to avoid SAS1 backplanes.

Markus

[1] Search for "What's wrong with LSI 3081 (1068) + expander + (bad)
    SATA disk?"
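
For anyone who wants to time the single-sector read described above
rather than just run dd, here is a minimal sketch. It is not the
procedure used in the original test. The function name, the 512-byte
sector size and any device path passed in are assumptions for
illustration only.

```python
import os
import time

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def timed_sector_read(path, sector, sector_size=SECTOR_SIZE):
    """Read one sector from `path`; return (data, elapsed_seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        # pread reads at an absolute byte offset without moving the
        # file position, so no separate seek is needed.
        data = os.pread(fd, sector_size, sector * sector_size)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return data, elapsed
```

Called with a raw device path and the known sector number, a read that
takes seconds instead of milliseconds would point at the kind of long
access that triggers the problem. Reads through a buffered block device
may be served from cache and hide the delay.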