I found this thread after fighting the same problem in Nexenta, which uses the
OpenSolaris kernel from b104.  Thankfully, I think I have solved my problem,
at least for the moment.

Background:

I have an LSI 3081e-R (1068E based) adapter which experiences the same 
disconnected command timeout error under relatively light load.  This card 
connects to a Supermicro chassis using 2 MiniSAS cables to redundant expanders 
that are attached to 18 SAS drives.  The card ran the latest IT firmware 
(1.29?).

This server is a new install, and even installing from the CD to two disks in a 
mirrored ZFS root would randomly cause the disconnect error.  The system 
remained unresponsive until after a reboot.

I tried the workarounds mentioned in this thread, namely adding
"set mpt:mpt_enable_msi = 0" and "set xpv_psm:xen_support_msi = -1" to
/etc/system.  Once I added those lines, the system never became fully
unresponsive; however, partial read and partial write messages littered dmesg.
At one point there appeared to be a disconnect error (I cannot confirm this)
that the system recovered from.
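
For reference, this is roughly how those two lines looked in my /etc/system.
Treat it as a sketch: the comment line is my own annotation (comments in
/etc/system start with "*"), and as far as I know a reboot is needed before
changes to that file take effect.

    * Workarounds from this thread for the mpt disconnected command timeout
    set mpt:mpt_enable_msi = 0
    set xpv_psm:xen_support_msi = -1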

Eventually, I became desperate and flashed the IR (Integrated RAID) firmware
over the top of the IT firmware.  Since then, I have had no errors in dmesg of
any kind.

I even removed the workarounds from /etc/system and still have had no issues.  
The mpt driver is exceptionally quiet now.

I'm interested to know if anyone who has a 1068E-based card is having these
problems using the IR firmware, or if they all seem to be related to the IT
(Initiator Target) firmware.
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
