2011-06-11 19:15, Pasi Kärkkäinen wrote:
On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
I've had two incidents where performance tanked suddenly, leaving the VM
guests and Nexenta SSH/Web consoles inaccessible and requiring a full
reboot of the array to restore functionality. In both cases, it was the
Intel X25-M L2ARC SSD that failed or was "offlined". NexentaStor failed to
alert me to the cache failure; however, the general ZFS FMA alert was
visible on the (unresponsive) console screen.
The "zpool status" output showed:
cache
c6t5001517959467B45d0 FAULTED 2 542 0 too many errors
This did not trigger any alerts from within Nexenta.
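
Since the appliance raised no alert, one stopgap is to scan the "zpool status"
text for any device that is not ONLINE. The following is only a minimal sketch:
the pool name "tank", the device "c0t0d0", the function name find_unhealthy and
the exact column layout are assumptions for illustration; only the cache line
is taken from the output above.

```python
import re

# Device states zpool status can report; anything other than ONLINE is flagged.
STATES = ("ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED")
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>" + "|".join(STATES) + r")"
    r"\s+(?P<read>\S+)\s+(?P<write>\S+)\s+(?P<cksum>\S+)\s*(?P<note>.*)$"
)
# Bare keywords that introduce auxiliary vdev groups in the config listing.
SECTIONS = {"cache", "logs", "spares"}


def find_unhealthy(status_text):
    """Return one dict per device row whose state is not ONLINE."""
    section = "data"  # hypothetical label for rows before any keyword line
    problems = []
    for line in status_text.splitlines():
        word = line.strip()
        if word in SECTIONS:
            section = word
            continue
        match = ROW.match(line)
        if match and match.group("state") != "ONLINE":
            row = match.groupdict()
            row["section"] = section
            problems.append(row)
    return problems


# Hypothetical sample; only the cache row comes from the report above.
SAMPLE = """\
        NAME                     STATE     READ WRITE CKSUM
        tank                     ONLINE       0     0     0
          c0t0d0                 ONLINE       0     0     0
        cache
          c6t5001517959467B45d0  FAULTED      2   542     0  too many errors
"""

if __name__ == "__main__":
    for p in find_unhealthy(SAMPLE):
        print("{section}: {name} is {state} ({note})".format(**p))
```

In practice the text would come from running "zpool status" (for example via
subprocess.run) from cron, with the result mailed or logged; that wiring is
left out here.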
I was under the impression that an L2ARC failure would not impact the
system. But in this case, it was the culprit. I've never seen any
recommendations to RAID L2ARC for resiliency. Removing the bad SSD
entirely from the server got me back running, but I'm concerned about the
impact of the device failure and the lack of notification from
NexentaStor.
IIRC there was a recent discussion on this list about a firmware bug
in the Intel X25 SSDs that causes them to fail under high disk I/O with
"reset storms".
Even if so, this does not excuse ZFS hanging - especially
if it detected the drive failure, and especially if this drive
is not required for redundant operation.
I've seen similar bad behaviour on my oi_148a box when
I tested USB flash devices as L2ARC caches and
occasionally they died by slightly moving out of the
USB socket due to vibration or whatever reason ;)
Similarly, this oi_148a box hung upon loss of the SATA
connection to a drive in the raidz2 disk set due to
unreliable cable connectors. At most it should have
stalled I/O to that pool, and the rest of the system
should have remained responsive (tested with
failmode=continue and failmode=wait on different
occasions).
So I can relate - these things happen, they do annoy,
and I hope they will be fixed sometime soon so that
ZFS matches its docs and promises ;)
//Jim Klimov
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss