Lida Horn wrote:
Richard Elling wrote:
There are known issues with the Marvell drivers in X4500s. You will
want to pay attention to the release notes, SRDBs, InfoDocs, and SunAlerts
for the platform.
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/SunFireX4500/SunFireX4500
You will especially want to pay attention to SunAlert 201289:
http://sunsolve.sun.com/search/document.do?assetkey=1-66-201289-1
If you run into these or other problems that are not already described
in the above documents, please log a service call; that will put you in
touch with the folks who track the platform's problems specifically and
know about patches in the pipeline.
-- richard
Although I am no longer in the SATA group, I have in the past tested
hot-plugging and failure of SATA disks on X4500s, Marvell plug-in cards,
and SuperMicro plug-in cards, and it has worked on all of those
platforms. Having said that, here are some things you might be hitting
or might want to try.
1) The default behavior when a disk is removed and then re-inserted is
to leave the disk unconfigured. The operator must issue
cfgadm -c configure sata<x>/<y> to bring the newly plugged-in disk
online. There was some work being done to make this automatic, but I am
not currently aware of the state of that work.
As of build 94, it does not automatically bring the disk online. I
replaced a failed disk on an x4500 today running Nevada build 94, and
still had to manually issue
# cfgadm -c configure sata1/3
# zpool replace tank cxt2d0
and then wait 7 hours for the resilver. But the above is correct and
expected; they simply have not automated that step yet, apparently.
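Spelled out, the full replacement sequence looks roughly like the
following (a sketch only; sata1/3, c1t3d0, and the pool name tank are
placeholders for whatever port, device, and pool you are actually
working with):

# cfgadm | grep sata1/3      (new disk should show as unconfigured)
# cfgadm -c configure sata1/3
# cfgadm | grep sata1/3      (should now show as configured)
# zpool replace tank c1t3d0
# zpool status tank          (watch the resilver progress here)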
Neal
2) There were bugs related to disk drive errors that were addressed
several months ago. If you are running old code, you could be hitting
one or more of those issues.
3) I think there was a change in the sata generic module with respect
to when it declares a failed disk as "off-line". You might want to
check whether you are hitting a problem with that.
4) There are a significant number of bugs in ZFS that can cause hangs.
Most have been addressed by recent patches, so make sure you have all
the patches.
To rule ZFS out, use the raw disk (i.e. no ZFS involvement) by doing
something like
dd bs=128k if=/dev/rdsk/c<x>t<y>d0p0 of=/dev/null
and then pulling out the disk. The dd should return with an I/O error
virtually immediately; if it doesn't, then ZFS is probably not the
issue. You can also issue the command "cfgadm" and see what it lists as
the state(s) of the various disks.
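A concrete version of that test (the device and port names are
placeholders, and the cfgadm output below is from memory, so treat it
as approximate rather than exact):

# dd bs=128k if=/dev/rdsk/c1t3d0p0 of=/dev/null
   (pull the disk while dd is running; it should terminate with an
    I/O error almost immediately)
# cfgadm | grep sata1/3
sata1/3        sata-port      empty        unconfigured   ok
   (a healthy, configured disk would instead show something like
    "disk  connected  configured  ok")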
Hope that helps,
Lida Horn
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss