Lida Horn wrote:
Richard Elling wrote:
There are known issues with the Marvell drivers in X4500s.  You will
want to pay attention to the release notes, SRDBs, InfoDocs, and SunAlerts
for the platform.
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/SunFireX4500/SunFireX4500

You will especially want to pay attention to SunAlert 201289:
http://sunsolve.sun.com/search/document.do?assetkey=1-66-201289-1

If you run into these or other problems that are not already described
in the above documents, please log a service call; that will get you to
the folks who track the platform problems specifically and know about
patches in the pipeline.
 -- richard

Although I am no longer in the SATA group, I have in the past tested hot-plugging and failures of SATA disks with x4500s, Marvell plug-in cards, and SuperMicro plug-in cards. It has worked in the past on all of these platforms. Having said that, there are things that you might be hitting or might try.

1) The default behavior when a disk is removed and then re-inserted is to leave the disk unconfigured. The operator must issue a cfgadm -c configure sata<x>/<y> to bring the newly plugged-in disk on-line. There was some work being done to make this automatic, but I am not currently aware of the state of that work.

As of build 94, it does not automatically bring the disk online. I replaced a failed disk on an x4500 today running Nevada build 94, and still had to manually issue:

# cfgadm -c configure sata1/3
# zpool replace tank cxt2d0

then wait 7 hours for the resilver. But the above is correct and expected; they simply have not automated that step yet, apparently.
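
For what it's worth, the whole sequence on my box looked roughly like this (sata1/3, tank and cxt2d0 are specific to my setup, and the status check is just the usual way of watching the resilver):

# cfgadm -al | grep sata1/3        # confirm the port sees the replacement disk
# cfgadm -c configure sata1/3      # bring the new disk on-line
# zpool replace tank cxt2d0        # start the resilver onto the new disk
# zpool status tank                # re-run periodically to watch resilver progress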

Neal


2) There were bugs related to disk drive errors that have been addressed (several months ago). If you have old code, you could be hitting one or more of those issues.

3) I think there was a change in the sata generic module with respect to when it declares a failed disk "off-line". You might want to check whether you are hitting a problem with that.

4) There are a significant number of bugs in ZFS that can cause hangs. Most have been addressed with recent patches, so make sure you have all the patches.
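
A quick way to confirm what you are actually running (these are just the standard Solaris commands, nothing X4500-specific; the patch list is only meaningful on Solaris 10):

# cat /etc/release        # distribution/build string
# uname -v                # kernel version, e.g. the Nevada build number
# showrev -p              # list of installed patches (Solaris 10)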

If you use the raw disk (i.e. no ZFS involvement), do something like dd bs=128k if=/dev/rdsk/c<x>t<y>d0p0 of=/dev/null and then try pulling out the disk. The dd should return with an I/O error virtually immediately. If it doesn't, then ZFS is probably not the issue. You can also issue the command "cfgadm" and see what it lists as the state(s) of the various disks.
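
Concretely, the test looks something like this (keeping the c<x>t<y> placeholders from above; the exact cfgadm states you see will vary with the driver and build):

# dd bs=128k if=/dev/rdsk/c<x>t<y>d0p0 of=/dev/null    # sequential read of the raw disk
  (pull the disk while the dd is running; it should fail with an I/O error almost immediately)
# cfgadm | grep sata                                   # the affected port should no longer show as configured/ok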

Hope that helps,
Lida Horn
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
