>>>>> "bh" == Brandon High <bh...@freaks.com> writes:
bh> Recent versions no longer support enabling TLER or ERC. To
bh> the best of my knowledge, Samsung and Hitachi drives all
bh> support CCTL, which is yet another name for the same thing.

Once again I have to ask: has anyone actually found these features to make a verified positive difference with ZFS? Some of them you cannot even set on Solaris, because the channel to the drive through an LSI controller isn't sufficiently transparent to support smartctl, and the settings don't survive reboots. Brandon, have you actually set it yourself, or are you just aggregating forum discussion?

The experience I've read here so far has been:

 * if a drive goes completely bad:

   + ZFS will mark the drive unavailable after a delay that depends on
     the controller driver (it's part of the Solaris code), with lengths
     like 60 seconds, 180 seconds, 2 hours, or forever. The delay is not
     sane or reasonable with all controllers, and even when redundancy
     is available ZFS will patiently wait for the controller. In the
     best case the zpool will freeze until the delay is up, but there
     are application timeouts and iSCSI initiator-target timeouts,
     too---getting the equivalent of an NFS hard mount is hard these
     days (even with NFS, in some people's experience).

   + the delay differs depending on whether the system is running when
     the drive fails or is trying to boot up. For example, iSCSI will
     ``patiently wait'' forever for a drive to appear while booting,
     but will notice after 180 seconds while running.

   + because the disk is completely bad, TLER, ERC, CCTL, whatever you
     call it, doesn't apply. The drive might never answer commands at
     all. The timer is not in the drive: the drive is bad starting now,
     continuing forever.
 * if a drive goes partially bad (large and increasing numbers of
   latent sector errors, which for me happens more often than
   completely bad):

   + the zpool becomes unusably slow

   + it stays unusably slow until you use 'iostat' or 'fmdump' to find
     the marginal drive and offline it

   + TLER, ERC, CCTL changes the slowness factor from 7ms : 30000ms to
     7ms : 7000ms. In other words, it's unusably slow with or without
     the feature.

AFAICT the feature is useful as a workaround for buggy RAID card firmware and nothing else. It's a cost differentiator, and you're swallowing it hook, line and sinker. If you know otherwise, please reinform me, but the discussion here so far doesn't match what I've learned about ZFS and Solaris exception handling.

That said, to reword Don Marti: ``uninformed Western Digital bashing is better than no Western Digital bashing at all.''
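For anyone who wants to experiment anyway: on drives and controllers where smartctl can actually reach the disk, the ERC timers can be queried and set through SCT Error Recovery Control. A minimal sketch of the commands involved (the device path is a placeholder, the value 70 is in tenths of a second, i.e. 7.0s, and as noted above the setting does not survive a power cycle):

```shell
#!/bin/sh
# Sketch: build the smartctl invocations for SCT Error Recovery
# Control (the ATA feature behind TLER/ERC/CCTL).
# DEV is a hypothetical device path; adjust for your system.
DEV=/dev/rdsk/c0t0d0

erc_cmds() {
    dev=$1; tenths=$2
    # Query the current read/write recovery timers:
    echo "smartctl -l scterc $dev"
    # Set both timers to $tenths tenths of a second
    # (volatile: lost on the next power cycle):
    echo "smartctl -l scterc,$tenths,$tenths $dev"
}

# Print the commands rather than running them, so nothing is
# touched on a machine where the channel isn't transparent:
erc_cmds "$DEV" 70
```

Whether those commands get through at all depends on the controller; with the LSI setups mentioned above they often don't.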
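On the find-the-marginal-drive step: the usual trick is to watch 'iostat -xn' for a device whose average service time (asvc_t) is wildly out of line with its siblings, then cross-check with 'fmdump -eV' before offlining it. A rough awk filter over that output; the 100ms threshold and the column positions are assumptions based on the Solaris 'iostat -xn' format, not gospel:

```shell
#!/bin/sh
# Flag devices whose average service time (asvc_t, column 8 of
# 'iostat -xn' output) exceeds a threshold in milliseconds.
# Threshold and column layout are assumptions; adjust to taste.
flag_slow() {
    awk -v limit=100 '
        # data lines start with a number; headers do not.
        # the device name is the last field on the line.
        $1 ~ /^[0-9.]+$/ && $8 + 0 > limit {
            printf "%s asvc_t=%sms\n", $NF, $8
        }'
}

# Typical use: iostat -xn 5 | flag_slow
```

A marginal drive usually stands out by two or three orders of magnitude, so the exact threshold matters less than having one.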
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss