Hi Todd,

Having finally gotten the time to read through this entire thread, I think Ralf said it best: ZFS can provide data integrity, but you're reliant on hardware and drivers for data availability.

In this case, either your SATA controller or its drivers don't cope at all well with a device going offline, so what you need is a SATA card that can handle that. Provided you have a controller that can cope with disk errors, it should be able to return the appropriate status information to ZFS, which will in turn ensure your data is OK. The technique obviously works, or Sun's x4500 servers wouldn't be doing anywhere near as well as they are. The problem we all seem to be having is finding white box hardware that supports it.

I suspect your best bet would be to pick up a SAS controller based on the LSI chipsets used in the new x4540 server. There's been a fair bit of discussion here on these, and while there's a limitation in that you will have to manually keep track of drive names, I would expect it to handle disk failures (and pulling disks) much better. You would probably be well advised to ask the folks on the forums running those SAS controllers whether they've been able to pull disks successfully.

I think the solution you need is definitely to get a better disk controller, and your choice is either a plain SAS controller or a RAID controller that can present individual disks in pass-through mode, since those *definitely* are designed to handle failures.

Ross

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss