[EMAIL PROTECTED] wrote:
> War wounds? Could you please expand on the why a bit more?
- ZFS is not aware of AVS. On the secondary node, you'll always have to force the `zpool import` because of metadata changes that went unnoticed (zpool marked as in use). No mechanism to prevent data loss exists, e.g. zpools can be imported while the replicator is *not* in logging mode (a minimal command sketch is appended at the end of this mail).

- AVS is not ZFS aware. For instance, if ZFS resilvers a mirrored disk, e.g. after replacing a drive, the complete disk is sent over the network to the secondary node, even though the replicated data on the secondary is intact. That's a lot of fun with today's disk sizes of 750 GB and 1 TB drives, resulting in usually 10+ hours without real redundancy (customers who use Thumpers to store important data usually don't have the budget to connect their data centers with 10 Gbit/s, so expect 10+ hours *per disk*).

- ZFS & AVS & X4500 leads to bad error handling. The zpool cannot be imported on the secondary node while replication is running, and the X4500 does not have a RAID controller which signals (and handles) drive faults. Drive failures on the secondary node may therefore go unnoticed until the primary node goes down and you want to import the zpool on the secondary node with the broken drive. Since ZFS doesn't offer a recovery mechanism like fsck, data loss of up to 20 TB may occur. If you use AVS with ZFS, make sure that you have storage which handles drive failures without OS interaction.

- 5 hours for scrubbing a 1 TB drive. If you're lucky. Up to 48 drives in total.

- An X4500 has no battery-buffered write cache. ZFS uses the server's RAM as a cache, 15 GB+. I don't want to find out how much time a resilver over the network after a power outage may take (a full reverse replication would take up to 2 weeks and is not a valid option in a serious production environment). But the underlying question I asked myself is: why would I want to replicate data in such an expensive way when the 48 TB of data themselves are apparently not important enough to be protected by a battery?

- I gave AVS a set of 6 drives just for the bitmaps (using SVM soft partitions). They weren't enough; the replication was still very slow, probably because of an insane amount of head movement, and it scales badly. Putting the bitmap of a drive on the drive itself (if I remember correctly, this is recommended in one of the most referenced howto blog articles) is a bad idea. Always use ZFS on whole disks if performance and caching matter to you.

- AVS seems to require additional shared storage when building failover clusters with 48 TB of internal storage. That may be hard to explain to the customer. But I'm not 100% sure about this, because I just didn't find a way myself and didn't ask on a mailing list for help.

If you want a fail-over solution for important data, use the external JBODs. Use AVS only to mirror complete clusters; don't use it to replicate single boxes with local drives. And, in case OpenSolaris is not an option for you due to your company policies or support contracts, building a real cluster is also A LOT cheaper.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe
Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Oliver Mauss, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
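
PS: Since the forced import question comes up regularly, here is a minimal sketch of a secondary-side takeover, assuming a pool named "tank" backed by the replicated volumes (the pool name is a placeholder, and the exact sndradm invocations may differ slightly depending on your AVS release):

  # Put the SNDR sets into logging mode first, so the replicator stops
  # writing to the secondary volumes before we touch the pool.
  sndradm -n -l

  # Verify that every set really reports "logging" before continuing.
  sndradm -P

  # The pool metadata still claims it is in use by the primary host,
  # so the import has to be forced. Nothing stops you from running this
  # while the sets are still replicating -- which is exactly the problem.
  zpool import -f tank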