We've been evaluating ZFS as a possible enterprise file system for our campus. Initially, we were considering one large cluster, but it doesn't look like that will scale to meet our needs. So, now we are thinking about breaking our storage across multiple servers, probably three.
However, I don't necessarily want to incur the expense and hassle of maintaining three clusters, so I think I might run three standalone servers instead. If one of them happens to break, we're only down a third of our files, not all of them. Given our budget, that's probably an acceptable compromise.

On the other hand, it would be nice to have some level of redundancy, so I'm toying with the idea of having each server be primary for one set of storage and secondary for a different set. Each server would use zfs send to replicate snapshots to its backup server.

I've read a number of threads and blog posts discussing zfs send/receive and its applicability to such an implementation, but I'm curious whether anyone has actually done something like that in practice, and if so, how well it worked.

What authentication/authorization did you use to transfer the zfs snapshots between servers? I'm thinking about using ssh with public-key authentication over an internal private network, which the servers would connect to through different ethernet interfaces than the ones facing the world and actually serving files.

Does zfs send/receive have to be done with root privileges, or can RBAC or some other mechanism be used so that a lower-privileged account can do it?

In the various threads I read about this type of failover, there was some issue about having to mark the filesystems readonly on the secondary, or else changes made there would cause later snapshot receives to fail. Supposedly a feature was added to zfs receive to rectify this problem. Did that make it into S10U4, or is it still only in the development version?

Did you have automatic or manual failover? I'm thinking about having a manual failover process. Since the replication is only one way, if failover were automatic and the secondary server started providing service, updates would happen there that would not be on the primary server if it suddenly came back to life and took over again.
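For concreteness, here is a rough sketch of the replication cycle and manual failover step I have in mind, written as a small Python helper that only builds the command lines rather than running them. The dataset, snapshot, and host names are invented for illustration, and it assumes a zfs receive that accepts -F to roll the target back before applying the increment, which may not be present in every release.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, backup_host, target):
    """Build the commands for one incremental replication cycle.

    All names passed in are hypothetical; nothing is executed here.
    """
    prev_full = f"{dataset}@{prev_snap}"
    new_full = f"{dataset}@{new_snap}"

    # 1. Take a new snapshot on the primary.
    snapshot_cmd = ["zfs", "snapshot", new_full]

    # 2. Send only the changes between the previous and the new snapshot.
    send_cmd = ["zfs", "send", "-i", prev_full, new_full]

    # 3. Receive on the secondary over ssh (public-key auth assumed).
    #    -F is assumed to be supported: it rolls the target back to its
    #    most recent snapshot before applying the increment.
    recv_cmd = ["ssh", backup_host, "zfs", "receive", "-F", target]

    pipeline = f"{shlex.join(send_cmd)} | {shlex.join(recv_cmd)}"
    return [shlex.join(snapshot_cmd), pipeline]


def readonly_command(target, readonly=True):
    """Command to run on the secondary.

    readonly=True keeps the replica from drifting between receives;
    readonly=False is the manual failover step that opens it for writes.
    """
    value = "on" if readonly else "off"
    return shlex.join(["zfs", "set", f"readonly={value}", target])


if __name__ == "__main__":
    for cmd in replication_commands(
        "tank/home", "rep-001", "rep-002", "zfs-b.internal", "backup/home"
    ):
        print(cmd)
    print(readonly_command("backup/home", readonly=False))
```

Keeping this as a dry run makes it easy to review exactly what would be executed on each server before wiring it into cron or handing it to a less privileged account.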
How did you implement the failover at the network level? DNS change? Virtual IP address switched from one server to the other?

Thanks much for any feedback...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | [EMAIL PROTECTED]
California State Polytechnic University | Pomona CA 91768