Hi. Cheers for your reply.
Unfortunately there's too much data for snapshots to be practical. The data set will be at least 400GB initially, and the offsite node will be on a 20Mbit leased line.

However, I don't need the consistency level to be quorum for reads/writes in the production cluster, so am I right in still assuming that a replication factor of 2 in a three-node cluster allows for one node to die without data loss?

If that's the case, I still don't understand how to ensure that the offsite node will get a copy of the whole data set. I've read through the O'Reilly book, and that doesn't seem to address this scenario (unless I still don't get the Cassandra basics at a fundamental level).

Does anyone know of any tutorials/examples of such a set-up that would help me out?

Cheers,

Brian

On Tue, 2011-03-29 at 21:56 +1100, aaron morton wrote:
> Be aware that at RF 2 the Quorum is 2, so you cannot afford to lose a
> replica when working at Quorum. 3 is really the starting point if you
> want some redundancy.
>
> If you want to get your data offsite how about doing snapshots and
> moving them off site
> http://wiki.apache.org/cassandra/Operations#Consistent_backups
>
> The guide from Data Stax will give you a warm failover site, which
> sounds a bit more than what you need.
>
> Hope that helps.
> Aaron
>
> On 28 Mar 2011, at 22:47, Brian Lycett wrote:
>
> > Hello.
> >
> > I'm setting up a cluster that has three nodes in our production rack.
> > My intention is to have a replication factor of two for this.
> > For disaster recovery purposes, I need to have another node (or two?)
> > off-site.
> >
> > The off-site node is entirely for the purpose of having an offsite
> > backup of the data - no clients will connect to it.
> >
> > My question is, is it possible to configure Cassandra so that the
> > offsite node will have a full copy of the data set?
> > That is, somehow guarantee that a replica of all data will be written
> > to it, but without having to resort to an ALL consistency level for
> > writes?
> > Although the offsite node will be on a 20Mbit leased line, I'd rather
> > not have the risk that the link goes down and breaks the cluster.
> >
> > I've seen this suggestion here:
> > http://www.datastax.com/docs/0.7/operations/datacenter#disaster
> > but that configuration is vulnerable to the link breaking, and uses
> > four nodes in the offsite location.
> >
> > Regards,
> >
> > Brian
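
A side note on the replication arithmetic discussed in this thread: the sketch below works through the numbers for RF 2 and RF 3. It assumes the usual definition of quorum as floor(RF / 2) + 1, and the function names are hypothetical helpers written for illustration, not part of any Cassandra API. It only counts replicas per row; it says nothing about where replicas are placed.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_that_can_be_down(replication_factor: int, required_acks: int) -> int:
    """Replicas of a row that may be unavailable while still reaching required_acks."""
    return replication_factor - required_acks


if __name__ == "__main__":
    for rf in (2, 3):
        q = quorum(rf)
        at_quorum = replicas_that_can_be_down(rf, q)
        at_one = replicas_that_can_be_down(rf, 1)  # consistency level ONE needs 1 ack
        print(f"RF={rf}: QUORUM={q}, "
              f"can lose {at_quorum} replica(s) at QUORUM, {at_one} at ONE")
```

Under these assumptions, RF 2 leaves no slack at QUORUM but still allows one replica of each row to be down when reading and writing at ONE, which is the case asked about above.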