> Assume you have four nodes and a snapshot is taken. The following day, if a
> node goes down and data is corrupted through user error, how do you use the
> previous night's snapshots?

I'm not sure what is corrupt here: the snapshot/backup itself, or the live data made incorrect by an application error.
> Would you replace the faulty node first and then restore last night's
> snapshot? What happens if you don't have a replacement node? You won't be
> able to restore last night's snapshot.

You would need to stop the entire cluster and restore the snapshots on all nodes. If you restored the snapshot on just one node, whether new or old hardware, it would hold some data with older timestamps than the other nodes. Cassandra would see this as an inconsistency, as if the restored node had missed some writes, and would resolve it by using the most recent values.

> However if a virtual datacenter consisting of a backup node is used then the
> backup node could be used regardless of the number of nodes in the datacentre.

It depends on the failure scenario and what you are trying to protect against.

If you have 4 nodes and one node fails, the best thing to do is start a new node and let Cassandra stream the data from the other nodes. The new node could have the same token as the failed node. So long as the /var/lib/cassandra/data/system dir is empty (and the node is not a seed) it will join the cluster and ask the others for data.

If you want to ensure availability then consider bigger clusters, e.g. 6 nodes with RF 3 allows you to lose up to 2 nodes and stay up. Or a higher RF. (See http://thelastpickle.com/2011/06/13/Down-For-Me/)

It's tricky to protect against application errors creating bad data using just backups. You may need to look at how you can replay events in your system, and consider which parts of your data model should be directly mutated and which should be indirectly mutated by recording changes in another part of the model.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 25/03/2013, at 8:19 AM, Jabbar Azam <aja...@gmail.com> wrote:

> Thanks Aaron. I have a hypothetical question.
>
> Assume you have four nodes and a snapshot is taken.
> The following day, if a node goes down and data is corrupted through user
> error, how do you use the previous night's snapshots?
>
> Would you replace the faulty node first and then restore last night's
> snapshot? What happens if you don't have a replacement node? You won't be
> able to restore last night's snapshot.
>
> However if a virtual datacenter consisting of a backup node is used then the
> backup node could be used regardless of the number of nodes in the
> datacentre. Would there be any disadvantages to this approach? Sorry for the
> questions, I want to understand all the options.
>
> On 24 Mar 2013 17:45, "aaron morton" <aa...@thelastpickle.com> wrote:
>> There are advantages and disadvantages in both approaches. What are people
>> doing in their production systems?
> Generally a mix of snapshots+rsync or https://github.com/synack/tablesnap to
> get things off node.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/03/2013, at 4:37 AM, Jabbar Azam <aja...@gmail.com> wrote:
>
>> Hello,
>>
>> I've been experimenting with Cassandra for quite a while now.
>>
>> It's time for me to look at backups but I'm not sure what the best practice
>> is. I want to be able to recover the data to a point in time before any user
>> or software errors.
>>
>> We will have two datacentres with 4 servers and RF=3.
>>
>> Each datacentre will have at most 1.6 TB (includes replication, LZ4
>> compression, using test data) of data. That is ten years of data, after which
>> we will start purging. This amounts to about 400 MB of data generated per
>> day.
>>
>> I've read about users doing snapshots of individual nodes to S3 (Netflix) and
>> I've read about creating virtual datacentres
>> (http://www.datastax.com/dev/blog/multi-datacenter-replication) where each
>> virtual datacentre contains a backup node.
>>
>> There are advantages and disadvantages in both approaches.
>> What are people doing in their production systems?
>>
>> --
>> Thanks
>>
>> Jabbar Azam
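
Aaron's point in the reply at the top of this thread, that a snapshot restored on only one node is overridden by the newer values still held on the other replicas, can be illustrated with a toy model of timestamp-based reconciliation. This is only a minimal Python sketch under simplifying assumptions, not Cassandra's actual code: `Cell`, `reconcile`, and the sample values and timestamps are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Toy stand-in for one column value as held by one replica (hypothetical)."""
    value: str
    timestamp: int  # write timestamp; a larger number means a later write


def reconcile(replica_cells):
    """Pick the cell with the highest timestamp (last write wins)."""
    return max(replica_cells, key=lambda cell: cell.timestamp)


# Day 1: a good value is written and captured in the nightly snapshot.
good = Cell(value="good", timestamp=1)
# Day 2: an application error overwrites it with a bad value on every replica.
bad = Cell(value="bad-from-app-error", timestamp=2)

# Restoring the snapshot on a single node: two replicas still hold the newer,
# bad write, so reconciliation brings the bad value back.
single_node_restore = reconcile([good, bad, bad])

# Stopping the cluster and restoring the snapshot on all nodes: no replica
# holds the newer write any more, so the good value survives.
all_nodes_restore = reconcile([good, good, good])

print(single_node_restore.value)
print(all_nodes_restore.value)
```

In this model the restored node looks exactly like a node that missed the day-2 write, which is why the snapshot has to be restored on every replica to undo an application error.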