Thank you for your feedback. I'll speak to the dev guys and come up with something appropriate.

On 26 Mar 2013 17:51, "aaron morton" <aa...@thelastpickle.com> wrote:
> Assume you have four nodes and a snapshot is taken. The following day if a node goes down and data is corrupt through user error then how do you use the previous night's snapshots?
>
> Not sure what is corrupt, the snapshot/backup or the data is incorrect through application error.
>
> Would you replace the faulty node first and then restore last night's snapshot? What happens if you don't have a replacement node? You won't be able to restore last night's snapshot.
>
> You would need to stop the entire cluster, and restore the snapshots on all nodes.
> If you restored the snapshot on just one node, new or old HW, it would have some data with an older timestamp than the other nodes. Cassandra would see this as an inconsistency, that the restored node missed some writes, and resolve the inconsistency with the most recent values.
>
> However if a virtual datacenter consisting of a backup node is used then the backup node could be used regardless of the number of nodes in the datacentre.
>
> It depends on the failure scenario and what you are trying to protect against.
>
> If you have 4 nodes and one node fails the best thing to do is start a new node and let Cassandra stream the data from the other nodes. The new node could have the same token as the previous failed node. So long as the /var/lib/cassandra/data/system dir is empty (and the node is not a seed) it will join the cluster and ask the others for data.
>
> If you want to ensure availability then consider bigger clusters, e.g. 6 nodes with RF 3 allows you to lose up to 2 nodes and stay up. Or a higher RF. (see http://thelastpickle.com/2011/06/13/Down-For-Me/)
>
> It's tricky to protect against application error creating bad data using just backups. You may need to look at how you can replay events in your system and consider which parts of your data model should be directly mutated and which should be indirectly mutated by recording changes in another part of the model.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/03/2013, at 8:19 AM, Jabbar Azam <aja...@gmail.com> wrote:
>
> Thanks Aaron. I have a hypothetical question.
>
> Assume you have four nodes and a snapshot is taken. The following day if a node goes down and data is corrupt through user error then how do you use the previous night's snapshots?
>
> Would you replace the faulty node first and then restore last night's snapshot? What happens if you don't have a replacement node? You won't be able to restore last night's snapshot.
>
> However if a virtual datacenter consisting of a backup node is used then the backup node could be used regardless of the number of nodes in the datacentre. Would there be any disadvantages to this approach? Sorry for the questions, I want to understand all the options.
>
> On 24 Mar 2013 17:45, "aaron morton" <aa...@thelastpickle.com> wrote:
>
>> There are advantages and disadvantages in both approaches. What are people doing in their production systems?
>>
>> Generally a mix of snapshots+rsync or https://github.com/synack/tablesnap to get things off node.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 23/03/2013, at 4:37 AM, Jabbar Azam <aja...@gmail.com> wrote:
>>
>> Hello,
>>
>> I've been experimenting with Cassandra for quite a while now.
>>
>> It's time for me to look at backups but I'm not sure what the best practice is. I want to be able to recover the data to a point in time before any user or software errors.
>>
>> We will have two datacentres with 4 servers and RF=3.
>>
>> Each datacentre will have at most 1.6 TB (includes replication, LZ4 compression, using test data) of data. That is ten years of data, after which we will start purging. This amounts to about 400 MB of data generation per day.
>>
>> I've read about users doing snapshots of individual nodes to S3 (Netflix) and I've read about creating virtual datacentres (http://www.datastax.com/dev/blog/multi-datacenter-replication) where each virtual datacentre contains a backup node.
>>
>> There are advantages and disadvantages in both approaches. What are people doing in their production systems?
>>
>> --
>> Thanks
>>
>> Jabbar Azam
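
For anyone reading this thread in the archives: below is a minimal sketch, in Python, of the nightly snapshot + rsync approach Aaron mentions above. It assumes nodetool and rsync are on the PATH, the default /var/lib/cassandra/data layout, and a placeholder backup host (backup.example.com); adjust everything for your own environment. tablesnap, linked above, is the same idea targeting S3.

#!/usr/bin/env python
"""Minimal sketch: nightly snapshot, then copy the snapshot off the node.

Assumes nodetool and rsync are on PATH and data lives under
/var/lib/cassandra/data; backup.example.com is a placeholder host.
"""
import datetime
import glob
import subprocess

DATA_DIR = "/var/lib/cassandra/data"
REMOTE = "backup.example.com:/backups/cassandra/"  # hypothetical destination

def snapshot_and_ship():
    tag = datetime.datetime.utcnow().strftime("nightly-%Y%m%d")
    # Flush memtables and hard-link the live SSTables under the given tag.
    subprocess.check_call(["nodetool", "snapshot", "-t", tag])
    # Snapshot directories appear under .../snapshots/<tag>; the second
    # pattern covers the per-table layout used by newer Cassandra versions.
    snapshot_dirs = (glob.glob("%s/*/snapshots/%s" % (DATA_DIR, tag)) +
                     glob.glob("%s/*/*/snapshots/%s" % (DATA_DIR, tag)))
    for snap_dir in snapshot_dirs:
        # --relative keeps the keyspace/table path on the backup host.
        subprocess.check_call(["rsync", "-a", "--relative", snap_dir, REMOTE])
    # Remove the on-node hard links once they are copied off
    # (clearsnapshot flag syntax varies a little between versions).
    subprocess.check_call(["nodetool", "clearsnapshot", "-t", tag])

if __name__ == "__main__":
    snapshot_and_ship()

Run from cron on every node; because snapshots are hard links, the copy adds almost no extra disk usage on the node itself.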
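And a matching sketch of the whole-cluster, point-in-time restore Aaron describes: stop every node, put the same snapshot back on all of them, then start the cluster again. It would be run on each node while Cassandra is stopped; the tag name and directory layout are the same assumptions as above, and the system keyspace is deliberately left alone. Illustrative only, not a drop-in tool.

#!/usr/bin/env python
"""Per-node restore sketch: run on every node while Cassandra is stopped."""
import glob
import os
import shutil

DATA_DIR = "/var/lib/cassandra/data"
COMMITLOG_DIR = "/var/lib/cassandra/commitlog"
TAG = "nightly-20130325"  # hypothetical tag being rolled back to

def restore_snapshot():
    # Remove commit log segments so writes newer than the snapshot are not replayed.
    for segment in glob.glob(os.path.join(COMMITLOG_DIR, "*")):
        os.remove(segment)

    snapshot_dirs = (glob.glob("%s/*/snapshots/%s" % (DATA_DIR, TAG)) +
                     glob.glob("%s/*/*/snapshots/%s" % (DATA_DIR, TAG)))
    for snap_dir in snapshot_dirs:
        if os.sep + "system" + os.sep in snap_dir:
            continue  # leave the system keyspace alone
        # The live directory sits two levels above .../snapshots/<TAG>.
        live_dir = os.path.dirname(os.path.dirname(snap_dir))
        # Drop the current SSTables, then copy the snapshotted ones back in.
        for live_file in glob.glob(os.path.join(live_dir, "*")):
            if os.path.isfile(live_file):
                os.remove(live_file)
        for snap_file in glob.glob(os.path.join(snap_dir, "*")):
            shutil.copy2(snap_file, live_dir)

if __name__ == "__main__":
    restore_snapshot()

As Aaron points out, restoring on only one node does not roll the data back: the other replicas still hold the newer timestamps and will win on read repair, which is why the restore has to happen on all nodes at once.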