It all depends on what sort of disasters you are planning for and how valuable your data is.
The cheap and cheerful approach is to take a snapshot and then rsync / copy it off the node. Or you can do something like https://github.com/synack/tablesnap .

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22/09/2011, at 2:27 AM, Jonathan Ellis wrote:

> you can snapshot individual CFs. sstable2json is primarily for debugging.
>
> On Wed, Sep 21, 2011 at 9:17 AM, David McNelis
> <dmcne...@agentisenergy.com> wrote:
>> When planning a DR strategy, which option is going to, most consistently,
>> take the least amount of disk space, be fastest to recover from, least
>> complicated recovery, etc.?
>> I've read through the Operations documents and my take is this so far. If I
>> have specific column families I want to snapshot across the cluster, then
>> sstable2json would make the most sense. However, if I want to back up an
>> individual node(s), so that I can better and more quickly recover from a
>> node failure, then snapshots would make more sense?
>> Regularly backing up the data on a large cluster with a high replication
>> factor is redundant, but in a situation where you have an RF <= 2, and are
>> located in a single rack / datacenter, then it might make sense to implement
>> something like this to back up and store data offsite, and I'm trying to
>> figure out what a good, viable, and storage efficient plan would look like.
>> --
>> David McNelis
>> Lead Software Engineer
>> Agentis Energy
>> www.agentisenergy.com
>> o: 630.359.6395
>> c: 219.384.5143
>> A Smart Grid technology company focused on helping consumers of energy
>> control an often under-managed resource.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
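
For reference, the "snapshot and then rsync off the node" approach from the reply could look roughly like the Python sketch below. It is only an illustration, not a tested backup tool. The function names, the tag format, and the destination string are all made up for the example. The snapshot directory is passed in by the caller because its location under the data directory differs between Cassandra versions, and the nodetool arguments should be checked against the version you actually run.

```python
import subprocess
from datetime import datetime, timezone


def make_tag():
    """Return a UTC timestamp to use as the snapshot name (hypothetical format)."""
    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def snapshot_command(keyspace, tag):
    """Build the nodetool call that snapshots one keyspace under a named tag.

    Assumption: the installed nodetool accepts `snapshot -t <tag> <keyspace>`.
    """
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def rsync_command(snapshot_dir, dest):
    """Build the rsync call that copies a snapshot directory off the node.

    The trailing slash makes rsync copy the directory's contents
    rather than the directory itself.
    """
    return ["rsync", "-a", snapshot_dir.rstrip("/") + "/", dest]


def run_backup(keyspace, snapshot_dir, dest, tag, runner=subprocess.run):
    """Take the snapshot, then ship it; stop at the first failing command."""
    for cmd in (snapshot_command(keyspace, tag), rsync_command(snapshot_dir, dest)):
        runner(cmd, check=True)
```

You would call `run_backup` on each node with that node's snapshot directory for the chosen tag and an offsite destination such as a hypothetical `backup@offsite:/backups/node1`. Snapshots stay on the node until they are cleared, so a real script would also remove old ones (for example with `nodetool clearsnapshot`) once the copy has succeeded.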