It all depends on what sort of disasters you are planning for and how valuable 
your data is.

The cheap and cheerful approach is to take a snapshot and then rsync / copy it
off the node. Or you can use something like https://github.com/synack/tablesnap ,
which uploads SSTables to S3 as they are written.
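
As a rough illustration of the snapshot-then-rsync approach, here is a minimal
Python sketch. It is not a tested backup tool. The keyspace name, the snapshot
directory template, and the backup target are all hypothetical, and the exact
nodetool flags and the on-disk snapshot location differ between Cassandra
versions, so check both against your install before running anything for real.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_command(keyspace, tag, host="localhost"):
    """Build the nodetool call that snapshots one keyspace under a named tag.

    Assumption: this nodetool accepts `snapshot -t <tag> <keyspace>`;
    flag names and placement may differ on older releases.
    """
    return ["nodetool", "-h", host, "snapshot", "-t", tag, keyspace]


def rsync_command(snapshot_dir, backup_target):
    """Build the rsync call that copies the snapshot directory off the node."""
    # A trailing slash makes rsync copy the directory contents, not the directory.
    source = snapshot_dir.rstrip("/") + "/"
    return ["rsync", "-az", source, backup_target]


def backup(keyspace, snapshot_dir_template, backup_target, tag=None, dry_run=True):
    """Snapshot a keyspace, then copy the snapshot to another machine.

    snapshot_dir_template is a path containing "{tag}". Where snapshots land
    on disk depends on the Cassandra version, so it is supplied by the caller.
    With dry_run=True the commands are only printed and returned, not executed.
    """
    if tag is None:
        tag = datetime.now(timezone.utc).strftime("backup-%Y%m%dT%H%M%SZ")
    commands = [
        snapshot_command(keyspace, tag),
        rsync_command(snapshot_dir_template.format(tag=tag), backup_target),
    ]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the backup if the snapshot or the copy fails.
            subprocess.run(command, check=True)
    return commands
```

For example, backup("demo_ks", "/var/lib/cassandra/data/demo_ks/snapshots/{tag}",
"backuphost:/srv/cassandra-backups/node1/") prints the two commands without
running them; pass dry_run=False once the paths are confirmed. Snapshots are
hard links, so they cost little disk until the original SSTables are compacted
away, but they should still be cleared once copied.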

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22/09/2011, at 2:27 AM, Jonathan Ellis wrote:

> You can snapshot individual CFs. sstable2json is primarily for debugging.
> 
> On Wed, Sep 21, 2011 at 9:17 AM, David McNelis
> <dmcne...@agentisenergy.com> wrote:
>> When planning a DR strategy, which option will most consistently take the
>> least disk space, be the fastest to recover from, have the least
>> complicated recovery, etc.?
>> I've read through the Operations documents and my take is this so far.  If I
>> have specific column families I want to snapshot across the cluster, then
>> sstable2json would make the most sense.  However, if I want to back up
>> individual nodes, so that I can recover better and more quickly from a
>> node failure, then snapshots would make more sense?
>> Regularly backing up the data on a large cluster with a high replication
>> factor is redundant, but in a situation where you have an RF <= 2 and are
>> located in a single rack / datacenter, it might make sense to implement
>> something like this to back up and store data offsite, and I'm trying to
>> figure out what a good, viable, and storage-efficient plan would look like.
>> --
>> David McNelis
>> Lead Software Engineer
>> Agentis Energy
>> www.agentisenergy.com
>> o: 630.359.6395
>> c: 219.384.5143
>> A Smart Grid technology company focused on helping consumers of energy
>> control an often under-managed resource.
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
