> b) do people skip backups altogether except for huge outages and just let
> rebooted server instances come up empty to repopulate via C*?

This one. Bootstrapping a new node into the cluster has a small impact on the existing nodes, and the new node will have all the data it needs when it finishes the process.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/05/2013, at 3:17 AM, Janne Jalkanen <janne.jalka...@ecyrd.com> wrote:

> On May 16, 2013, at 17:05 , Brian Tarbox <tar...@cabotresearch.com> wrote:
>
>> An alternative that we had explored for a while was to do a two stage backup:
>> 1) copy a C* snapshot from the ephemeral drive to an EBS drive
>> 2) do an EBS snapshot to S3.
>>
>> The idea being that EBS is quite reliable, S3 is still the emergency backup
>> and copying back from EBS to ephemeral is likely much faster than the 15
>> MB/sec we get from S3.
>
> Yup, this is what we do. We use rsync with --bwlimit=4000 to copy the
> snapshots from the eph drive to EBS; this is intentionally very low so that
> the backup process does not eat our I/O. This is on m1.xlarge
> instances; YMMV so measure :). EBS drives are then snapshot with
> ec2-consistent-snapshot and then old snapshots expired using
> ec2-expire-snapshots (I believe these scripts are from Alestic).
>
> /Janne
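
For anyone scripting the rsync step described in the quoted message, here is a rough sketch of how the throttled copy could be assembled, plus a back-of-the-envelope estimate of how long it takes at that rate. The paths and the function names (rsync_snapshot_cmd, copy_hours) are made up for illustration and are not from the thread. It also assumes --bwlimit is read in units of 1024 bytes per second, which is the documented default in current rsync; check your version.

```python
import shlex


def rsync_snapshot_cmd(src, dest, bwlimit_kib=4000):
    """Build the argv for a bandwidth-limited rsync copy (does not run it).

    -a preserves permissions and timestamps; --bwlimit caps the transfer rate.
    """
    return ["rsync", "-a", f"--bwlimit={bwlimit_kib}", src, dest]


def copy_hours(size_gib, bwlimit_kib=4000):
    """Estimate hours needed to copy size_gib GiB at bwlimit_kib KiB/s."""
    seconds = size_gib * 1024 * 1024 / bwlimit_kib
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical mount points: ephemeral drive snapshot dir -> EBS volume.
    cmd = rsync_snapshot_cmd("/mnt/ephemeral/snapshots/", "/mnt/ebs/backup/")
    print(shlex.join(cmd))
    # Rough sizing: 100 GiB at the default limit.
    print(f"{copy_hours(100):.1f} hours for 100 GiB")  # prints 7.3 hours
```

Pass the returned list to subprocess.run(cmd, check=True) to actually perform the copy. At this limit a large snapshot takes hours, which is the trade-off for keeping backup I/O out of the way, so measure on your own instances.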