Re: What is your backup strategy for Cassandra?

John Wong Fri, 18 Sep 2015 17:03:07 -0700

On Fri, Sep 18, 2015 at 3:02 PM, Sanjay Baronia <
sanjay.baro...@triliodata.com> wrote:

>
> Will be at the Cassandra summit next week if any of you would like a demo.
>
>
>

Sanjay, is Trilio Data's work private? Unfortunately I will not attend the
Summit, but maybe Trilio can also talk about this in, say, a Cassandra
Planet blog post? I'd like to see a demo or get a little more technical
detail. If it is open source, that would be cool.

I didn't implement our solution, but the current solution is based on full
snapshot copies to a remote server for storage using rsync (which only
transfers what is needed). On our remote server we have a complete backup for
every hour, so if you cd into the data directory you can get every node's
exact moment-in-time data, as if you were browsing on the actual nodes.
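
For anyone who wants to picture the hourly rsync layout, here is a rough
sketch. It is not our actual script. The directory layout
(backup root / hour / node), the function names, and the use of rsync's
--link-dest option to hard-link unchanged files against the previous hour are
all my assumptions for illustration. It only builds the command and does not
run it.

```python
from datetime import datetime
from pathlib import PurePosixPath


def hourly_dir(backup_root, node, when):
    # Hypothetical layout: <backup_root>/<YYYY-MM-DDTHH>/<node>
    return PurePosixPath(backup_root) / when.strftime("%Y-%m-%dT%H") / node


def build_rsync_command(node, source_dir, backup_root, when, previous=None):
    """Build (but do not run) an rsync pull of one node's data for one hour.

    Assumes the command runs on the backup server and that `node` is
    reachable over ssh under that name.
    """
    dest = hourly_dir(backup_root, node, when)
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since the previous hour become hard links, so each
        # hourly directory looks complete without storing the data twice.
        cmd.append(f"--link-dest={hourly_dir(backup_root, node, previous)}")
    cmd += [f"{node}:{source_dir}/", f"{dest}/"]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_rsync_command(
        "node1", "/var/lib/cassandra/data", "/backups",
        datetime(2015, 9, 18, 17), previous=datetime(2015, 9, 18, 16))))
```

The resulting list can be handed to subprocess.run once the destination
directory exists.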

We are an AWS shop, so we can further optimize our cost by using EBS
snapshots so the volume can be smaller (we currently provision 4000GB, which
is too much). Anyway, we tried S3, and it is an okay solution. The bad things
are performance and the ability to quickly go back in time. With EBS I can
create a dozen volumes from the same snapshot, attach one to each of my
nodes, and cp -r the files over.
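
The EBS part could look roughly like the sketch below. It assumes a boto3 EC2
client (for example boto3.client("ec2")) is passed in as `ec2`. The function
name, the node list format, and the device name /dev/sdf are made up for
illustration. Mounting the volume and the cp -r step are left out.

```python
def restore_volumes(ec2, snapshot_id, nodes, device="/dev/sdf"):
    """Create one volume per node from the same snapshot and attach it.

    `ec2` is assumed to be a boto3 EC2 client.
    `nodes` is a list of (instance_id, availability_zone) pairs; a volume
    must be created in the same availability zone as its instance.
    Returns a dict mapping instance id to the new volume id.
    """
    attached = {}
    for instance_id, zone in nodes:
        volume_id = ec2.create_volume(
            SnapshotId=snapshot_id, AvailabilityZone=zone)["VolumeId"]
        # Wait until the new volume is usable before attaching it.
        ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
        ec2.attach_volume(
            VolumeId=volume_id, InstanceId=instance_id, Device=device)
        attached[instance_id] = volume_id
    return attached
```

After the attach, each node would still need to mount the device and copy the
files it wants back into its data directory.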

John

>
> From: Maciek Sakrejda <mac...@heroku.com>
> Reply-To: Cassandra Maillist <user@cassandra.apache.org>
> Date: Friday, September 18, 2015 at 2:09 PM
> To: Cassandra Maillist <user@cassandra.apache.org>
> Subject: Re: What is your backup strategy for Cassandra?
>
> On Thu, Sep 17, 2015 at 7:46 PM, Marc Tamsky <mtam...@gmail.com> wrote:
>
>> This seems like an apt time to quote [1]:
>>
>> > Remember that you get 1 point for making a backup and 10,000 points for
>> restoring one.
>>
>> Restoring from backups is my goal.
>>
>> The commonly recommended tools (tablesnap, cassandra_snapshotter) all
>> seem to leave the restore operation as a pretty complicated exercise for
>> the operator.
>>
>> Do any include a working way to restore, on a different host, all of node
>> X's data from backups to the correct directories, such that the restored
>> files are in the proper places and the node restart method [2] "just works"?
>>
>
> As someone getting started with Cassandra, I'm very much interested in
> this as well. For the most part, folks seem to rely on replication and
> node replacement to recover from failures, and perhaps this is a testament
> to how well this works, but as long as we're hauling out aphorisms, "RAID
> is not a backup" seems to (partially) apply here too.
>
> I'd love to hear more about how the community does restores, too. This
> isn't complaining about shoddy tooling: this is trying to understand--and
> hopefully, in time, improve--the status quo re: disaster recovery. E.g.,
> given that tableslurp operates on a single table at a time, do people
> normally just restore single tables? Is that used when there's filesystem
> or disk corruption? Bugs? Other issues? Looking forward to learning more.
>
> Thanks,
> Maciek
>
