Late answer, but you can find my backup script here: https://gist.github.com/JensRantil/a8150e998250edfcd1a3
Basically, you need to set S3_BUCKET and PGP_KEY_RECIPIENT, configure s3cmd (using `s3cmd --configure`), and then run `./backup-keyspace.sh your-keyspace` to back that keyspace up to S3. We run the script periodically on every node.

Regarding `s3cmd --configure`: I executed it once and then copied `~/.s3cfg` to all nodes. Like I said, there's lots of love that can be put into a backup system.

Note that the script has the following limitations:

* It does not checksum the files. However, the s3cmd website states that it compares MD5 and file size on upload by default.
* It does not purge old files on S3 (which you could configure using "Object Lifecycles").
* It does not warn you when a backup fails. Check your logs periodically.
* It does not do any advanced logging. Make sure to pipe the output to a file or the `syslog` utility.
* It does not do continuous/point-in-time backup.

That said, it does its job for us for now. Feel free to propose improvements!

Cheers,
Jens

--
Jens Rantil
Backend engineer
Tink AB
Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

On Fri, Nov 21, 2014 at 7:36 PM, William Arbaugh <w...@cs.umd.edu> wrote:
> Jens,
>
> I'd be interested in seeing your script. We've been thinking of doing exactly
> that but uploading to Glacier instead.
>
> Thanks, Bill
>
>> On Nov 21, 2014, at 11:40 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>>
>> > The main purpose is to protect us from human errors (eg. unexpected
>> > manipulations: delete, drop tables, …).
>>
>> If that is the main purpose, having "auto_snapshot: true" in cassandra.yaml
>> will be enough to protect you.
>>
>> Regarding backup, I have a small script that creates a named snapshot and,
>> for each sstable, encrypts it, uploads it to S3 and deletes the snapshotted
>> sstable. It took me an hour to write and roll out to all our nodes. The
>> whole process is currently logged, but eventually I will also send an e-mail
>> if a backup fails.
>>
>> On Tue, Nov 18, 2014 at 3:52 PM, Ngoc Minh VO <ngocminh...@bnpparibas.com> wrote:
>>
>> > Hello all,
>> >
>> > We are looking for a solution to back up data in our C* cluster (v2.0.x, 16
>> > nodes, 4 x 500GB SSD, RF = 6 over 2 datacenters).
>> >
>> > The main purpose is to protect us from human errors (eg. unexpected
>> > manipulations: delete, drop tables, …).
>> >
>> > We are thinking of:
>> >
>> > - Backup: add a 2TB HDD on each node for C* daily/weekly snapshots.
>> > - Restore: load the most recent snapshots or latest "non-corrupted"
>> >   ones and replay missing data imports from other data source.
>> >
>> > We would like to know if anybody is using Cassandra's backup feature in
>> > production and could share their experience with us.
>> >
>> > Your help would be greatly appreciated.
>> >
>> > Best regards,
>> >
>> > Minh
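
For readers who want a feel for the workflow described in this thread (create a named snapshot, then for each sstable encrypt, upload to S3 and delete the snapshotted file), here is a minimal sketch. It is not the script from the gist and may differ from it in every detail. S3_BUCKET and PGP_KEY_RECIPIENT come from the thread; DATA_DIR, the snapshot tag format, the S3 key layout and the function name `backup_keyspace` are assumptions made for illustration.

```sh
#!/bin/sh
# Sketch only: NOT the script from the gist linked in this thread.
# Required environment (from the thread): S3_BUCKET, PGP_KEY_RECIPIENT.
# Assumes s3cmd has already been set up with `s3cmd --configure`.
set -eu

# Assumption: Cassandra's usual data directory; override if yours differs.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

backup_keyspace() {
    keyspace="$1"
    tag="backup-$(date +%Y%m%d%H%M%S)"   # hypothetical tag naming scheme
    host="$(hostname)"

    # 1. Create a named snapshot of the keyspace on this node.
    nodetool snapshot -t "$tag" "$keyspace"

    # 2. For each snapshotted file: encrypt, upload, delete.
    #    Snapshots live under <data_dir>/<keyspace>/<table>/snapshots/<tag>/.
    #    Encrypted temporaries (*.gpg) are excluded so they are never re-read.
    find "$DATA_DIR/$keyspace" -path "*/snapshots/$tag/*" -type f ! -name '*.gpg' |
    while IFS= read -r file; do
        rel=${file#"$DATA_DIR"/}          # path relative to the data directory
        gpg --batch --yes --encrypt --recipient "$PGP_KEY_RECIPIENT" \
            --output "$file.gpg" "$file"
        # Assumed key layout: s3://<bucket>/<hostname>/<relative path>.gpg
        s3cmd put "$file.gpg" "s3://$S3_BUCKET/$host/$rel.gpg"
        rm -f "$file" "$file.gpg"
    done
}

# Run a backup only when a keyspace name is passed on the command line.
if [ "$#" -ge 1 ]; then
    backup_keyspace "$1"
fi
```

Saved under a hypothetical name such as `backup-sketch.sh`, it would be invoked as `S3_BUCKET=my-bucket PGP_KEY_RECIPIENT=ops@example.com ./backup-sketch.sh your-keyspace`. Because of `set -eu`, the first failing command aborts the run, so a cron wrapper could check the exit status to detect a failed backup. The sketch leaves empty snapshot directories behind; running `nodetool clearsnapshot -t` with the same tag afterwards is one way to tidy them up.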