In the Cassandra and DataStax documentation, commitlog backup is not mentioned; only snapshot and incremental backup are described as ways to do backups.
Commitlog archiving per keyspace/table is not supported, but commitlog replay (for which you must put the logs into the commitlog directory and restart the process) does support a keyspace/table replay filter: use -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to replay only the specified keyspaces/tables.

Snapshots do affect storage. We take a snapshot once a week during the low-traffic period, and snapshot creation is throttled. You may also want to look at this issue (https://issues.apache.org/jira/browse/CASSANDRA-13019).

Adarsh Kumar <adarsh0...@gmail.com> wrote on Thu, Nov 28, 2019 at 1:00 AM:

> Thanks Guo and Eric for replying,
>
> I have some confusions about commit log backup:
>
>    1. Is the commit log archival technique (
>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>    ) as good as an incremental backup, as it also captures commit logs after
>    memtable flush?
>    2. If we go for "Snapshot + Incremental bk + Commit log", here we have
>    to take commit logs from the commit log directory (is there any SOP for this?).
>    As commit logs are not per table or ks, we will have a challenge in restoring
>    selective tables.
>    3. Snapshot based backups are easy to manage and operate due to their
>    simplicity. But they are heavy on storage. Any views on this?
>    4. Please share any successful strategy that someone is using for
>    production. We are still in the design phase and want to implement the best
>    solution.
>
> Thanks Eric for sharing the link for Medusa.
>
> Regards,
> Adarsh Kumar
>
> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1...@gmail.com> wrote:
>
>> For me, I think the last one:
>> Snapshot + Incremental + commitlog
>> is the most meaningful way to do backup and restore, when you back the
>> data up to somewhere else like AWS S3.
>>
>>    - Snapshot based backup // incremental data is not backed up, and
>>    data may be lost when restoring to a time later than the snapshot time;
>>    - Incremental backups // better than snapshot backup, but with
>>    insufficient data accuracy, because data remaining in the memtable will be
>>    lost;
>>    - Snapshot + incremental
>>    - Snapshot + commitlog archival // better data precision than
>>    incremental backup, but data in non-archived commitlogs (not archived
>>    and the commitlog not yet closed) will not be restored and will be lost.
>>    Also, when there are too many logs, log replay will take a very long time
>>
>> We use snapshot + incremental + commitlog archive. We read the
>> snapshot data and incremental data. The log is also backed up, but we do
>> not back up
>> logs whose data has been flushed to sstables, because that data is backed up
>> by the incremental backup.
>>
>> This way, the data exists in sstable format through the snapshot
>> backup and incremental backup. The number of logs will be very small, and log
>> replay will not take much time.
>>
>>
>>
>> Eric LELEU <e...@strapdata.com> wrote on Wed, Nov 27, 2019 at 4:13 PM:
>>
>>> Hi,
>>> TheLastPickle & Spotify have released Medusa as a Cassandra backup tool.
>>>
>>> See:
>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>
>>> Hope this link will help you.
>>>
>>> Eric
>>>
>>>
>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>
>>> Hi,
>>>
>>> I was looking for the backup strategies of Cassandra. After some study I
>>> came to know that there are the following options:
>>>
>>>    - Snapshot based backup
>>>    - Incremental backups
>>>    - Snapshot + incremental
>>>    - Snapshot + commitlog archival
>>>    - Snapshot + Incremental + commitlog
>>>
>>> Which is the most suitable and feasible approach? Also, which of these is
>>> used most?
>>> Please let me know if there is any other option or tool available.
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>> Adarsh Kumar
>>>
>>>
>>
>> --
>> you are the apple of my eye !
>>
>
--
you are the apple of my eye !
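
P.S. For the replay filter mentioned at the top of this mail, here is a minimal sketch of how the option value could be assembled for a selective restore. Only the -Dcassandra.replayList flag and the keyspace.table,keyspace.table format come from the text above; the helper name build_replay_list_option is hypothetical, and how the flag reaches the JVM (for example by appending it to JVM_OPTS before restarting the node) is an assumption that depends on your installation.

```python
def build_replay_list_option(tables):
    """Return the JVM flag that restricts commitlog replay to some tables.

    `tables` is an iterable of (keyspace, table) pairs.
    This helper is hypothetical; it only formats the flag string.
    """
    entries = []
    for keyspace, table in tables:
        if not keyspace or not table:
            raise ValueError("keyspace and table must both be non-empty")
        entries.append(f"{keyspace}.{table}")
    if not entries:
        raise ValueError("at least one keyspace.table entry is required")
    # Entries are comma-separated, in keyspace.table form.
    return "-Dcassandra.replayList=" + ",".join(entries)


option = build_replay_list_option(
    [("keyspace1", "table1"), ("keyspace1", "table2")]
)
print(option)  # -Dcassandra.replayList=keyspace1.table1,keyspace1.table2
```

The node still has to be restarted with the archived logs placed in the commitlog directory; the flag only narrows which tables are replayed.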