Thanks Hossein. Just one more question: is there any special SOP or consideration we have to take into account for multi-site backup?
Please share any helpful link, blog or documented steps.

Regards,
Adarsh Kumar

On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr <ghiyasim...@gmail.com> wrote:

> 1. It's recommended to use the commit log after a single node failure. Cassandra
>    has many options, such as the replication factor, as a substitute solution.
> 2. Yes, right.
>
> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>
> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar <adarsh0...@gmail.com> wrote:
>
>> Thanks Ahu and Hussein,
>>
>> So my understanding is:
>>
>>    1. Commit log backup is not documented for Apache Cassandra, hence
>>    not standard. But it can be used for restore on the same machine (by
>>    taking the backup from commit_log_dir). If used on other machine(s), they
>>    have to be in the same topology. Can it be used for a replacement node?
>>    2. For periodic backup, snapshot + incremental backup is the best option.
>>
>> Thanks,
>> Adarsh Kumar
>>
>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell <cclive1...@gmail.com> wrote:
>>
>>> Hossein is right. But for us, we restore to the same Cassandra
>>> topology, so it is usable for replay. Restoring to the
>>> same machine also works.
>>> Using sstableloader costs too much time and more storage (though the
>>> storage used will shrink after the restore).
>>>
>>> Hossein Ghiyasi Mehr <ghiyasim...@gmail.com> wrote on Thu, Nov 28, 2019 at 7:40 PM:
>>>
>>>> A commitlog backup isn't usable on another machine.
>>>> The backup solution depends on what you want to do: periodic backup, or
>>>> backup to restore on another machine?
>>>> Periodic backup is a combination of snapshot and incremental backup. Remove
>>>> the incremental backups after a new snapshot.
>>>> Taking a backup to restore on another machine: you can use a snapshot after
>>>> flushing the memtable, or use sstableloader.
>>>>
>>>> ----
>>>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>>>
>>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1...@gmail.com> wrote:
>>>>
>>>>> In the Cassandra and DataStax documentation, commitlog backup is not
>>>>> mentioned; only snapshot and incremental backup are described as ways
>>>>> to back up.
>>>>>
>>>>> Commitlog archiving per keyspace/table is not supported, but
>>>>> commitlog replay (you must put the logs in commitlog_dir and restart the
>>>>> process) does support a keyspace/table replay filter (use
>>>>> -Dcassandra.replayList in the format keyspace1.table1,keyspace1.table2
>>>>> to replay only the specified keyspaces/tables).
>>>>>
>>>>> Snapshots do affect storage. We take a snapshot once a week
>>>>> during off-peak business hours, and snapshot creation is throttled. For
>>>>> that you may want to see this issue
>>>>> (https://issues.apache.org/jira/browse/CASSANDRA-13019)
>>>>>
>>>>> Adarsh Kumar <adarsh0...@gmail.com> wrote on Thu, Nov 28, 2019 at 1:00 AM:
>>>>>
>>>>>> Thanks Guo and Eric for replying,
>>>>>>
>>>>>> I have some confusion about commit log backup:
>>>>>>
>>>>>>    1. The commit log archival technique (
>>>>>>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>>>    ) is as good as an incremental backup, as it also captures commit logs
>>>>>>    after memtable flush.
>>>>>>    2. If we go for "Snapshot + Incremental backup + Commit log", we
>>>>>>    have to take the commit logs from the commit log directory (is there
>>>>>>    any SOP for this?). As commit logs are not per table or keyspace, we
>>>>>>    will have a challenge in restoring selective tables.
>>>>>>    3. Snapshot-based backups are easy to manage and operate due to
>>>>>>    their simplicity, but they are heavy on storage. Any views on this?
>>>>>>    4. Please share any successful strategy that someone is using for
>>>>>>    production.
>>>>>>    We are still in the design phase and want to implement the best
>>>>>>    solution.
>>>>>>
>>>>>> Thanks Eric for sharing the link for Medusa.
>>>>>>
>>>>>> Regards,
>>>>>> Adarsh Kumar
>>>>>>
>>>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1...@gmail.com> wrote:
>>>>>>
>>>>>>> For me, I think the last one:
>>>>>>>     Snapshot + Incremental + commitlog
>>>>>>> is the most meaningful way to do backup and restore when you back up
>>>>>>> the data to somewhere else, like AWS S3.
>>>>>>>
>>>>>>>    - Snapshot-based backup // incremental data will not be
>>>>>>>    backed up, and data may be lost when restoring to a time later than
>>>>>>>    the snapshot time;
>>>>>>>    - Incremental backups // better than snapshot backup, but
>>>>>>>    with insufficient data accuracy, because data remaining in the
>>>>>>>    memtable will be lost;
>>>>>>>    - Snapshot + incremental
>>>>>>>    - Snapshot + commitlog archival // better data precision than
>>>>>>>    incremental backup, but the data in non-archived commitlogs (not
>>>>>>>    archived and commitlog segment not closed) will not be restored and
>>>>>>>    will be lost. Also, when there are too many logs, log replay will
>>>>>>>    take a very long time.
>>>>>>>
>>>>>>> We use snapshot + incremental + commitlog archive. We read the
>>>>>>> snapshot data and the incremental data. The logs are also backed up,
>>>>>>> but we do not back up the logs whose data has already been flushed to
>>>>>>> sstables, because that data is backed up by the incremental backup.
>>>>>>>
>>>>>>> This way, the data exists in sstable format through
>>>>>>> snapshot backup and incremental backup. The number of logs will be very
>>>>>>> small, and log replay will not take much time.
>>>>>>>
>>>>>>> Eric LELEU <e...@strapdata.com> wrote on Wed, Nov 27, 2019 at 4:13 PM:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> TheLastPickle & Spotify have released Medusa as a Cassandra backup
>>>>>>>> tool.
>>>>>>>>
>>>>>>>> See:
>>>>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>>>>
>>>>>>>> Hope this link will help you.
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I was looking into backup strategies for Cassandra. After some
>>>>>>>> study I came to know that there are the following options:
>>>>>>>>
>>>>>>>>    - Snapshot based backup
>>>>>>>>    - Incremental backups
>>>>>>>>    - Snapshot + incremental
>>>>>>>>    - Snapshot + commitlog archival
>>>>>>>>    - Snapshot + Incremental + commitlog
>>>>>>>>
>>>>>>>> Which is the most suitable and feasible approach? Also, which of
>>>>>>>> these is used most?
>>>>>>>> Please let me know if there is any other option or tool available.
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Adarsh Kumar
>>>>>>>
>>>>>>> --
>>>>>>> you are the apple of my eye !
>>>>>
>>>>> --
>>>>> you are the apple of my eye !
>>>
>>> --
>>> you are the apple of my eye !
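
For reference, the pruning rule mentioned in the thread ("remove incremental backup after new snapshot") could be sketched roughly as below. This is only an illustration, not an official Cassandra procedure: the function name split_incrementals, the file names, and the idea of tracking one timestamp per incremental backup file are all assumptions made for the example. It also assumes that any sstable flushed at or before the latest snapshot is already covered by that snapshot.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def split_incrementals(
    incrementals: Dict[str, datetime], latest_snapshot: datetime
) -> Tuple[List[str], List[str]]:
    """Split incremental backup files into (removable, needed_for_restore).

    Hypothetical helper: `incrementals` maps a backup file name to the time
    it was flushed. Files flushed at or before the latest snapshot are
    assumed to be covered by that snapshot and can be removed; files flushed
    after it must be kept and restored on top of the snapshot.
    """
    removable = sorted(
        name for name, flushed in incrementals.items() if flushed <= latest_snapshot
    )
    needed = sorted(
        name for name, flushed in incrementals.items() if flushed > latest_snapshot
    )
    return removable, needed


if __name__ == "__main__":
    # Made-up file names and dates, purely for illustration.
    snapshot_time = datetime(2019, 11, 24)
    files = {
        "a-Data.db": datetime(2019, 11, 23),
        "b-Data.db": datetime(2019, 11, 25),
    }
    removable, needed = split_incrementals(files, snapshot_time)
    print("removable:", removable)
    print("needed for restore:", needed)
```

A restore would then load the latest snapshot first and the "needed" incremental files on top of it; anything still only in the commit log is outside the scope of this sketch.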