Thanks Ahu and Hossein,

So my understanding is:

   1. Commit log backup is not documented for Apache Cassandra, so it is not
   a standard approach. It can still be used for a restore on the same
   machine (by taking the backup from commit_log_dir). If it is used on other
   machines, they must have the same topology. Can it be used for a
   replacement node?
   2. For periodic backup, snapshot + incremental backup is the best option.
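
For point 2, here is a minimal sketch of the rotation Hossein described
(remove incremental backups once a new snapshot exists). It assumes the
modification times of the files under a table's backups/ directory are
already available as a mapping. The function name, file names and timestamps
are hypothetical, and the function only selects candidates, it does not
delete anything.

```python
def stale_incrementals(backup_mtimes, snapshot_time):
    """Return incremental backup files last modified at or before the snapshot.

    backup_mtimes: mapping of file name -> modification time (epoch seconds).
    snapshot_time: time the new snapshot was taken (epoch seconds).
    """
    return sorted(
        name for name, mtime in backup_mtimes.items() if mtime <= snapshot_time
    )


# Hypothetical file names and timestamps, for illustration only.
backups = {"md-1-big-Data.db": 100.0, "md-2-big-Data.db": 250.0}
print(stale_incrementals(backups, snapshot_time=200.0))
```

Anything the function returns should be removed only after the new snapshot
has been copied off the node successfully.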


Thanks,
Adarsh Kumar

On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell <cclive1...@gmail.com> wrote:

> Hossein is right. But in our case we restore to the same Cassandra
> topology, so commit log replay is usable. It is also usable when restoring
> to the same machine.
> Using sstableloader costs too much time and more storage (though the
> storage use drops after the restore).
>
> On Thu, Nov 28, 2019 at 7:40 PM Hossein Ghiyasi Mehr <ghiyasim...@gmail.com> wrote:
>
>> Commit log backup isn't usable on another machine.
>> The backup solution depends on what you want to do: periodic backup, or
>> backup to restore on another machine?
>> Periodic backup is a combination of snapshot and incremental backup. Remove
>> the incremental backups after each new snapshot.
>> Backup to restore on another machine: you can use a snapshot taken after
>> flushing the memtables, or use sstableloader.
>>
>>
>> ----
>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>
>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1...@gmail.com> wrote:
>>
>>> Neither the Cassandra nor the DataStax documentation mentions commit log
>>> backup;
>>> only snapshot and incremental backup are described as backup methods.
>>>
>>> Although commit log archiving per keyspace/table is not supported,
>>> commit log replay (for which you must put the logs in commitlog_dir and
>>> restart the process)
>>> does support a keyspace/table replay filter (use
>>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to
>>> replay only the specified keyspaces/tables).
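
The replay filter described above takes a comma-separated list of
keyspace.table entries. A minimal sketch of building that flag from
(keyspace, table) pairs follows; replay_list_option is a hypothetical helper
name, not part of Cassandra, and how the flag reaches the JVM at restart is
left out.

```python
def replay_list_option(tables):
    """Build the JVM flag that limits commit log replay to the given tables.

    tables: iterable of (keyspace, table) pairs.
    """
    entries = []
    for keyspace, table in tables:
        if not keyspace or not table:
            raise ValueError("keyspace and table must both be non-empty")
        entries.append(f"{keyspace}.{table}")
    return "-Dcassandra.replayList=" + ",".join(entries)


print(replay_list_option([("keyspace1", "table1"), ("keyspace1", "table2")]))
# prints: -Dcassandra.replayList=keyspace1.table1,keyspace1.table2
```
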
>>>
>>> Snapshots do affect storage. We take a snapshot once a week, during
>>> off-peak business hours, with snapshotting throttled. You
>>> may want to
>>> see this issue (https://issues.apache.org/jira/browse/CASSANDRA-13019).
>>>
>>>
>>>
>>> On Thu, Nov 28, 2019 at 1:00 AM Adarsh Kumar <adarsh0...@gmail.com> wrote:
>>>
>>>> Thanks Guo and Eric for replying,
>>>>
>>>> I have some confusion about commit log backup:
>>>>
>>>>    1. The commit log archival technique (
>>>>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>    ) is as good as an incremental backup, as it also captures commit logs
>>>>    after memtable flush.
>>>>    2. If we go for "Snapshot + Incremental backup + Commit log", we
>>>>    have to take the commit logs from the commit log directory (is there any
>>>>    SOP for this?). As commit logs are not per table or keyspace, we will
>>>>    have a challenge restoring selected tables.
>>>>    3. Snapshot-based backups are easy to manage and operate because of
>>>>    their simplicity, but they are heavy on storage. Any views on this?
>>>>    4. Please share any successful strategy that someone is using in
>>>>    production. We are still in the design phase and want to implement the
>>>>    best solution.
>>>>
>>>> Thanks Eric for sharing link for medusa.
>>>>
>>>> Regards,
>>>> Adarsh Kumar
>>>>
>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1...@gmail.com>
>>>> wrote:
>>>>
>>>>> For me, I think the last one,
>>>>>  Snapshot + Incremental + commitlog,
>>>>> is the most meaningful way to do backup and restore when you back the
>>>>> data up to somewhere else, such as AWS S3.
>>>>>
>>>>>    - Snapshot based backup // incremental data will not be
>>>>>    backed up, so data may be lost when restoring to a time later than
>>>>>    the snapshot time;
>>>>>    - Incremental backups // better than snapshot backup, but
>>>>>    with insufficient data accuracy: data remaining in the memtable will
>>>>>    be lost;
>>>>>    - Snapshot + incremental
>>>>>    - Snapshot + commitlog archival // better data precision than
>>>>>    incremental backup, but the data in non-archived commit logs (logs
>>>>>    not yet closed and archived) will not be restored and will be lost.
>>>>>    Also, when there are too many logs, log replay will take a very long
>>>>>    time.
>>>>>
>>>>> We use snapshot + incremental + commitlog archive. We read the
>>>>> snapshot data and the incremental data. The logs are also backed up, but
>>>>> we do not back up
>>>>> logs whose data has already been flushed to sstables, because that data
>>>>> is backed up by the incremental backup.
>>>>>
>>>>> This way, the data exists in sstable format through the
>>>>> snapshot and incremental backups. The number of logs stays very small,
>>>>> and log replay does not take much time.
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 27, 2019 at 4:13 PM Eric LELEU <e...@strapdata.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> The Last Pickle and Spotify have released Medusa, a Cassandra backup tool.
>>>>>>
>>>>>> See :
>>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>>
>>>>>> Hope this link will help you.
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 27, 2019 at 8:10 AM Adarsh Kumar wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I was looking for the backup strategies of Cassandra. After some
>>>>>> study I came to know that there are the following options:
>>>>>>
>>>>>>    - Snapshot based backup
>>>>>>    - Incremental backups
>>>>>>    - Snapshot + incremental
>>>>>>    - Snapshot + commitlog archival
>>>>>>    - Snapshot + Incremental + commitlog
>>>>>>
>>>>>> Which is the most suitable and feasible approach? Also, which of these
>>>>>> is used most?
>>>>>> Please let me know if there is any other option or tool available.
>>>>>>
>>>>>> Thanks in advance.
>>>>>>
>>>>>> Regards,
>>>>>> Adarsh Kumar
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> you are the apple of my eye !
>>>>>
>>>>
>>>
>>>
>>
>
>
