Thats correct. With the fully async option the checkpoints take longer but
you don't impact ongoing processing of elements. With the semi-async method
snapshots take a shorter time but during the synchronous part no element
processing can happen.

On Fri, 20 May 2016 at 15:04 Abhishek Singh <abhis...@tetrationanalytics.com>
wrote:

> Yes. Thanks for explaining.
>
> On Friday, May 20, 2016, Ufuk Celebi <u...@apache.org> wrote:
>
>> On Thu, May 19, 2016 at 8:54 PM, Abhishek R. Singh
>> <abhis...@tetrationanalytics.com> wrote:
>> > If you can take atomic in-memory copies, then it works (at the cost of
>> > doubling your instantaneous memory). For larger state (say rocks DB),
>> won’t
>> > you have to stop the world (atomic snapshot) and make a copy? Doesn’t
>> that
>> > make it synchronous, instead of background/async?
>>
>> Hey Abhishek,
>>
>> that's correct. There are two variants for RocksDB:
>>
>> - semi-async (default): snapshot is taking via RocksDB backup feature
>> to backup to a directory (sync). This is then copied to the final
>> checkpoint location (async, e.g copy to HDFS).
>>
>> - fully-async: snapshot is taking via RocksDB snapshot feature (sync,
>> but no full copy and essentially "free"). With this snapshot we
>> iterate over all k/v-pairs and copy them to the final checkpoint
>> location (async, e.g. copy to HDFS).
>>
>> You enable the second variant via:
>> rocksDbBackend.enableFullyAsyncSnapshots();
>>
>> This is only part of the 1.1-SNAPSHOT version though.
>>
>> I'm not too familiar with the performance characteristics of both
>> variants, but maybe Aljoscha can chime in.
>>
>> Does this clarify things for you?
>>
>> – Ufuk
>>
>

Reply via email to