Thats correct. With the fully async option the checkpoints take longer but you don't impact ongoing processing of elements. With the semi-async method snapshots take a shorter time but during the synchronous part no element processing can happen.
On Fri, 20 May 2016 at 15:04 Abhishek Singh <abhis...@tetrationanalytics.com> wrote: > Yes. Thanks for explaining. > > On Friday, May 20, 2016, Ufuk Celebi <u...@apache.org> wrote: > >> On Thu, May 19, 2016 at 8:54 PM, Abhishek R. Singh >> <abhis...@tetrationanalytics.com> wrote: >> > If you can take atomic in-memory copies, then it works (at the cost of >> > doubling your instantaneous memory). For larger state (say rocks DB), >> won’t >> > you have to stop the world (atomic snapshot) and make a copy? Doesn’t >> that >> > make it synchronous, instead of background/async? >> >> Hey Abhishek, >> >> that's correct. There are two variants for RocksDB: >> >> - semi-async (default): snapshot is taking via RocksDB backup feature >> to backup to a directory (sync). This is then copied to the final >> checkpoint location (async, e.g copy to HDFS). >> >> - fully-async: snapshot is taking via RocksDB snapshot feature (sync, >> but no full copy and essentially "free"). With this snapshot we >> iterate over all k/v-pairs and copy them to the final checkpoint >> location (async, e.g. copy to HDFS). >> >> You enable the second variant via: >> rocksDbBackend.enableFullyAsyncSnapshots(); >> >> This is only part of the 1.1-SNAPSHOT version though. >> >> I'm not too familiar with the performance characteristics of both >> variants, but maybe Aljoscha can chime in. >> >> Does this clarify things for you? >> >> – Ufuk >> >