Hi,

I’m forwarding this question to Stefan (cc’ed).
He is the person most likely to be able to answer it, as he has done
substantial work on the RocksDB state backends.

Cheers,
Gordon


On 24 October 2018 at 8:47:24 PM, chandan prakash (chandanbaran...@gmail.com) 
wrote:

Hi,
I am new to Flink.
I was looking into the code to understand how Flink takes full snapshots and
incremental snapshots using RocksDB.

What I understood:
1. For a full snapshot, we call the RocksDB Snapshot API, which basically gives
us an iterator handle to the entries in the RocksDB instance. We iterate over
every entry one by one and serialize it to some distributed file system.
Similarly, on restore from a full snapshot, we read the file to get every entry
and apply it to the RocksDB instance one by one to fully reconstruct the db
instance.

2. On the other hand, for an incremental snapshot, we rely on the RocksDB
Checkpoint API to copy the sst files to HDFS/S3 incrementally.
Similarly, on restore, we copy the sst files to a local directory and
instantiate the RocksDB instance with the path of that directory.
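
The two approaches described above can be sketched roughly as follows. This is
only a simplified stand-in to make the contrast concrete, not Flink's actual
code and not the RocksDB API: a TreeMap plays the role of the RocksDB instance,
a byte array plays the role of the file on the distributed file system, and two
local directories play the roles of the local db directory and HDFS/S3. The
class name SnapshotSketch and all method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class SnapshotSketch {

    // Full snapshot: iterate over every entry of a frozen view and serialize it.
    public static byte[] fullSnapshot(SortedMap<String, String> db) throws IOException {
        // Assumption: copying the map stands in for taking a RocksDB snapshot.
        SortedMap<String, String> view = new TreeMap<>(db);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(view.size());
            for (Map.Entry<String, String> entry : view.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        }
        return bytes.toByteArray();
    }

    // Full restore: read the entries back and apply them one by one.
    public static TreeMap<String, String> restoreFull(byte[] data) throws IOException {
        TreeMap<String, String> db = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                db.put(key, value);
            }
        }
        return db;
    }

    // Incremental snapshot: copy only the sst files the remote side lacks.
    // Returns the names of the files uploaded by this call, sorted.
    public static List<String> incrementalSnapshot(Path localDbDir, Path remoteDir)
            throws IOException {
        Files.createDirectories(remoteDir);
        List<String> uploaded = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(localDbDir, "*.sst")) {
            for (Path sst : files) {
                Path target = remoteDir.resolve(sst.getFileName());
                if (!Files.exists(target)) {
                    Files.copy(sst, target);
                    uploaded.add(sst.getFileName().toString());
                }
            }
        }
        Collections.sort(uploaded);
        return uploaded;
    }

    // Incremental restore: copy the sst files back into a local directory,
    // which the db instance would then be opened on.
    public static void restoreIncremental(Path remoteDir, Path localDbDir) throws IOException {
        Files.createDirectories(localDbDir);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(remoteDir, "*.sst")) {
            for (Path sst : files) {
                Files.copy(sst, localDbDir.resolve(sst.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

In this sketch the full path touches every entry on both snapshot and restore,
while the incremental path only moves whole files and skips those already
present on the remote side.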

My questions are:
1. Why did we take two different approaches using different RocksDB APIs?
We could have used the RocksDB Checkpoint API for full snapshots as well.
2. Is there any specific reason to use the RocksDB Snapshot API over the
RocksDB Checkpoint API for full snapshots?

I am sure I am missing some important point and am really curious to know what
it is. Any explanation would be great. Thanks in advance.


Regards,
Chandan





--
Chandan Prakash
