Implications on incremental checkpoint duration based on RocksDBStateBackend, Number of Slots per TM

Sameer W Wed, 17 Jun 2020 12:35:16 -0700

Hi,

The number of RocksDB databases the Flink creates is equal to the number of
operator states multiplied by the number of slots.


Assuming a parallelism of 100 for a job which is executed on 100 TM's with
1 slot per TM as opposed to 10 TM's with 10 slots per TM I have noticed
that the former configuration is more efficient for incremental
checkpointing. In both cases the number of RocksDB databases is the same,
except in the latter case 10 times as many are created in one TM vs the
former case.

Reading the link
<https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html>
below, it says - "link uses this to figure out the state changes. To do
this, Flink triggers a flush in RocksDB, forcing all memtables into
sstables on disk, and hard-linked in a local temporary directory. *This
process is synchronous to the processing pipeline*, and Flink performs all
further steps asynchronously and does not block processing."

What does "Synchronous to the processing pipeline" mean? Does it mean that
flushing to DB happens synchronously (serially) for all RocksDB databases
in one TM? Is the flushing single threaded per TM

Thanks,
Sameer
<https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html>

Implications on incremental checkpoint duration based on RocksDBStateBackend, Number of Slots per TM

Reply via email to