I have a stateful operator which processes thousands of elements per
second (per task) when using the Filesystem state backend. As documented
and as other users have reported, when I switch to the RocksDB backend,
throughput drops considerably. I need something that combines the high
throughput of in-memory state with the large-state capacity and
incremental checkpointing of RocksDB.

For my application, I can *almost* accomplish this by adding a caching
layer which maintains an in-memory map of pending (key, value) writes.
These are periodically flushed to the underlying RocksDB backend (and
always just prior to a checkpoint). This greatly reduces the number of
writes to RocksDB, and means I get a "best of both worlds" in terms of
throughput and reliability/incremental checkpointing. This optimization
makes sense for my workload because events which operate on the same key
tend to arrive close together in the stream. (Over the long haul there
are many millions of keys, and occasionally an event references some key
from the distant past, hence the need for RocksDB.)
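
Concretely, here is a minimal sketch of the kind of caching layer I mean
(the names -- CachingCounter, FLUSH_THRESHOLD -- and the trivial counting
logic are made up for illustration; my real function does more):

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class CachingCounter
        extends KeyedProcessFunction<String, String, String>
        implements CheckpointedFunction {

    private static final int FLUSH_THRESHOLD = 10_000; // made-up number

    private transient Map<String, Long> pending; // heap-side write cache
    private transient ValueState<Long> store;    // RocksDB-backed state

    @Override
    public void open(Configuration parameters) {
        pending = new HashMap<>();
        store = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out)
            throws Exception {
        String key = ctx.getCurrentKey();
        // Read through the cache; only go to RocksDB on a cache miss.
        Long count = pending.containsKey(key) ? pending.get(key) : store.value();
        long updated = (count == null ? 0L : count) + 1;
        pending.put(key, updated);
        out.collect(key + "=" + updated);

        if (pending.size() >= FLUSH_THRESHOLD) {
            flush();
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        flush(); // ensure RocksDB sees every pending write before the checkpoint
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // keyed state is registered in open(); nothing else to restore here
    }

    private void flush() throws Exception {
        for (Map.Entry<String, Long> entry : pending.entrySet()) {
            // THE PROBLEM: update() writes under whatever key is currently in
            // context (the key of the element being processed), *not* under
            // entry.getKey(), so entries for other keys land in the wrong
            // key group.
            store.update(entry.getValue());
        }
        pending.clear();
    }
}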

Unfortunately, this solution does not work correctly, because when I do
eventually flush the pending writes to the underlying RocksDB backend,
the stateful processor may be operating on an element which belongs to a
different key group. Since keyed state writes go to whatever key is
currently in context, the entries I flush get associated with the wrong
key (and hence the wrong key group), and state reads and restores
misbehave.

*Is there any way to wrap the RocksDB backend with caching and deferred
writes (in a way which is "key-group aware")?*
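
To make the question concrete, here is roughly what I imagine a
key-group-aware flush would look like. I've dropped down to a custom
operator because, as far as I can tell, only AbstractStreamOperator
exposes setCurrentKey(); I'm relying on what I understand to be internal
APIs (setCurrentKey, getPartitionedState), which may be exactly the kind
of thing I shouldn't be doing -- hence the question:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.runtime.state.StateSnapshotContext;
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

public class KeyGroupAwareCachingOperator
        extends AbstractStreamOperator<String>
        implements OneInputStreamOperator<String, String> {

    private transient Map<String, Long> pending;
    private transient ValueState<Long> store;

    @Override
    public void open() throws Exception {
        super.open();
        pending = new HashMap<>();
        store = getPartitionedState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(StreamRecord<String> element) throws Exception {
        // The runtime sets the key context from the element before this call.
        String key = (String) getCurrentKey();
        pending.merge(key, 1L, Long::sum);
        output.collect(element);
        // (periodic size-based flushing omitted for brevity)
    }

    @Override
    public void snapshotState(StateSnapshotContext context) throws Exception {
        Object originalKey = getCurrentKey();
        for (Map.Entry<String, Long> entry : pending.entrySet()) {
            // The crucial step: re-point the key context at the cached key,
            // so the write is routed to that key's key group rather than to
            // whichever key happens to be current.
            setCurrentKey(entry.getKey());
            store.update(entry.getValue());
        }
        setCurrentKey(originalKey);
        pending.clear();
        super.snapshotState(context);
    }
}

(I would attach this with stream.keyBy(...).transform(...) instead of
.process(...). Is something along these lines supported, or is there a
better-sanctioned way to do it?)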

Thanks!

David
